December 5-6, 2022
Yokohama, Japan + Virtual
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit Japan 2022 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

All times are shown in Japan Standard Time (UTC +9).


Tuesday, December 6 • 17:40 - 18:20
OpenDataology - A Call to Arms for Fixing the Dataset Licensing Landscape - Gopi Krishnan Rajbahadur, Huawei Technologies Canada

Publicly available datasets fuel the success of commercial AI software. Dataset licenses govern the use of these datasets. However, ensuring dataset license compliance when using publicly available datasets is a complex endeavor. First, unlike OSS licenses, dataset licenses do not clearly outline the rights and obligations associated with them. Second, a dataset can be created from multiple data sources, each of which may have a different license, which further compounds the issue. In this talk, we introduce our project OpenDataology, an open source initiative that proposes a novel approach to assessing the potential license compliance violations associated with a dataset. OpenDataology (an LF AI and Data Sandbox project) can act as a crowd-sourced medium for identifying and documenting the license compliance risks associated with using publicly available datasets for AI software. Unlike the previous version of this talk, we will focus on how different contributors can get involved, the tools available, and how OpenDataology is enhancing SPDX to enable better dataset license compliance analysis. We also discuss the risks associated with using more than 100 publicly available AI datasets (compared to the six datasets that we explored last time).


Gopi Krishnan Rajbahadur

Senior Researcher, Huawei
Gopi Krishnan Rajbahadur is a Senior Researcher at the Centre for Software Excellence at Huawei, Canada. He holds a PhD in computer science from Queen's University, Canada. He received his BE in Computer Science and Engineering from SKR Engineering College, Anna University, India...

Tuesday December 6, 2022 17:40 - 18:20 JST
  Open AI & Data Forum