The Data Commons and GitHub

Over the past few months there’s been growing interest in archiving software projects, for example via Zenodo. This is part of a more general problem: as researchers increasingly use environments such as IPython notebooks for their research, there’s a growing need to archive these notebooks so that the steps that produced the results can be replayed.

Inspired by Stuart Lewis’s recent post on a GitHub to repository deposit, we’ve recently added a mechanism to the Data Commons that imports metadata from GitHub, allowing the creation of an object record for a GitHub project.

Rather than import the code itself, we create a referential entry for the project, although of course files could be downloaded and added manually if a local copy were required.
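To give a feel for what a referential entry might look like, here is a minimal sketch in Python. It is not the Data Commons implementation: the function names and the record field names (`title`, `location`, `files` and so on) are hypothetical, chosen only for illustration. The one thing taken as given is GitHub's public REST endpoint for repository metadata and the fields it returns.

```python
import json
import urllib.request

# GitHub's public REST endpoint for repository metadata.
GITHUB_API = "https://api.github.com/repos/{owner}/{repo}"


def fetch_repo_metadata(owner, repo):
    """Fetch the JSON description of a repository from GitHub."""
    url = GITHUB_API.format(owner=owner, repo=repo)
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_referential_record(repo_json):
    """Map GitHub metadata onto a (hypothetical) repository record.

    The record points at the project rather than holding a copy of it,
    so the list of local files starts out empty.
    """
    # GitHub returns null for "license" when none is detected.
    license_info = repo_json.get("license") or {}
    return {
        "title": repo_json["full_name"],
        "description": repo_json.get("description") or "",
        "creator": repo_json["owner"]["login"],
        "location": repo_json["html_url"],  # where the content actually lives
        "licence": license_info.get("spdx_id"),
        "date_modified": repo_json.get("updated_at"),
        "files": [],  # referential entry: nothing stored locally
    }
```

Calling `to_referential_record(fetch_repo_metadata(owner, repo))` would then give a record that describes the project and links to it on GitHub. Files could still be attached afterwards if a local copy were wanted.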

We would hope to generalise this mechanism in the future to allow the import of dataset records from other repositories and stores, meaning that content need not always be in the same place as the metadata …

