Organizing and cataloging datasets and data access methods used in OHW projects.

Project Description

Here are my thoughts on this project: It would be cool to have a list of all the datasets OHW participants have used over the years, along with code examples showing how to access them. Data access and wrangling are valuable skills, and struggling with them is part of learning, but finding the right dataset or navigating server requests can be a barrier to success during the condensed timeline of OHW. I figure if we can organize the projects from past OHW events and make them easily navigable on the website, it’ll help future participants get started more quickly with the data they’re looking for.

Planning

The general plan is to complete and document the full workflow for one project first. The workflow might look something like this:

  • Pick a project repo.
  • Identify what datasets they used (and potentially for what).
  • Find their data access code.
  • If found, copy it into a new file/notebook in this repository and test it out.
  • If it’s broken, try to fix it! Document all external resources used when fixing it.
  • If/when it works, add it to the OHW website.
    • Check with the larger OHW team about where these examples should go.
  • Get input from others on the usefulness before doing any more.
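
To keep track of where each project stands in the workflow above, something like the following could work. This is only a minimal sketch: the step names, the ProjectAudit class, and the example.org URL are all hypothetical and not part of any existing OHW tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical step names, one per bullet in the workflow above.
STEPS = [
    "pick_repo",
    "identify_datasets",
    "find_access_code",
    "copy_and_test",
    "fix_if_broken",
    "add_to_website",
    "get_feedback",
]


@dataclass
class ProjectAudit:
    """Tracks progress through the workflow for a single project repo."""

    repo_url: str
    datasets: List[str] = field(default_factory=list)
    # External resources consulted while fixing broken access code.
    resources: List[str] = field(default_factory=list)
    done: Set[str] = field(default_factory=set)

    def complete(self, step: str) -> None:
        """Mark a step as finished; reject names that are not in STEPS."""
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def next_step(self) -> Optional[str]:
        """Return the first unfinished step, or None when all are done."""
        for step in STEPS:
            if step not in self.done:
                return step
        return None


# Example usage with a placeholder URL (not a real OHW project).
audit = ProjectAudit(repo_url="https://example.org/some-ohw-project")
audit.complete("pick_repo")
audit.datasets.append("placeholder dataset name")
print(audit.next_step())
```

Keeping the resources list on the same record is one way to satisfy the "document all external resources" step without a separate file.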

We already have organized lists of projects from each year of OHW on the various websites.

Collaborators

  • Derya Gumustel
  • Adam Kemberling
  • Kasanda Lassagne
  • Boris Shapkin
  • Valentina Staneva
  • (other collaborators, please feel free to add yourselves here!)

Files

OHW_project_list.md is where I’m throwing every project from over the years. Each project’s title links to the corresponding GitHub repository, and as many linked datasets as can be found will be listed with each project.
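
Entries in this file could be generated rather than typed by hand. The sketch below assumes a nested bullet layout (a linked project title followed by indented dataset links); the actual layout of OHW_project_list.md may differ, and the render_entry helper and the example.org URLs are hypothetical.

```python
from typing import List, Optional, Tuple


def render_entry(
    title: str,
    repo_url: str,
    datasets: List[Tuple[str, Optional[str]]],
) -> str:
    """Render one project as a markdown bullet with nested dataset bullets.

    Each dataset is a (name, url) pair; url may be None when no link
    could be found, in which case only the name is listed.
    """
    lines = [f"- [{title}]({repo_url})"]
    for name, url in datasets:
        if url:
            lines.append(f"  - [{name}]({url})")
        else:
            lines.append(f"  - {name}")
    return "\n".join(lines)


# Example usage with placeholder values (not real OHW data).
entry = render_entry(
    "Example project",
    "https://example.org/example-project",
    [("Example dataset", "https://example.org/data"), ("Unlinked dataset", None)],
)
print(entry)
```

Datasets without a known link are still listed by name, so gaps in the catalog stay visible and can be filled in later.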

Chatbot (Experimental)

An AI chatbot to assist OceanHackWeek participants with questions about projects, datasets, and methods.