MDataHack 2019


The first MDataHack was an absolute success! Thank you to everyone who was able to come out and join us! These were challenging datasets, and the work that these 28 teams accomplished in just 24 hours was extremely impressive!

The data hack consisted of food, games, tutorials, smash competitions, and of course, hacking! Click here to view the full schedule and more info.

Datasets & Prompts

Participants chose one of three datasets from the following domains:

  • Catapult - Sports Analytics (good for beginners)
    • The dataset you will be working with contains Catapult data from the University of Michigan Women’s Field Hockey team from the 2016-2017 season, the 2017 offseason, and the 2017-2018 season.
    • The original, uncleaned dataset contains over 23,000 rows. Each row contains information on the time and date of the event, the position of the player, the type of event, and multiple Catapult measurements (as discussed beforehand).
    • Attempt to develop a relationship that would be valuable to a coach or player.

  • Cincinnati Fire - Public Health
    • Responds to 100,000+ incidents per year
    • All incident records are publicly available online
    • In recent years, many calls are for opioid overdoses
    • All firefighter responses in Cincinnati since 2015
      • 330,000+ incidents
      • Address
      • Latitude/Longitude
      • Response time
      • Incident type

  • Canvas Network - Learning Analytics
    • Released in 2016
    • De-identified learning data
    • 238 MOOC courses
    • Courses offered 01/14 - 09/15
    • 325K aggregated records
    • Dataset structure based on the HarvardX-MITx Person-Course data release of 2014
    • Ask questions about online course design and user behavior in MOOCs
    • Build a predictive model for MOOC completion
    • Find similarities and differences among users from different disciplines
    • Build a course recommendation model for students, based on their background info and likelihood of completing the MOOC


Participants were judged based on the following criteria:

Quality of Analysis

  • Are the technical contributions sound? Are the models appropriate for the task?

Quality of Presentation

  • Are the technical contributions explained clearly? Do the visualizations match the analysis?


  • Does the analysis provide insights that advance the field or application area? Is the prompt addressed?

Winners & Presentations

All presentations for MDataHack can be viewed here!

Congratulations to the following people and teams for winning this year's MDataHack!

  • Patrick Kinnunen
  • Jacques Esterhuizen
  • Frank Doherty
  • Max Krogius
  • Akihiro Tomita
  • Rajiv Khattar
  • Matt Chatiwat Lerdvongveerachai
  • Evan Koerschner
  • Jen Sheng Wong
  • Rakshit Gogia

Organizers & Sponsors

  • Michigan Data Science Team (MDST)
    • Jonathan Stroud, Wesley Tian, Mukai Wang, Cory Laban, Seth Saperstein, Eris Llangos
  • Michigan Sports Analytics Society (MSAS)
    • Rohit Mogalayapalli, Garrett Folbe
  • Information Technology Services (ITS) Teaching and Learning Group
  • Michigan Institute for Data Science (MIDAS)