Week 4: The World of Open Data
This week I had the pleasure of hearing from Deena Engel and Vicky Steeves, both experts in “Open Data”. My peers and I learned about the resources and tools for working with Open Data, the challenges of working with proprietary data, how to “clean” data, and the possibilities and applications that open data makes available.
I have actually taken Deena’s class “Database Design and Web Implementation”, where I learned about different places to find Open Data. My personal favorite when I was in her class was “NYC Open Data”. For my final project in that class, I used it to access health inspection grades for restaurants across NYC. While NYC Open Data was great for that assignment, I learned about many other resources from the presentations this week. In particular, I learned from Vicky about awesome-public-datasets. Not only was it cool to learn about this particular repository, but it also introduced me to all the “awesome” lists on GitHub, which I will definitely be checking out. The great thing about awesome-public-datasets is that you know the data is up to date and reliable. It is also organized by category, which helps if you are looking for data in a particular area.
Aside from learning where to find data, we also learned how to clean it. In the class I took with Deena, I mostly cleaned datasets using Python, leveraging libraries like Beautiful Soup. From the presentation this week, I enjoyed learning about other tools like OpenRefine and Tableau. Tools like these do not require any Python code. Tableau also lets you create great graphs and images to help visualize your data.
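To give a sense of what “cleaning” can look like in Python, here is a minimal sketch that uses only the standard library. The sample rows and the column names (name, grade) are made up for illustration; they are not the real NYC Open Data schema, and this is not the code from my class project.

```python
import csv
import io

# Hypothetical sample data. The column names are assumptions for this sketch,
# not the actual NYC Open Data inspection schema.
RAW = """name,grade
  Joe's Pizza ,a
JOE'S PIZZA,A
Noodle House,
Taco Spot, b 
"""

VALID_GRADES = {"A", "B", "C"}


def clean_rows(text):
    """Trim whitespace, normalize grades, and drop incomplete or duplicate rows."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Collapse stray whitespace in the name; uppercase the grade.
        name = " ".join((row["name"] or "").split())
        grade = (row["grade"] or "").strip().upper()
        # Skip rows with a missing name or a missing/unknown grade.
        if not name or grade not in VALID_GRADES:
            continue
        # Treat names that differ only by letter case as duplicates.
        key = (name.lower(), grade)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "grade": grade})
    return cleaned


if __name__ == "__main__":
    for record in clean_rows(RAW):
        print(record)
```

Tools like OpenRefine let you do the same kind of trimming and de-duplicating through a point-and-click interface instead.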
Overall, I really enjoyed hearing from both Deena and Vicky. Not only did I learn about practical tools, but I was also inspired by how exciting the world of Open Data is. There are so many resources out there to help you get started, and a community that wants to help you.