Cleaning up and combining data, a dataset for practice
tldr: I created an open dataset for explicitly practicing data munging. Feel free to use it in assignments, but do mention where you got it from (CC-BY-4.0). Also, unicorns are awesome.
Find the dataset at: https://github.com/RMHogervorst/unicorns_on_unicycles
Data munging / cleaning / engineering
At work I was working with two Excel files that were slightly different but could be combined into one dataset. This is very typical of the day-to-day cleaning operations that analysts and data scientists do (statisticians too).
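To make that kind of task concrete, here is a minimal sketch in plain Python of combining two slightly different tables into one. Everything in it is an assumption for illustration: the rows, the column names, and the helpers `normalise_row` and `combine` are hypothetical and do not come from the unicorn dataset or from the original Excel files.

```python
def normalise_row(row, renames):
    """Return a copy of `row` with stripped, lower-cased keys, renamed via `renames`."""
    cleaned = {}
    for key, value in row.items():
        name = key.strip().lower()
        # Map known variants (e.g. "yr") onto one agreed column name.
        cleaned[renames.get(name, name)] = value
    return cleaned


def combine(tables, renames):
    """Stack the rows of several tables, recording which table each row came from."""
    combined = []
    for source, rows in tables.items():
        for row in rows:
            cleaned = normalise_row(row, renames)
            cleaned["source"] = source
            combined.append(cleaned)
    return combined


# Hypothetical rows standing in for two spreadsheets with the same content
# but slightly different headers (stray whitespace, abbreviations, casing).
file_a = [{"Site": "north", "Year": 2001, "Count": 4}]
file_b = [{"site ": "south", "yr": 2001, "n": 7}]

renames = {"yr": "year", "n": "count"}
combined = combine({"file_a": file_a, "file_b": file_b}, renames)

for row in combined:
    print(row)
```

Keeping a `source` column is a deliberate choice here: once the rows are stacked, it is often the only way to trace an odd value back to the file it came from.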