Written by Yash Tekriwal – iX Data Science ’16, UVA ’18
The first question I’ve gotten from every person I’ve met this past week has, without fail, been:
“What is data science?”
I usually make some joke about it being Hogwarts for the technically inclined and move on to a different topic. Then I was asked to write a reflection on it, and I was forced to sit down and actually come up with a real answer.
Our class is kind of a strange group – we all have drastically different levels of previous experience but the same level of geekiness. We are led by a brilliant man who has literally taught himself everything he’s teaching us. Known as Andrew by day and Data Wookie by night, he is accompanied by two college graduates who put the “wizard” in “data wizard”. And what unites this seemingly strange group of misfits?
A search for meaning.
Or at least that’s what I came up with. At our core, we all chose to do data science because we know that we live in an increasingly data-driven world. Every move we make leaves a trail. Our grocery trips, early morning runs, music choices, and so much more all leave behind a trail that can be followed. And it turns out that following that trail can lead to conclusions about our manners and habits that even we didn’t know.
In an attempt to make some headway along that trail, we were given the task of designing a package for R. Our group decided to parse Reddit comment threads and use a sentiment analysis package to score whether each thread had a positive or negative vibe. After hours of figuring out how to share files on Git, our package finally began to function.
We were a bit confused at first, because on a test run, we found that a thread about “happy dogs in the park” had an overall negative score, with the most frequent words being “weed”, “drugs”, and “anger”. Then we actually looked at the thread and realized that’s just the nature of the Reddit community – trolling. Figures.
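For the curious, here is a rough sketch of that kind of scoring. It is written in Python rather than R, it is not the package described above, and the tiny word lists and the names `tokenize` and `score_thread` are made up purely for illustration. It counts positive and negative words across a thread's comments and reports which of them show up most often.

```python
from collections import Counter
import re

# Hypothetical toy lexicon; a real sentiment package ships a far larger one.
POSITIVE = {"happy", "love", "great", "cute", "fun"}
NEGATIVE = {"anger", "hate", "awful", "sad", "drugs"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_thread(comments):
    """Return (score, top_words) for a list of comment strings.

    score > 0 leans positive, score < 0 leans negative.
    top_words holds the three most frequent lexicon words with their counts.
    """
    words = [w for comment in comments for w in tokenize(comment)]
    # Each positive word adds 1 and each negative word subtracts 1.
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    matched = [w for w in words if w in POSITIVE or w in NEGATIVE]
    return score, Counter(matched).most_common(3)


# Made-up comments standing in for a thread.
comments = [
    "Happy dogs are great",
    "This park fills me with anger",
    "anger, drugs, and more anger",
]
score, top_words = score_thread(comments)
print(score, top_words)
```

A cheerful title does not help much here: the score only reflects the words people actually typed in the comments, which is how a thread about happy dogs can still come out negative.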
So after a trial by R, a meet and greet with SQL, and widespread confusion with Tableau, I think I can say I’m a couple of steps further along in that search for meaning. And if not, I’ve at least gained a family of 14 amazing people to help me find it. At the end of the day, that’s what data science is. A quest to find meaning where it didn’t exist before.