Data science: The Ultimate Multitool

In our last piece, we showed how important data science has become to the world’s largest and most influential organizations and businesses. One core reason: data science is multidisciplinary. Its most mature applications so far have been in areas where excellent software engineering and data collection methods are the norm, such as product personalization in the tech industry, customer analytics in digital marketing, supply chain optimization in retail, quantitative trading in finance, and product quality management in manufacturing.

You may be wondering: what makes data scientists so effective? The fundamental value of data scientists comes from their ability to handle all parts of the data collection, analysis, and synthesis process. First, data scientists usually have a strong understanding of various data storage and computing systems, such as database technologies (SQL and NoSQL) and big data storage solutions (Hadoop). Second, data scientists have strong theoretical and technical knowledge of how to apply statistics and machine learning to everyday problems. Third, they can communicate their results to various stakeholders to drive decision-making.

However, the most powerful applications of data science have yet to be realized, and they lie in other areas. These include the health sciences, biotechnology, affective computing, and robotics, fields where the deepest subject-matter expertise is required to truly solve problems and introduce innovation. So long as data exists, new insights can be found, leading to better decisions. Data science helps scientists in other fields push the frontier.

Prominent venture capitalist and Sun Microsystems co-founder Vinod Khosla agrees. In 2013, Khosla argued that “in the next 10 years, data science and software will do more for medicine than all of the biological sciences together.” Most importantly, he believes data science will work in synergy with subject matter experts in the field rather than displace them, requiring “Ph.D.’s with degrees in robotics, machine vision, machine learning and similar topics to make this a reality.”

So far, Khosla’s point is proving true. Last week, researchers at Microsoft released a promising study linking patterns of Bing searches to pancreatic cancer diagnoses. Epidemiologists and data scientists now work in tandem to address pandemics such as Zika. Data science can help public health officials predict where the virus will spread, who is afflicted by it, and how to mitigate future outbreaks.

Beyond health, engineers in the most advanced disciplines have made data science core to their processes. Data scientists at SpaceX work to find anomalies in production and identify optimal designs for propulsion. Physicists and researchers at CERN just released 300 terabytes of particle collision data, inviting other data scientists to make new discoveries.

In addition to the hard sciences, data science has been fundamentally changing the social sciences and policymaking. Psychologists increasingly use big data collected across mobile devices and social networks to answer questions that were previously infeasible to explore. Computational journalists use advanced data analysis methods to shed greater light on complex issues, as the International Consortium of Investigative Journalists did with 2.6 terabytes of text documents and image files.

Universities and other educational programs have taken notice of the widespread applications of data science to research. Several are starting to respond accordingly by launching master’s programs in data science and, in some cases, new undergraduate programs.

Data science is changing the game across the board, helping to advance innovation regardless of the discipline.