Data Science – the Emergence of a New Discipline
Data science goes beyond the use of data mining, business analytics and statistical analysis to look for patterns in large data sets. It is more multidisciplinary in nature. According to Wikipedia: “Data science incorporates varying elements and builds on techniques and theories from many fields, including math, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing with the goal of extracting meaning from data and creating data products.”
Data Science: An Introduction, a wikibook being developed as a tutorial on the subject, describes data science as “a child born of the mature parental disciplines of scientific methods, data and software engineering, statistics, and visualization, ... a mash-up of several different disciplines.”
Perhaps the most exciting part of data science is that it can be applied to just about any domain of knowledge, given our newfound ability to gather valuable data on almost any topic. However, doing so effectively requires domain expertise to identify the important problems to solve in a given area, the kinds of answers one should be looking for, and the best way to present whatever insights are discovered so that domain practitioners can understand them in their own terms.
“Merely using data isn’t really what we mean by data science,” Loukides writes. “[Data scientists] are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions. They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: ‘here’s a lot of data, what can you make from it?’”
According to experts Loukides interviewed for his article, the best data scientists tend to be physicists and other scientists. “Physicists have a strong mathematical background, computing skills, and come from a discipline in which survival depends on getting the most from the data ...”
It’s too early to tell whether data science will similarly become a distinct discipline, with its own research agenda and educational programs that will train future generations of data scientists, or whether over time it will be absorbed by its parent disciplines.
Analysis is essentially rational decision making and problem solving.
It’s the standard approach underlying management and engineering practice. It involves a relatively linear set of steps and works quite well when you are looking for a solution to a relatively well-defined problem.
But where do the problems come from in the first place? How do you decide what problems to work on and try to solve? This second kind of innovation, which they call interpretation, is very different in nature from analysis. You are not solving a problem but looking for a new insight about customers and the marketplace, a new idea for a product or a service, a new approach to producing and delivering them, or a new business model. Their research showed that interpretive innovation generally takes place through a process of conversations among people and organizations with different backgrounds and perspectives, until the problems can be identified and clarified to the point where a solution can be developed.