Scientific inquiry is all about finding non-obvious patterns in observational data. It’s no surprise that this is also the core of data science.
Patterns may be obvious to any sentient creature, or they may be deeply invisible - until we invent the conceptual or technological tools to bring them to the surface. The conceptual tools may be groundbreaking paradigm shifts, such as the “thought experiment” that shaped Einstein’s insight into special relativity, or powerful new frameworks of visual notation, such as Feynman’s diagrams of subatomic particle interactions.
Patterns feel ghostly and unreal until we can actually see them, on some level, with our eyes. The chief technological tools are whatever scientists and engineers can use to bring these ghosts to light. In the realm of the subatomic, the magical inventions have been visualization technologies such as the cloud chamber and the scanning tunneling microscope (the latter invented at IBM, by the way).
Most real-world data science serves commercial interests rather than pure science. But the restless search for deep patterns is no less critical in the business wars than among geniuses vying for Nobel Prizes. Today’s data scientists have two broad sets of pattern-sensing tools: advanced visualizations and statistical algorithms. No advanced analytic toolkit is complete without a best-of-breed library of both, with visualizations serving as the core interface for every step of the development, maintenance, and governance process. You will find these complementary technologies - visualizations and algorithms - supported within IBM SPSS Modeler and in companion Big Data platforms, such as IBM Netezza Analytics, IBM InfoSphere BigInsights, and IBM InfoSphere Streams, where data is stored and resource-hungry computations are performed.
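To make that pairing concrete, here is a minimal sketch - in open-source Python with scikit-learn and matplotlib rather than the IBM tools named above - of how the two kinds of pattern-sensing work together: a statistical algorithm (k-means) proposes latent structure in the data, and a visualization (a scatter plot) lets a human eye confirm or reject it. The synthetic data, the cluster count, and the library choices are all illustrative assumptions, not anything drawn from SPSS Modeler itself.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Hypothetical "observational data": three latent groups that are
# invisible in a raw table of 300 rows but obvious once surfaced.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 6.0]])
X = np.vstack([rng.normal(loc=c, scale=0.8, size=(100, 2)) for c in centers])

# The statistical algorithm: k-means proposes a latent cluster structure.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# The visualization: a scatter plot colored by cluster lets a human
# verify whether the structure the algorithm found is real.
plt.scatter(X[:, 0], X[:, 1], c=labels, s=12)
plt.title("Latent clusters surfaced by k-means")
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.show()
```

The division of labor in this toy example mirrors the larger point: the algorithm does the resource-hungry pattern search, and the visualization is the interface through which the analyst judges the result.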