Data Scientists: Illuminate Your Patterns with Pictures

Senior Program Director, Product Marketing, Big Data Analytics

Scientific inquiry is all about finding non-obvious patterns in observational data. It’s no surprise that this is also the core of data science.

Patterns may be obvious to any sentient creature, or they may be deeply invisible - until we invent the conceptual or technological tools to bring them to the surface. The conceptual tools may be groundbreaking paradigm shifts, such as the “thought experiment” that shaped Einstein’s insight into special relativity, or powerful new frameworks of visual notation, such as Feynman’s diagrams of subatomic particle interactions.

Patterns feel ghostly and unreal until we can actually see them, on some level, with our eyes. The chief technological tools are whatever scientists and engineers can use to bring these ghosts to light. In the realm of the subatomic, the magical inventions have been visualization technologies such as the cloud chamber and the scanning tunneling microscope (the latter was invented by IBM, by the way). 

Most real-world data science serves commercial interests, rather than pure science. But the restless search for deep patterns is no less critical in the business wars than among geniuses vying for Nobel Prizes. Today’s data scientists have two broad sets of pattern-sensing tools: advanced visualizations and statistical algorithms. No advanced analytic toolkit is complete without a best-of-breed library of them, with visualizations serving as the core interface at the heart of every step in the development, maintenance, and governance processes. You will find these complementary technologies - visualizations and algorithms - supported within IBM SPSS Modeler and in the companion Big Data platforms, such as IBM Netezza Analytics, IBM InfoSphere BigInsights, and IBM InfoSphere Streams, where data is stored and resource-hungry computations are performed.
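To make the pairing of statistical algorithms and visualizations concrete, here is a minimal, self-contained Python sketch - purely illustrative, using no SPSS Modeler or other IBM API. It buries a linear trend under heavy noise, then uses the Pearson correlation coefficient to quantify the same pattern a scatter plot of the data would reveal at a glance:

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient: how strongly a linear pattern
    links two variables (1.0 = perfect positive linear relationship)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(42)
xs = [float(i) for i in range(200)]
# A hidden linear trend, buried under noise strong enough to hide it
# from casual inspection of the raw numbers.
ys = [0.5 * x + random.gauss(0, 20) for x in xs]

r = pearson_r(xs, ys)
print(f"correlation: {r:.2f}")
```

The point of the pairing: the statistic gives you a number you can threshold and automate, while the picture lets a human sense the pattern - and its outliers - directly.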

Watson leader highlights list of eight new IBM Fellows |  CNET News

For months, IBM’s “Jeopardy” champion computer Watson has been a major PR win for the company, and tonight, its lead developer was awarded Big Blue’s highest technical honor.

At a ceremony in New York, CEO Samuel Palmisano celebrated Watson team leader David Ferrucci and seven other employees as IBM’s newest Fellows. The eight new Fellows join a group of just 209 previous winners, among whom have been the creators of technologies such as DRAM, the scanning tunneling microscope, Fortran, and relational databases.

And while the other seven 2011 winners include scientists and innovators who have broken important new ground in a variety of fields, it is no surprise that IBM put forth Ferrucci as the face of the group.

"It’s a great honor for me," Ferrucci told CNET in an interview today, "and something that oddly enough, I’ve been inspired by since high school."

Ferrucci, whose IBM Grand Challenge project, the Watson supercomputer, gained international renown in February by beating “Jeopardy” champions Ken Jennings and Brad Rutter in a head-to-head match-up, said that he’d found an ad about the IBM Fellows program in one of his father’s magazines when he was a teenager and had taped it to his wall. Where most high school-age boys dream of being the hero in the World Series, Ferrucci seems to have been inspired by dreams of incredible technological successes.

According to an IBM release, the other 2011 Fellows included:

• Bob Blainey, who for “more than two decades…has focused on ensuring not only that software can exploit hardware capabilities optimally, but also that hardware designs evolve to support higher-performing software.”

• Bradford Brooks, who was recognized for “his sustained achievement and leadership regarding IBM’s involvement with complex materials that are used in the electronics and information technology industries.”

• Nagui Halim, whose “technical vision and leadership launched the era of stream computing at IBM.”

• Steve Hunter, who “is a foremost industry expert in networking technologies and networking computing convergence.”

• Stefan Pappe, who leads IBM’s Specialty Service Area for Cloud Services in the company’s global technology services delivery technology and engineering group.

• Renato Recio, who is seen as a “world renowned technical expert in data center networking server IO, network visualization, and related architectures.”

• Wolfgang Roesner, who IBM calls “an expert in verification and [who] has architected the verification tools and methodologies being used across all IBM systems.”


Read more:

IBM Bolsters Scientific Research to Improve Healthcare Quality and Costs (via IBMLabs)

IBM is enlisting some of the company’s leading scientists and technologists to help medical practitioners and insurance companies provide high-quality, evidence-based care to patients. As part of this initiative, IBM is collaborating with clinicians in numerous medical institutions and hiring medical doctors to work alongside its researchers to develop new technologies, scientific advancements, and business processes for healthcare and insurance providers. 

Backed by $100 million over the next three years, the initiative will draw on IBM’s leadership in systems integration, services research, cloud computing, analytics and emerging scientific areas — such as nanomedicine and computational biology — to drive innovations that empower practitioners to focus their efforts on patient care.

FOAK Tales: A Prescription for Prediction (via IBMLabs)

True stories from IBM’s First-of-a-Kind (FOAK) program, which pairs IBM researchers with clients to bring incredible discoveries and possibilities into view. This first episode brings you the wonderful tale about how IBM researchers and clients came together to create an innovative solution for a hospital based on clever stream computing software.

Stream computing - the breakthrough that could make our planet smarter

As our world becomes increasingly instrumented and interconnected, the amount of digital information produced within our systems—transportation, healthcare, governance, etc.—is growing at rates hard to fathom. Stream computing offers tremendous potential to help a variety of industries become more “real world aware”: able to see, respond and even predict instantaneous changes across complex systems.


Stream Computing: For decades we’ve plugged computers into computers. Now we can plug them into the real world.

A neonatal intensive care unit. A buoy tethered deep in the waters of Galway Bay. A space center in Sweden. All three are sites where stream computing is being tested as a powerful new way of processing data.

IBM’s new middleware platform, also known as InfoSphere Streams, can ingest and analyze massive amounts of diverse data in real time and issue predictive bits of intelligence that can help its users make smarter decisions… about caring for critically ill preemies… managing a fragile marine ecosystem… and forecasting disturbances in “space” weather.
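The core idea - analyze each reading the moment it arrives against a small window of recent history, rather than storing everything for later - can be sketched in a few lines of Python. This is an illustrative toy, not the InfoSphere Streams programming model; the vital-sign feed and the alert threshold are invented for the example:

```python
from collections import deque

def monitor(stream, window=10, threshold=3.0):
    """Continuously analyze a stream: keep only a sliding window of
    recent readings and yield an alert whenever a new reading deviates
    sharply from the window's running mean."""
    recent = deque(maxlen=window)
    for reading in stream:
        if len(recent) == window:
            mean = sum(recent) / window
            if abs(reading - mean) > threshold:
                yield ("alert", reading, mean)
        recent.append(reading)

# Simulated vital-sign feed: steady values with one sudden spike.
feed = [98.6] * 20 + [104.2] + [98.6] * 5
alerts = list(monitor(feed))
print(alerts)  # the spike at 104.2 is flagged the moment it arrives
```

Because only the window is retained, memory stays constant no matter how long the stream runs - the property that lets this style of processing keep up with unbounded real-time sources.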

IBM - Stream computing

Remember not so very long ago how amazed you were that Google could return so many useful search results in just a split-second? Today, that kind of speed isn’t enough for many people. The real-time communications inside social networks and microblogging services such as Facebook, FriendFeed, and most of all Twitter have introduced a new immediacy to online interaction and news. Even Google concedes it’s not yet providing an adequate search experience for such real-time streams of information. (via Collecta Launches *Really* Real-Time Search Engine - BusinessWeek)

White paper
IBM InfoSphere Streams: Based on the IBM Research System S Stream Computing System

IBM System S, which was released on Wednesday, has been in the works for more than 20 years. The software uses a new streaming architecture and mathematical algorithms that can analyze thousands of simultaneous data streams in real time. Officials say organizations, like those in the healthcare industry, will benefit from the technology’s ability to help them improve decision-making. Traditional computing models retrospectively analyze stored data and don’t have the ability to continuously process massive amounts of incoming data streams, they say. (via IBM unveils “stream computing” software | Healthcare IT News)
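The “perpetual analytics” idea - continuously refining the answer as additional data is made available - is well illustrated by online algorithms such as Welford’s method, a generic textbook technique (not System S itself) that updates a mean and variance one observation at a time without ever storing the stream:

```python
class RunningStats:
    """Welford's online algorithm: refine mean and variance with each
    new observation, never storing the stream itself."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self.m2 / self.n if self.n else 0.0

stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(x)   # the "answer" is refined as each value arrives
print(stats.mean, stats.variance)  # mean ≈ 5.0, variance ≈ 4.0
```

At every step the current mean and variance are valid answers for the data seen so far - exactly the forward-looking, continuously refined posture the System S description emphasizes, in miniature.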


In the last week, the two most successful technology companies in the world, IBM (IBM) and Google (GOOG) have announced major new products. These are developments that will probably help the firms take business away from their competitors. The scope of the products’ applications is broad enough that the R&D investment to create them must have been extensive. IBM released “stream computing” applications that allow businesses to look at and analyze huge amounts of data in real time. Describing the product, IBM said “System S is built for perpetual analytics — utilizing a new streaming architecture and breakthrough mathematical algorithms, to create a forward-looking analysis of data from any source — narrowing down precisely what people are looking for and continuously refining the answer as additional data is made available.” The ability to have access to that kind of information will undoubtedly be valuable to governments, the financial industry, and large multinationals with thousands of retail outlets. The new software is unique and does not appear to have any direct competition. (via The Reasons Behind Google and IBM Being Ahead of the Competition - TIME)


IBM InfoSphere Streams enables continuous and extremely fast analysis of massive volumes of information-in-motion to help improve business insights and decision making. It is a high-performance computing system that rapidly analyzes information as it streams in from thousands of real-time sources, increasing the speed and accuracy of decision making in fields as diverse as healthcare, astronomy, manufacturing, and financial trading.

IBM - Stream Computing - IBM InfoSphere Streams