To achieve Big Data Singularity - information infrastructure must undergo a drastic redesign
Mo Data stashed this in Big Data Philosophy
http://www.informatica.com/us/potential-at-work/developers/what-would-an-intelligent-data-platform-look-like.aspx
Information that can understand itself and actively help developers could be in your future. Make sure your infrastructure is ready.
We live in a world that’s experiencing an exponential growth of new data sources and data types. So much so that the term “big data singularity” has emerged. This is the hypothetical point at which data enables machines to become more intelligent than their human handlers.
But to unlock the potential of this data, our information infrastructure must undergo a drastic redesign. An intelligent data platform is required to provide a limitless supply of clean, safe, secure, and reliable data. Otherwise, the typical lifecycle for a change request—the building, testing, and deployment—will be error prone, slow, and expensive.
There are two sides to this equation. One is the data itself, the other is the platform. Let’s start with the data itself.
How will we know when data is truly intelligent? For a start, it would have these attributes:
- Trusted. The data is clean, safe, and connected. In other words, it is useful immediately.
- Contextual. The data is rich in metadata and context that’s instantly accessible.
- Helpful. The data will not only tell you about itself, it will help you find additional pieces of information that are related to it, or recommend actions to enrich the data. Imagine the contextual shopping experience of Amazon.com, and then imagine your data as helpful as its recommendation engine.
It’s important to note that these attributes for smart data are part of a vision of the future. For data to be truly intelligent, there would have to be a new universal standard. This standard would ensure that the “trusted, contextual, secure, and helpful” metadata would be passed in a data container between users. In other words, no matter what ecosystem the data resides in, when it’s introduced into a new ecosystem, it will still live in its data container. It will stay “smart” regardless of where it lives.
Now, on to platform. Intelligent “data platforms”—where the data is not necessarily intelligent but the processes around it are—already exist in some form today.
To distinguish whether a data platform is intelligent, take data masking as an example. If an administrator has to decide what fields to mask and set up dynamic masking for those fields, then the platform is not intelligent. If, however, the platform highlights the fields that look like credit card numbers and suggests that those fields become masked, then the platform is intelligent.
In other words, an intelligent data platform relies on automation for tasks both mundane and complicated, and will free up the developer’s time. This way, developers can think more holistically about their project.
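The credit card masking example above can be sketched in a few lines. This is only an illustration of the idea, not how any particular platform does it: the function names (`looks_like_card`, `suggest_fields_to_mask`), the 80% threshold, and the sample rows are all hypothetical, and the use of a length check plus the Luhn checksum to recognize card numbers is an assumption made for the sketch.

```python
import re

# Assumption: card numbers are 13 to 19 digits once spaces and dashes are removed.
CARD_PATTERN = re.compile(r"^\d{13,19}$")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card(value: str) -> bool:
    """Heuristic: right length, digits only, and a valid Luhn checksum."""
    digits = re.sub(r"[ -]", "", value)
    return bool(CARD_PATTERN.match(digits)) and luhn_valid(digits)


def suggest_fields_to_mask(rows, threshold=0.8):
    """Suggest fields where most non-empty values look like card numbers.

    The platform only suggests; the administrator still confirms.
    """
    if not rows:
        return []
    suggestions = []
    for field in rows[0]:
        values = [str(r[field]) for r in rows if r.get(field) not in (None, "")]
        if not values:
            continue
        hits = sum(looks_like_card(v) for v in values)
        if hits / len(values) >= threshold:
            suggestions.append(field)
    return suggestions


# Hypothetical sample data using well-known test card numbers.
sample = [
    {"name": "Ada", "payment_ref": "4111 1111 1111 1111", "order_id": "1001"},
    {"name": "Bo", "payment_ref": "5500-0000-0000-0004", "order_id": "1002"},
]
print(suggest_fields_to_mask(sample))  # ['payment_ref']
```

The point of the sketch is the division of labor: the code inspects the values and proposes fields, rather than waiting for someone to name them.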
With an intelligent data platform, you will prepare, manage, and provision more data, from more systems, in less time than with any other method. You’ll get data that is freely yet securely shared and integrated, cleansed at will, and matched and correlated in real-time. The end result: more agile development and more effective business intelligence.
See what Ray Kurzweil, inventor and futurist, has to say about data’s potential at the Informatica World 2014 keynote address.
I like the concept here of information that can understand itself. However, the statement that "For data to be truly intelligent, there would have to be a new universal standard" feels like an old RDBMS way of thinking. This is the database's view of the world, where all data needs to conform to a single view. As new types of data are continuously being created, they won't necessarily conform to any standard, particularly one that has not yet been invented.
I think the main thing is that there has to be metadata that can be used to approximately map the concept onto other similar concepts.
Imagine a new concept with 1,000 attributes. If 650 of those attributes are in common with another concept, then the two concepts can be said to be similar. (A concept is only the sum of its attributes.) But is there a system that can map all concepts, including heretofore undiscovered attributes, against all other concepts to determine a degree of similarity?
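To make the 650-out-of-1,000 arithmetic concrete, here is a minimal sketch that treats each concept as a set of attribute names. The attribute names, the size of the second concept (assumed here to have 350 attributes of its own), and the choice of two measures (a simple overlap ratio and the Jaccard index) are my assumptions for illustration, not something the article specifies.

```python
def overlap_ratio(a: set, b: set) -> float:
    """Share of a's attributes that also appear in b."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


def jaccard(a: set, b: set) -> float:
    """Shared attributes divided by all attributes seen in either concept."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical concepts: the new one has 1,000 attributes.
new_concept = {f"attr_{i}" for i in range(1000)}
# The existing one shares 650 of them and (assumed) has 350 of its own.
existing = {f"attr_{i}" for i in range(650)} | {f"other_{i}" for i in range(350)}

print(overlap_ratio(new_concept, existing))  # 0.65
print(round(jaccard(new_concept, existing), 4))  # 0.4815
```

This only compares two concepts whose attributes already share names. It does not answer the broader question of mapping every concept, with attributes nobody has catalogued yet, against every other concept.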
Mo-Data has a patent that attempts this:
In a method, system, and computer-readable medium having instructions for semantic matching, a configuration for one or more ontologies is determined with an ontology that has one or more concepts and a representation for the one or more concepts, and the configuration has an assignment of concepts to positions and one or more relationships between concepts in accordance with the representation. The configuration is optimized in accordance with one or more constraints, and a constraint has a relationship defined in a representation for an ontology and a judgment on a similarity of a plurality of concepts from the one or more ontologies, and an estimate is calculated for a similarity between a first concept and a second concept using the configuration.
Nice patent. Information that turns itself into understanding.