
Gartner Magic Quadrant for Data Integration Tools

Opportunities in the data integration tool market favor breadth of functionality in a well-integrated product set. Offerings that are flexible with regard to time to value, breadth of applicability, cost relative to value and synergy with the rest of data management harness market momentum to capitalize on demand trends.


Market Definition/Description

The data integration tool market comprises vendors that offer software products to enable the construction and implementation of data access and data delivery infrastructure for a variety of data integration scenarios, including:

  • Data acquisition for business intelligence (BI), analytics and data warehousing: Extracting data from operational systems, transforming and merging that data, and delivering it to integrated data structures for analytics purposes. BI and data warehousing remain mainstays of the demand for data integration tools. The variety of data and context for analytics is expanding as emergent repositories, such as Hadoop distributions for supporting big data, in-memory database management systems (DBMSs), and logical data warehouse architectures, increasingly become parts of the information infrastructure.
  • Consolidation and delivery of master data in support of master data management (MDM): Enabling the consolidation and rationalization of the data representing critical business entities, such as customers, products and employees. MDM may or may not be subject-based, and data integration tools can be used to build the data consolidation and synchronization processes that are key to success.
  • Data migrations/conversions: Although traditionally addressed most often via the custom coding of conversion programs, data integration tools are increasingly addressing the data movement and transformation challenges inherent in the replacement of legacy applications and consolidation efforts during mergers and acquisitions.
  • Synchronization of data between operational applications: In a similar concept to each of the previous scenarios, data integration tools provide the ability to ensure database-level consistency across applications, both on an internal and an interenterprise basis (for example, involving data structures for software as a service [SaaS] applications or cloud-resident data sources), and in a bidirectional or unidirectional manner.
  • Interenterprise data sharing: Organizations are increasingly required to provide data to, and receive data from, external trading partners (customers, suppliers, business partners and others). Data integration tools are relevant in addressing these challenges, which often consist of the same types of data access, transformation and movement components found in other common use cases.
  • Delivery of data services in a service-oriented architecture (SOA) context: Although this is an architectural technique rather than a data integration use in itself, data services represent an emerging trend for the role and implementation of data integration capabilities within SOAs. Data integration tools will increasingly enable the delivery of many types of data services.

Gartner has defined multiple classes of functional capability that vendors of data integration tools provide to deliver optimal value to organizations in support of a full range of data integration scenarios:

  • Connectivity/adapter capabilities (data source and target support): The ability to interact with a range of different types of data structure, including:
    • Relational databases
    • Legacy and nonrelational databases
    • Various file formats
    • XML
    • Packaged applications, such as CRM and supply chain management
    • SaaS and cloud-based applications and sources
    • Industry-standard message formats, such as electronic data interchange (EDI), SWIFT and Health Level Seven International (HL7)
    • Externalized parallel distributed processing (such as Hadoop Distributed File System [HDFS] and other NoSQL-type repositories)
    • Message queues, including those provided by application integration middleware products and standards-based products (such as Java Message Service [JMS])
    • Data types of a less structured nature, such as social media, email, websites, office productivity tools and content repositories
    • Emergent sources, such as data on in-memory DBMSs, mobile platforms and spatial applications
  • Data integration tools must support different modes of interaction with this range of data structure types, including:
    • Bulk acquisition and delivery
    • Granular trickle-feed acquisition and delivery
    • Changed data capture (CDC) — the ability to identify and extract modified data (a minimal sketch follows this list)
    • Event-based acquisition (time-based or data-value-based)
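
To make the CDC mode concrete, the following is a minimal Python sketch of timestamp-based change extraction. The orders table, its last_modified column and the watermark value are illustrative assumptions; production tools more often identify changes by reading the DBMS transaction log rather than by polling timestamps.

```python
# Minimal sketch of timestamp-based changed data capture (CDC).
# The "orders" table and "last_modified" column are illustrative;
# production tools more often read the DBMS transaction log.
import sqlite3

def extract_changes(conn, watermark):
    """Return only rows modified since the last extraction watermark."""
    cursor = conn.execute(
        "SELECT id, status, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY last_modified",
        (watermark,),
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, last_modified TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "shipped", "2013-07-01T10:00:00"),
    (2, "new", "2013-07-02T09:30:00"),
])

# Only the row changed after the stored watermark is extracted and delivered.
print(extract_changes(conn, "2013-07-01T12:00:00"))
# [(2, 'new', '2013-07-02T09:30:00')]
```
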
  • Data delivery capabilities: The ability to provide data to consuming applications, processes and databases in a variety of modes, including:
    • Physical bulk data movement between data repositories
    • Federated views formulated in memory (a short sketch follows below)
    • Message-oriented movement via encapsulation
    • Replication of data between homogeneous or heterogeneous DBMSs and schemas
  • In addition, support for the delivery of data across the range of latency requirements is important, including:
    • Scheduled batch delivery
    • Streaming/near-real-time delivery
    • Event-driven delivery of data based on identification of a relevant event
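
To illustrate the federated delivery mode mentioned above, the sketch below joins two illustrative in-memory sources, standing in for a packaged CRM application and a relational billing database, into a single view formulated at query time with no physical movement of data. All source and field names are assumptions for the example.

```python
# Minimal sketch of an in-memory federated view: data from two
# heterogeneous sources is joined at query time, with no physical
# data movement. Source and field names are illustrative.

crm_customers = [  # stand-in for rows from a packaged CRM application
    {"customer_id": 1, "name": "Acme Corp"},
    {"customer_id": 2, "name": "Globex"},
]
billing_balances = {  # stand-in for rows from a relational billing DBMS
    1: 1250.00,
    2: 0.00,
}

def federated_customer_view():
    """Yield one unified row per customer, joined in memory on demand."""
    for customer in crm_customers:
        yield {
            "customer_id": customer["customer_id"],
            "name": customer["name"],
            "balance": billing_balances.get(customer["customer_id"], 0.0),
        }

for row in federated_customer_view():
    print(row)
```
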
  • Data transformation capabilities: Built-in capabilities for achieving data transformation operations of varying complexity, including:
    • Basic transformations, such as data type conversions, string manipulations and simple calculations
    • Intermediate-complexity transformations, such as lookup and replace operations, aggregations, summarizations, deterministic matching, and the management of slowly changing dimensions
    • Complex transformations, such as sophisticated parsing operations on free-form text and rich media
  • In addition, the tools must provide facilities for developing custom transformations and extending packaged transformations, as the sketch below illustrates.
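
As a rough illustration of these transformation tiers, the following sketch chains a basic type conversion, an intermediate lookup-and-replace against a reference table, and a custom transformation of the kind a developer would register as an extension. All field names, lookup values and business rules are illustrative.

```python
# Minimal sketch of transformations of varying complexity applied in
# sequence to a record. All names and rules are illustrative.

COUNTRY_LOOKUP = {"US": "United States", "DE": "Germany"}  # reference table

def basic_transform(record):
    # Basic: data type conversion and string manipulation.
    record["amount"] = float(record["amount"])
    record["name"] = record["name"].strip().title()
    return record

def lookup_transform(record):
    # Intermediate: lookup-and-replace against the reference table.
    record["country"] = COUNTRY_LOOKUP.get(record["country"], record["country"])
    return record

def custom_tier_transform(record):
    # Custom extension point: tools typically let developers register
    # functions like this alongside the packaged transformations.
    record["tier"] = "gold" if record["amount"] > 1000 else "standard"
    return record

record = {"name": "  acme corp ", "amount": "1500.50", "country": "US"}
for step in (basic_transform, lookup_transform, custom_tier_transform):
    record = step(record)
print(record)
# {'name': 'Acme Corp', 'amount': 1500.5, 'country': 'United States', 'tier': 'gold'}
```
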
  • Metadata and data modeling capabilities: As the increasingly important heart of data integration capabilities, metadata management and data modeling requirements include:
    • Automated discovery and acquisition of metadata from data sources, applications and other tools
    • Discerning relationships between data models and business process models
    • Data model creation and maintenance
    • Physical to logical model mapping and rationalization
    • Defining model-to-model relationships via graphical attribute-level mapping
    • Lineage and impact analysis reporting, in graphical and tabular formats (a minimal lineage sketch follows this list)
    • An open metadata repository, with the ability to share metadata bidirectionally with other tools
    • Automated synchronization of metadata across multiple instances of the tools
    • Ability to extend the metadata repository with customer-defined metadata attributes and relationships
    • Documentation of project/program delivery definitions and design principles in support of requirements definition activities
    • Business analyst/end-user interface to view and work with metadata
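
Lineage and impact analysis can be pictured as traversal of a directed graph held in the metadata repository. The sketch below uses illustrative dataset names and table-level mappings; real repositories typically track mappings down to the attribute level.

```python
# Minimal sketch of lineage and impact analysis over a metadata
# repository, modeled as a directed graph of target -> direct sources.
# Dataset names are illustrative.

lineage = {
    "sales_dw.fact_orders": {"staging.orders", "staging.customers"},
    "staging.orders": {"erp.orders"},
    "staging.customers": {"crm.customers"},
}

def upstream(dataset, graph):
    """Recursively collect every source feeding a dataset (lineage)."""
    sources = set()
    for parent in graph.get(dataset, set()):
        sources.add(parent)
        sources |= upstream(parent, graph)
    return sources

def downstream(dataset, graph):
    """Collect every dataset affected by a change (impact analysis)."""
    affected = set()
    for target, parents in graph.items():
        if dataset in parents:
            affected.add(target)
            affected |= downstream(target, graph)
    return affected

print(upstream("sales_dw.fact_orders", lineage))  # full lineage
print(downstream("erp.orders", lineage))          # impact of a schema change
```
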
  • Design and development environment capabilities: Facilities for enabling the specification and construction of data integration processes, including:
    • Graphical representation of repository objects, data models and data flows
    • Workflow management for the development process, addressing requirements such as approvals and promotions
    • Granular, role-based and developer-based security
    • Team-based development capabilities, such as version control and collaboration
    • Functionality to support reuse across developers and projects, and to facilitate the identification of redundancies
    • Support for testing and debugging (a small example follows this list)
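
As a small example of what test support looks like in practice, the following sketch unit-tests an illustrative transformation with Python's standard unittest module, the kind of check a promotion workflow might require before code moves between environments.

```python
# Minimal sketch of test support: a unit test over an illustrative
# transformation, using Python's standard unittest module.
import unittest

def normalize_phone(raw):
    """Transformation under test: keep only the digits of a phone number."""
    return "".join(ch for ch in raw if ch.isdigit())

class TransformationTests(unittest.TestCase):
    def test_normalize_phone(self):
        self.assertEqual(normalize_phone("(555) 123-4567"), "5551234567")

    def test_empty_input(self):
        self.assertEqual(normalize_phone(""), "")

if __name__ == "__main__":
    unittest.main()
```
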
  • Data governance support capabilities (via interoperation with data quality, profiling and mining capabilities): Mechanisms to work with related capabilities to help the understanding and assurance of data quality over time, including interoperability with:
    • Data profiling tools (a brief profiling sketch follows this list)
    • Data mining tools
    • Data quality tools
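
A brief sketch of the kind of profiling output exchanged through such interoperation: per-column null counts, distinct counts and a completeness ratio, computed here in plain Python over illustrative rows.

```python
# Minimal sketch of profiling statistics a data integration tool might
# exchange with profiling and quality tools. Rows are illustrative.

rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 2, "email": "b@example.com"},
]

def profile(rows, column):
    """Compute simple per-column quality metrics."""
    values = [row[column] for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "column": column,
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "completeness": len(non_null) / len(values),
    }

for column in ("customer_id", "email"):
    print(profile(rows, column))
```
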
  • Deployment options and runtime platform capabilities: Breadth of support for the hardware and operating systems on which data integration processes may be deployed, and the choices of delivery model; specifically:
    • Mainframe environments, such as IBM z/OS and z/Linux
    • Midrange environments, such as IBM System i (formerly AS/400) or HP Tandem
    • Unix-based environments
    • Windows environments
    • Linux environments
    • Traditional on-premises (at the customer site) installation and deployment of software
    • Cloud deployment support, as a multitenant implementation that nonetheless requires organizations to deploy the software in cloud infrastructure themselves
    • Platform as a service (provided and consumed as a cloud service where clients don't need to deploy software in cloud infrastructure)
    • In-memory infrastructure
    • Server virtualization (support for shared, virtualized implementations)
    • Parallel distributed processing (such as Hadoop and MapReduce)
  • Operations and administration capabilities: Facilities for enabling adequate ongoing support, management, monitoring and control of the data integration processes implemented via the tools, such as:
    • Error handling functionality, both predefined and customizable (illustrated in the sketch after this list)
    • Monitoring and control of runtime processes, both via functionality in the tools and interoperability with other IT operations technologies
    • Collection of runtime statistics to determine use and efficiency, as well as an application-style interface for visualization and evaluation
    • Security controls, for both data in flight and administrator processes
    • A runtime architecture that ensures performance and scalability
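
The sketch below illustrates customizable error handling combined with runtime statistics collection around a delivery step: rows that fail transformation are routed to a reject list rather than failing the whole job, and counters are accumulated for a monitoring interface to visualize. The reject-row policy and counter names are illustrative, not any specific product's behavior.

```python
# Minimal sketch of customizable error handling plus runtime statistics
# around a delivery step. The reject-row policy and counter names are
# illustrative, not a specific product's behavior.
import time

stats = {"rows_in": 0, "rows_out": 0, "rows_rejected": 0, "elapsed_s": 0.0}
rejects = []  # bad rows routed aside instead of failing the whole job

def deliver(rows, transform):
    start = time.monotonic()
    for row in rows:
        stats["rows_in"] += 1
        try:
            yield transform(row)
            stats["rows_out"] += 1
        except (ValueError, KeyError) as exc:
            stats["rows_rejected"] += 1
            rejects.append((row, repr(exc)))
    stats["elapsed_s"] = time.monotonic() - start

output = list(deliver([{"amount": "10"}, {"amount": "bad"}],
                      lambda r: {"amount": float(r["amount"])}))
print(output)   # [{'amount': 10.0}]
print(stats)    # counters a monitoring interface would visualize
print(rejects)  # the row that failed conversion, with its error
```
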
  • Architecture and integration capabilities: The degree of commonality, consistency and interoperability between the various components of the data integration toolset, including:
    • A minimal number of products (ideally one) supporting all data delivery modes
    • Common metadata (a single repository) and/or the ability to share metadata across all components and data delivery modes
    • A common design environment to support all data delivery modes
    • The ability to switch seamlessly and transparently between delivery modes (bulk/batch vs. granular real-time vs. federation) with minimal rework; the sketch after this list shows the idea
    • Interoperability with other integration tools and applications, via certified interfaces and robust APIs
    • Efficient support for all data delivery modes, regardless of runtime architecture type (centralized server engine versus distributed runtime)
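
A minimal sketch of what such mode independence buys in practice: the same transformation logic, written once, is reused unchanged by a scheduled bulk/batch run and by granular event-at-a-time delivery. Function and field names are illustrative.

```python
# Minimal sketch of delivery-mode independence: one transformation,
# reused unchanged for bulk/batch and for real-time delivery.
# Function and field names are illustrative.

def transform(record):
    """Shared logic, written once, independent of delivery mode."""
    return {**record, "amount": float(record["amount"])}

def run_batch(records):
    # Bulk/batch mode: process a full extract in one scheduled run.
    return [transform(r) for r in records]

def run_streaming(event_source):
    # Granular real-time mode: apply the identical logic per event.
    for event in event_source:
        yield transform(event)

print(run_batch([{"amount": "1"}, {"amount": "2"}]))
print(list(run_streaming(iter([{"amount": "3"}]))))
```
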
  • Service enablement capabilities: As acceptance of data service concepts continues to grow, data integration tools must exhibit service-oriented characteristics and provide support for SOA deployments, such as:
    • The ability to deploy all aspects of runtime functionality as data services (a minimal example follows this list)
    • Management of publication and testing of data services
    • Interaction with service repositories and registries
    • Service enablement of development and administration environments, so that external tools and applications can dynamically modify and control the runtime behavior of the tools
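
As a minimal illustration of service enablement, the sketch below exposes an integrated customer lookup as an HTTP data service using only the Python standard library. The endpoint path, payload shape and data are assumptions for the example, not any product's API.

```python
# Minimal sketch of service enablement: an integrated customer lookup
# exposed as an HTTP data service using only the standard library.
# The endpoint and payload shapes are illustrative assumptions.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

CUSTOMERS = {"1": {"name": "Acme Corp", "status": "active"}}

class DataServiceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g., GET /customers/1 returns the integrated customer record
        parts = self.path.strip("/").split("/")
        record = (CUSTOMERS.get(parts[1])
                  if len(parts) == 2 and parts[0] == "customers" else None)
        body = json.dumps(record if record else {"error": "not found"}).encode()
        self.send_response(200 if record else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), DataServiceHandler).serve_forever()
```
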


Magic Quadrant

Figure 1. Magic Quadrant for Data Integration Tools

Source: Gartner (July 2013)
