Open Platform: To Tame the ‘Zoo’ of Market Data, First Tame the Metadata Monster
Over recent years, the quantity and complexity of the data a financial institution needs to operate effectively have increased dramatically, and it is reasonable to assume that demand for data will continue to grow at an accelerating rate.
Some of the factors driving this growth are already clear: the tightened regulatory regimes being designed post-credit crunch (Dodd-Frank, EMIR); the possible (if not inevitable) structural separation of retail and investment banking; the continuing globalization of the financial markets; and the increasingly pervasive nature of social media as an information source. All of these influences will result in more data and increasingly complex data requirements.
This is a multi-faceted problem for financial institutions: how to conform to a new regulatory data regime, take advantage of more sources of data, and turn potentially vast heaps of unstructured and unrelated data into useful information. In addition, data must flow between all parties involved in the financial process—from a client to an outsourced administrator, through to existing and new reporting bodies. More interfaces will be required, resulting in more data flows.
For players in the financial markets, all these factors point to escalating costs as the challenge of managing data grows. Let’s examine just one small part of this problem: the onboarding of clients by a third-party administrator. Today this can already be a lengthy and costly exercise, and it is unlikely to get easier or cheaper in the future unless firms actively take steps to tame the problem. But how do you tame something like data, which is rapidly moving from being a semi-domesticated animal to a feral, free-roaming beast?
In particle physics, the term “particle zoo” is used colloquially to describe the extensive list of known elementary particles, which can look like hundreds of species in a zoo. Yet the particles in the “zoo” turned out to share a common ancestry. In a similar fashion, the ever-increasing number of data sources and the growing complexity of data can present a confusing landscape from both a business and a technology perspective. To control this “data zoo,” one must first fully “comprehend” it, both in the sense of understanding how it is organized at a basic level and in the sense of understanding what it encompasses. This order and control is achieved by identifying, analyzing and recording the data’s meaning and behavior in a store: in other words, a metadata repository (or perhaps a “cage” in the “data zoo”).
To take a specific example, from both a client’s and an outsourced service provider’s perspective, there are significant initial and ongoing costs in onboarding and maintaining a client’s book of business. The initial costs typically arise in understanding a client’s data thoroughly, from both a semantic and an end-to-end flow perspective. Given the disparate and heterogeneous nature of client front-end systems, data formats and the services being provided, this can be a significant undertaking. The services provided can include trade validation, trade enrichment, matching, settlement, custody, accounting (on occasion, across multiple accounting platforms) and client reporting. Each of the systems in a particular business function will invariably have its own particular understanding of data and its own specific data formats. Further, any mapping and transformation documentation (where it exists at all) typically consists of manually produced and maintained spreadsheets, which more often than not contain errors and are out of date. In essence, a large element of the onboarding cost arises from poor documentation and the lack of accessible descriptions of client data, internal data, and data flows.
A vital part of addressing the problems outlined above is to implement a metadata repository. The primary function of this repository should be to consume data models such as messaging formats, data structures and report contents, along with the associated mappings, transformations and any business rules applied as data moves between these structures. A second function is to hold business descriptions and knowledge about the data: recording what the data means, how it flows from place to place, and how it is transformed as it moves enables users to make use of the data more easily and consistently, without being distracted by the technical details of how or where it is stored or represented.
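To make the idea concrete, below is a minimal sketch of the kinds of records such a repository might hold, written in Python purely for illustration. The names (FieldDefinition, DataModel, Mapping, MetadataRepository) and structure are assumptions for this sketch, not references to any particular product or standard.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: all names here are hypothetical,
# not drawn from any specific metadata-repository product.

@dataclass
class FieldDefinition:
    """One field in a message format, data structure or report."""
    name: str              # technical name, e.g. "TradeDate"
    data_type: str         # e.g. "date", "decimal", "string"
    business_meaning: str  # plain-language description for users

@dataclass
class DataModel:
    """A message format, data structure or report content."""
    name: str
    fields: list[FieldDefinition] = field(default_factory=list)

@dataclass
class Mapping:
    """How one field moves between two data models, including any
    transformation or business rule applied on the way."""
    source_model: str
    source_field: str
    target_model: str
    target_field: str
    transformation: str = "copy"   # e.g. "copy", "reformat date"
    business_rule: str = ""        # e.g. "settled trades only"

@dataclass
class MetadataRepository:
    """The 'cage' in the data zoo: models plus the mappings between them."""
    models: dict[str, DataModel] = field(default_factory=dict)
    mappings: list[Mapping] = field(default_factory=list)

    def register_model(self, model: DataModel) -> None:
        self.models[model.name] = model

    def add_mapping(self, mapping: Mapping) -> None:
        self.mappings.append(mapping)

    def lineage(self, model: str, field_name: str) -> list[Mapping]:
        """All mappings feeding a given field: where its data comes
        from and how it was transformed along the way."""
        return [m for m in self.mappings
                if m.target_model == model and m.target_field == field_name]
```

Once mappings are captured this way, a question such as “where does this field come from, and what was done to it on the way?” becomes a simple lookup rather than an exercise in spreadsheet archaeology.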
Halving Onboarding Costs
Taking this metadata-centric approach, firms can substantially reduce the time, and as a result the cost, of initially onboarding clients from a transaction-processing perspective. We estimate this approach can cut onboarding time and cost by between 50 and 70 percent. The benefit arises primarily from having a clear and accurate understanding of internal data and data flows, and of how these interact with a client’s data.
To achieve this, firms would incur an initial overhead of capturing the internal data flows within the metadata repository, but this is a one-off exercise. Once completed, the results are re-usable for many clients, and the repository starts to yield a host of additional analysis and reporting benefits.
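Continuing the hypothetical sketch above, onboarding a new client then largely reduces to registering the client’s feed and mapping it onto the internal model that was captured once up front. The model and field names below are invented for illustration.

```python
# Hypothetical usage of the sketch above. The internal "SettlementSystem"
# model is captured once; each new client is then mapped onto it.
repo = MetadataRepository()
repo.register_model(DataModel("SettlementSystem", [
    FieldDefinition("settle_date", "date", "Contractual settlement date"),
]))
repo.register_model(DataModel("NewClientFeed", [
    FieldDefinition("SettlDt", "string", "Settlement date as YYYYMMDD"),
]))
repo.add_mapping(Mapping("NewClientFeed", "SettlDt",
                         "SettlementSystem", "settle_date",
                         transformation="parse YYYYMMDD to ISO-8601"))

# Lineage report: where does settle_date come from, and how is it derived?
for m in repo.lineage("SettlementSystem", "settle_date"):
    print(f"{m.source_model}.{m.source_field} -> "
          f"{m.target_model}.{m.target_field} ({m.transformation})")
```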