Plumbing the Depths of Unstructured Data
Organizing measures such as collaboration, ‘microdata’ and the application of standards can help firms manage a deep source of information.
![michael-shashoua michael-shashoua](/sites/default/files/styles/landscape_750_463/public/import/IMG/626/345626/michael-shashoua-cutout-580x358.png.webp?itok=UrioykGk)
Can collaborating on and sharing data, while adhering to a standard in data governance plans, help solve the problems of handling unstructured data? The question arises when hearing how firms try to build data foundations that produce higher-quality analytics.
Data governance and analytics can be improved by building mechanisms to share data, says Leif Hanlen, a business development executive at Data61, a data requirements organization that is part of CSIRO, a digital development agency backed by the federal and state governments of Australia.
Yet, however advanced a data sharing system is, the rules that data vendors and exchanges have about their data make it challenging to join such disparate sets together, says John Denheen of Tyler Capital in London. “One of the big issues with getting good reference data is that you need to be able to tack it on to other data sets,” he says.
Before compatible standards can be achieved, firms have to identify the business practices that differ, says Sydney Hassal, a director at Scotiabank. Along with this, standards can vary among jurisdictions, observes Allie Harris of Bank of Montreal.
The US Office of Financial Research (OFR) points to “microdata,” which OFR chief data officer Cornelius Crowley defines as granular, targeted and specific information about entities, instruments, transactions or products. He advises that breaking standards efforts down to individual data elements, such as those he calls microdata, can increase accuracy. In this context, one wonders whether unstructured data can be organized with a “microdata” approach, and what that could mean for following standards.
Analytics Goldmine
Unstructured data is emerging as a “goldmine” for the application of data analytics, CSIRO’s Hanlen says, although it is nothing new—it is akin to “50 million Word documents saved over the last 10 years and put in a box.”
Firms should now understand the two types of unstructured data, says Hanlen. “Unstructured data sitting inside the enterprise—in the customer relationship-management system, in fields called ‘other’—is like a hole in the ground that’s yet to become a goldmine,” he says. External unstructured data from sources such as social media and market surveillance is already being tapped for its golden value, he says.
Analytics can come into play for the former type of unstructured data. “The task for analytics of unstructured data is not to build a brand new goldmine, but to extract elements of information from that unstructured data,” he says.
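As a rough illustration of what extracting “elements of information” from an internal free-text field might look like, the minimal Python sketch below pulls dates, email addresses and currency amounts out of a note. The sample note, the field names and the patterns are all hypothetical assumptions made for illustration, not anything Hanlen or Data61 describes.

```python
import re

# Hypothetical free-text note, standing in for a CRM field called "other".
NOTE = ("Client called 2016-03-14 re: LEI renewal; contact "
        "j.smith@example.com, wants quote in USD 250,000.")

# Illustrative patterns only; real notes would need far more robust rules.
PATTERNS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "amount": re.compile(r"\b[A-Z]{3} \d{1,3}(?:,\d{3})*(?:\.\d+)?\b"),
}


def extract_elements(text):
    """Return every match for each named pattern found in the text."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    for name, values in extract_elements(NOTE).items():
        print(name, values)
```

Each extracted value becomes a small, specific data element that can be validated or joined to other records, while the original note stays untouched.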
CSIRO and Data61 have worked on analyzing payments, documents that predict market outcomes, and price signals based on the likelihood of future events. The challenge Hanlen finds in working with unstructured data that is predictive rather than a measurement is placing that data on a credible spectrum that can be used to anticipate the future or a likely outcome.
Unstructured data may be a goldmine, but it is one that industry experts are just figuring out how to tap, for a variety of purposes. In March, ISITC executive Jeff Zoller pointed to unstructured data about investment behavior and patterns, industry analysts’ commentary, and social and economic behavior as potentially supporting the disruption of data-management technology. In May, MUFG Canada Branch chief information and operations officer Ron Lee, speaking about how his firm is consolidating systems, noted that this effort would increase its capability to work with unstructured databases.
Before any of these more aspirational purposes, however, harnessing unstructured data could be most useful to support the consistent application of data standards. The legal entity identifier (LEI) is ripe for this, as Karla McKenna, head of standards at the Global Legal Entity Identifier Foundation, observed in June, saying entity legal form information—a data element in the LEI repository—remains in “unstructured form.”
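To make the legal form example concrete, the sketch below shows one way free-text legal form entries could be mapped to a small set of consistent labels. It is a minimal sketch under stated assumptions: the label names and the lookup table are hypothetical and are not the code list used in the LEI repository.

```python
import re

# Hypothetical canonical labels; a real effort would map to an agreed code list.
CANONICAL_FORMS = {
    "limited": "LIMITED_COMPANY",
    "ltd": "LIMITED_COMPANY",
    "incorporated": "CORPORATION",
    "inc": "CORPORATION",
    "corporation": "CORPORATION",
    "corp": "CORPORATION",
    "public limited company": "PUBLIC_LIMITED_COMPANY",
    "plc": "PUBLIC_LIMITED_COMPANY",
}


def normalize_legal_form(raw):
    """Map a free-text legal form to a canonical label, or None if unknown."""
    key = re.sub(r"[^a-z ]", "", raw.lower())  # drop punctuation and digits
    key = " ".join(key.split())                # collapse repeated whitespace
    return CANONICAL_FORMS.get(key)


if __name__ == "__main__":
    for entry in [" Ltd. ", "P.L.C.", "Public  Limited Company", "GmbH"]:
        print(repr(entry), "->", normalize_legal_form(entry))
```

Entries that do not match return `None` rather than a guess, so they can be routed to manual review instead of being silently misclassified.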
Copyright Infopro Digital Limited. All rights reserved.