Big Data vs. Dark Data
Looking at what the buzzword terms really describe for financial data management

In a story last week, I looked at the use of the term "dark data" and what it means. That is a newer buzzword than "big data," which has been thrown around far more over the past several years.
Often, when "big data" is used in financial services operations discussions, particularly those about data management, it really refers to what is being done with data: how firms mine larger amounts of data, or how they organize it to get the most insight and value out of it.
So this raises the question of whether "big data" is really even an appropriate term. Increased activity around know-your-customer (KYC) data might lead one to think that type of data is rising to the level of "big data," but set against something like the phone call records of even a single major cell service provider, KYC data looks small in volume.
Is there a threshold that KYC data, or financial services industry data as a whole, needs to cross to truly be "big" and not just "medium"? Customer details, cross-referenced with extensive transaction data, might reach that scale in this industry.
Failing that, firms may be better off not getting caught up in "big data" hysteria. It is more effective to know exactly what data you have, how frequently you are getting it and what you are trying to achieve with it, namely the insights you are trying to generate from it.
In effect, dark data and big data are really about the same thing: data management. Dark data describes knowing where all the meaningful data is and how to aggregate it, while big data implies understanding the complete picture of what data is coming in, especially as the volume of that data grows.
To join a discussion on how "big data" should be defined, visit Inside Reference Data's LinkedIn discussion group.