Cranking Up To Eleven

Every conference agenda, and nearly every conversation about data, now has an element of big data to it. From Doug Laney's original Three Vs definition of velocity, variety and volume through to today's more multifaceted understanding, debate still rages over the key question regarding big data itself: is it actually happening?
Yes, is the simple answer. It's impossible to argue against the idea that datasets are getting bigger, that they're coming from many different sources, and that the time horizons for processing them are shrinking, whether that's due to technology development or functional demand. But there's also an argument for no. Big data, by its very definition, has always been there from a pure development perspective, in that it's the upper limit of what we can feasibly process in realistic timeframes, and what's beyond the ken of current computing ability.
Relativism and Realism
Taking that explanation into account, big data itself isn't actually a distinct phenomenon, but an intrinsic part of data usage, structured or otherwise. Labeling it as such, some argue, betrays a lack of understanding about the essence of information usage: it will always get bigger, and there will always be a limit. Finite resources and all that.
But neither answer seems particularly correct, and both have an element of facetiousness to their basic descriptions. Yes, datasets get bigger with the iterative expansion of computing power. Yes, what was considered big data even ten years ago (for instance, the organic database of every position across a large investment firm's enterprise, updated with corporate actions and projection logic) can comfortably fit onto most people's iPhones today. But given the level of electronic expansion in the past decade, and the commensurate bloating of data generated, what we currently experience in terms of data growth is far beyond precedent.
It's all relative, of course. The level of data that Google takes in, as an extreme example, is far in excess of what a small brokerage in the south of England may concern itself with. But both experience the same essential requirement: even for the small company, the addition of an extra source of market data and incoming regulations over record keeping present it with a big data issue of processing and storage, even if it's not on a par with the sheer amount of data that Google deals with. The same holds within a single industry. A tier-one investment bank with retail operations will generate an enormous level of data that fits what is becoming the traditional big data paradigm, but a small regional bank will also struggle with increased electronification and the information that it produces.
Living, Breathing
But even the issue becomes the process eventually. Take our coverage of big data at Waters, for instance. A while back, I filed a lengthy story on the basics of big data, and we decided that, as a rule going forward, we would capitalize the words in order to differentiate the idea of big data from the fact that there was a large amount of data. Last week, we decided to revert to non-capitalized forms, as the concept is so widespread now that it almost seems ridiculous to refer to it as a proper noun.
In summary, I fall somewhere in between, as arguably should be the case for a journalist. While I agree that data naturally evolves and expands, I also think that the sharp incline, plotted on the metaphorical graph, can't be ignored. But I'm mainly interested in what you, the end users and the experts, think.
The keen readers among you, finally, may also have noticed a new byline on the staff this past week. Marina Daras, formerly of Private Banker International and Investment Europe, among other titles, joins us in London as our European staff writer. She can be reached at marina.daras@incisivemedia.com.
Copyright Infopro Digital Limited. All rights reserved.