Artificial Intelligence, Real Intel: Data Vendors Make Strides on AI
Data companies are investing in AI and machine learning in a concerted push toward innovations that reveal insights and alpha.
After largely lagging behind other industries in adopting artificial intelligence (AI) and machine-learning technology, the financial markets are playing catch-up with a vigor that borders on obsession. However, some of the major financial information providers that collect, manage and disseminate massive amounts of information and data have been developing and applying these technologies—though often behind the scenes—for several years.
Over the past two years in particular, there has been a marked uptick in research and development activities, with vendors using machine-learning technology to create efficiencies in their internal processes and to improve the user experience for their clients. They are also enlisting a growing army of engineers and data scientists, launching machine-learning education programs for staff internally, and this year, S&P Global paid more than half a billion dollars to acquire machine-learning technology startup Kensho—reportedly the largest price paid for an AI company to date.
Despite the current focus, AI is not new. Indeed, as a scientific discipline, AI has existed for more than 60 years, but has experienced a renaissance in recent years due to a number of factors, including an exponential increase in computer processing power; the declining price and growing convenience of data storage solutions, such as the cloud; and, of course, the explosion of data and proliferation of “free” information—much of it unstructured—and the corresponding need to process and understand it. Thomson Reuters reports that it now processes and collects more data in a single day than it did in a month just five years ago.
Valuable Tools
In an environment of high-volume, time-sensitive news and social media stories, and with the unstructured nature of information—it is estimated that 80 percent of all data within the financial industry is unstructured—AI technologies and machine-learning techniques have become valuable tools for financial information providers, helping them process data and extract actionable information and value-added meaning for customers.
At Thomson Reuters, which began phasing in AI and machine-learning technologies 25 years ago, these technologies are already driving innovation and serve as the engines underlying many of its products and services.
In 2016, the data giant created a Center for AI and Cognitive Computing (CAICC), composed of scientists, engineers and designers focused on the development of smart applications through the extension of natural-language processing, machine learning, information retrieval, text analytics and human–computer interactions. CAICC partners with internal teams, customers and third parties, including start-ups and academics, to prototype and validate new solutions. It is located at Thomson Reuters’ Technology Center in Toronto and is a branch of the much larger Thomson Reuters research and development (R&D) group.
Together, the groups aim to transform “knowledge work” by developing new capabilities and tools that address specific customer challenges, and to identify opportunities that could be enabled by AI and machine learning—including how Thomson Reuters consumes and analyzes a firehose of data from news, markets and social media, and how it enhances and organizes content.
“The R&D group drives innovation by creating new capabilities. So if a product or a service is in the making for future release, and the product planners discover that there’s a vital piece of technology missing that they cannot just license from somewhere else because it doesn’t exist yet—and I’m talking about [something that is] ahead of the curve, maybe three to five years—they come to us with a request. They collaborate with our teams by providing domain expertise, their data assets, and the requirements of what exactly is needed, and we provide the scientific and development expertise to create something that’s new,” says Jochen Leidner, director of research at Thomson Reuters in London. “We can modify existing algorithms and machine-learning models, or extend them or create entirely new things. We then explore the properties of what we created by trying them out on Thomson Reuters’ data assets and quantify the success of our efforts by measuring the accuracy of what we develop.”
The group regularly files patent applications and publishes scientific papers to disseminate its technology innovations, and transfers that new-found knowledge back to its business units through its internal project partners, essentially making the group an internal service provider, Leidner adds.
Accuracy and Risk Exposure
A recent development from CAICC is Reuters News Tracer, an AI-powered platform used by Reuters’ journalists that detects newsworthy events breaking on Twitter and rates the likelihood of them being true.
Two years in the making, Reuters News Tracer harnesses the power of cognitive computing and machine learning by running algorithms on a percentage of Twitter’s 700 million daily tweets. Its premise is to point journalists to events as they are reported around the world, filter out “noise,” such as chat or spam, and assess the veracity of this reporting. Reuters journalists taught the tool to ask key questions, consult historical data, and weigh relevance just like a human would, but within 40 milliseconds, giving them a reporting head start. The journalists then independently verify the information through their own channels and reporting, before publishing.
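The pipeline described above—filter out noise, then score the likely veracity of what remains—can be sketched in miniature. Everything below is invented for illustration: the feature names, weights, and spam markers are hypothetical stand-ins, and the real News Tracer uses trained models rather than hand-set rules.

```python
# Hypothetical sketch of noise filtering plus a 0-1 veracity score for a
# breaking-news tweet. All features, weights, and markers here are invented;
# production systems learn these signals from labeled data.

def is_noise(text: str) -> bool:
    """Crude noise filter: drop chat/spam before veracity scoring."""
    spam_markers = ("follow me", "giveaway", "click here")
    return any(marker in text.lower() for marker in spam_markers)

def veracity_score(tweet: dict) -> float:
    """Combine simple trust signals into a score capped at 1.0."""
    score = 0.0
    score += 0.3 if tweet.get("author_verified") else 0.0
    score += 0.3 if tweet.get("has_link_to_known_outlet") else 0.0
    # Independent reports of the same event raise confidence.
    score += min(tweet.get("corroborating_reports", 0), 4) * 0.1
    return min(score, 1.0)

tweet = {"text": "Explosion reported near refinery", "author_verified": True,
         "has_link_to_known_outlet": False, "corroborating_reports": 3}
if not is_noise(tweet["text"]):
    print(round(veracity_score(tweet), 2))  # 0.6
```

The key design point the article highlights survives even in this toy: the score is advisory, and a human journalist still verifies before publishing.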
“Reuters News Tracer is a fine example of the synergy between machines and human experts,” Leidner says. “We are not automating away the journalists; rather, we are supplementing them with machine-learning tools that make them more productive.”
The tool is also live with some of the vendor’s financial customers. Thomson Reuters defines “channels” with market-moving potential or relevance to financial clients, and the clients consume that live stream without any human intervention, with a veracity score to determine the likelihood of accuracy.
Another example is Media Check, the recently launched media screening component within Thomson Reuters’ World-Check One financial crime and risk monitoring platform, which uses machine learning capabilities to filter potentially relevant negative mentions in news and text data relating to companies or individuals.
Media Check aggregates 11,600 print and online media sources, and allows users to search for individuals, companies or vessels to help identify potential instances of, or links to, money laundering, theft, fraud and cybercrime, as well as politically exposed persons. Media Check leverages Thomson Reuters Intelligent Tagging—proprietary machine learning-based algorithms that process and tag unstructured content—and a financial crime-based taxonomy.
“If an individual or a company is exposed to certain types of risks—regulatory risks, criminal risk, environmental risk, reputational risk, and so on—our machine-learning model will extract from a news story the kind of risks that company is exposed to and link it to the entity itself,” Leidner says.
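The extraction Leidner describes—pulling risk categories out of a story and attaching them to the entity—can be sketched with a toy taxonomy. The categories and keyword cues below are invented; Intelligent Tagging uses trained models and a far richer financial-crime taxonomy, not keyword lists.

```python
# Minimal, hypothetical sketch of taxonomy-based risk tagging: scan a story
# for risk-category cues and link matched categories to a named entity.
# The taxonomy and cue words are invented for illustration only.

RISK_TAXONOMY = {
    "regulatory": ["fined", "sanction", "probe"],
    "criminal": ["fraud", "money laundering", "theft"],
    "environmental": ["spill", "pollution", "emissions"],
}

def tag_risks(story: str, entity: str) -> dict:
    """Return the entity with every risk category whose cue appears in the story."""
    text = story.lower()
    risks = sorted(cat for cat, cues in RISK_TAXONOMY.items()
                   if any(cue in text for cue in cues))
    return {"entity": entity, "risks": risks}

print(tag_risks("Acme Corp fined over chemical spill", "Acme Corp"))
# {'entity': 'Acme Corp', 'risks': ['environmental', 'regulatory']}
```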
In many cases, the machine-learning work that the R&D group and the CAICC do is “under the hood,” but the impact is better results for Thomson Reuters customers, which leads to enhanced productivity, he adds.
“For example, if we build a better search engine or a better question-answering system, then it makes our customers more productive. They don’t have to sift through a hierarchy of menus anymore, and there is an increasing demand on professional services to provide the same convenience that our customers are used to as consumers. In the consumer space, we have all been spoiled: Responses need to be fast, relevant and accurate. That same pressure applies in the financial services industry, and that’s why we need to conduct applied research in the verticals that we operate in, to provide the same high bars in terms of accuracy, relevance, and speed,” Leidner says.
Thomson Reuters is banking so much on the power of AI and machine learning that it has doubled the size of its scientific teams over the past two years, which continue to grow. “We are hiring additional scientific staff and seeking to expand the R&D team further, and the Toronto base [and the AI Center], which is creating 400 new technology jobs by the end of this year, is a sign of that,” he adds.
Expansion
Thomson Reuters is not alone in its efforts. Bloomberg has a dedicated machine-learning engineering group that has been expanding since its formation eight years ago and is on a recruitment drive today. Gary Kazantsev, head of the group, which consists of scientists, researchers and software engineers in London and New York, says machine learning has become such an important area at Bloomberg that the company runs an academic grant program to support machine-learning research at universities around the world.
“We also have positions for doctoral researchers. We take in people from academia to work with us on problems, and we run our own internal educational programs, which we started formally last year, to educate more of our engineers and product managers about the methods of machine learning, natural-language processing, and artificial intelligence. This has been very successful. Currently, we have five course offerings, which are very much like college courses,” Kazantsev says, adding that Bloomberg plans to make one of those courses available online for financial professionals with a strong mathematics background to learn more about machine learning.
Bloomberg has been implementing and investing in AI since hiring Kazantsev 10 years ago. The aim of the machine-learning team is to derive intelligence and insight from the massive amounts of data, financial information and news stories carried over Bloomberg’s network to benefit the 325,000 professionals who subscribe to Bloomberg’s terminal. His team also works to enhance Bloomberg’s enterprise products that are sold as inputs into strategies for black box trading, risk analysis and other automated client workflows, and to improve the efficiency of internal processes in terms of processing, organizing and collating the data coming in.
Over the years, it has enhanced the terminal’s question-and-answering capabilities, providing users with answers to complicated financial questions based on Bloomberg’s data resources. The team’s work enables subscribers to type questions in plain English rather than code—for instance, they could type “Which Chinese companies in the steel industry have had dividend yield of more than 5 percent last year?”—and receive real-time results, which Kazantsev says represents a huge gain in productivity for clients. Other areas of focus include sentiment analysis of news and financial filings, market impact indicators, social media analysis, topic clustering, and predictive models of market behavior.
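The essence of that natural-language screening—turning a plain-English question into a structured filter over a security universe—can be shown with a toy parser. The regex, the tiny company table, and the field names are all invented; Bloomberg’s actual pipeline is a full NLP system, not pattern matching.

```python
# Hypothetical sketch: parse one screening constraint out of an English query
# and apply it to a toy universe. Data and parsing rules are invented.
import re

COMPANIES = [
    {"name": "SteelCo A", "country": "CN", "industry": "steel", "div_yield": 6.1},
    {"name": "SteelCo B", "country": "CN", "industry": "steel", "div_yield": 3.2},
    {"name": "SteelCo C", "country": "US", "industry": "steel", "div_yield": 7.0},
]

def screen(query: str, universe):
    """Extract a dividend-yield threshold and a country constraint, then filter."""
    m = re.search(r"dividend yield of more than (\d+(?:\.\d+)?) percent", query)
    threshold = float(m.group(1)) if m else 0.0
    want_cn = "chinese" in query.lower()
    return [c["name"] for c in universe
            if c["div_yield"] > threshold and (not want_cn or c["country"] == "CN")]

q = ("Which Chinese companies in the steel industry have had "
     "dividend yield of more than 5 percent last year?")
print(screen(q, COMPANIES))  # ['SteelCo A']
```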
In fact, sentiment analysis—which involves the application of machine-learning techniques to identify a news story or tweet as being relevant for an individual stock ticker, and to assign a sentiment score to each story or tweet in the feed—is something that Bloomberg has been developing and enhancing for almost a decade.
“When I worked on the first sentiment analysis project, it took somewhere between six months and a year to get to a point where we were happy with it,” Kazantsev says. “Last year, following big investments by Bloomberg in infrastructure for machine-learning data science, we built three such models in three months. That’s a big acceleration in being able to achieve real results.”
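The input/output shape of such a model—(story, ticker) in, signed score out—can be illustrated with a toy lexicon-based scorer. The word lists and scoring rule below are invented; Bloomberg’s models are trained classifiers, but they solve the same relevance-plus-polarity problem.

```python
# Toy lexicon-based sketch of per-ticker story sentiment in [-1, 1].
# Lexicon and scoring rule are invented for illustration only.

POSITIVE = {"beats", "upgrade", "record", "growth"}
NEGATIVE = {"misses", "downgrade", "lawsuit", "recall"}

def sentiment(story: str, ticker: str) -> float:
    """Return 0 if the ticker is not mentioned; else net polarity of cue words."""
    if ticker.lower() not in story.lower():
        return 0.0
    words = story.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

print(sentiment("ACME beats estimates with record growth", "ACME"))  # 1.0
print(sentiment("ACME faces recall after downgrade", "ACME"))        # -1.0
```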
One project the team is working on is the development of a multilingual sentiment analysis tool for instances where the underlying text is in a language other than English. Kazantsev says his team’s work enormously simplifies clients’ workflows. “For example, we have received feedback from clients, both internal and external, saying that they now use the search system differently. Where certain things used to be complicated or difficult, they’re now trivial to do.”
Fast Track
Other organizations investing heavily in AI power include S&P Global, which acquired Kensho Technologies in April for $550 million. The company says the acquisition will fast-track its use of AI, natural-language processing, and data analytics in existing and future applications to deliver improved actionable insights to clients, advanced search capabilities, and automated workflows that create new products faster.
Kensho, which previously counted S&P Global as a client and an investor, was founded five years ago out of Harvard University by CEO Daniel Nadler and has around 120 employees—mainly engineers and data scientists recruited from academia and tech organizations such as Google, Facebook and Apple. It started out by developing machine-learning systems to trawl through vast amounts of data and market-moving information seeking correlations between world events and their impact on asset prices.
“From there, we’ve built an expertise in structuring the unstructured—taking text, taking news, taking all sorts of unstructured data and turning it into linked and clean information that can be used to support different types of analysis,” says Kensho president and COO Adam Broun. After becoming a firm fixture on Goldman Sachs’ trading desks, the company’s technology has expanded to several other Wall Street firms.
S&P Global began working with Kensho two years ago on products to address specific market segments that it serves. “What that helped us to see was that a change was happening in the industry,” says S&P Global CTO Nick Cafferillo. “A lot of our clients have access to a tremendous amount of data, but they don’t necessarily have access to ‘information.’ What we have heard increasingly from our end users is that they want to spend less time linking data and trying to put it in a format that’s usable, and more time answering their clients’ questions, gaining insight, and driving value for their firms. Machine learning and artificial intelligence help us to do that, and the acquisition of Kensho was a logical next step for us.”
As such, a big focus for S&P Global has been to take different datasets from its ecosystem and link them to produce enhanced insights for clients. “It’s fairly easy in US public markets to link data, because typically, companies have some form of identifier, but it gets more difficult when you go across borders, when you start looking at private companies, for instance. Historically, we’ve had an analyst team do the work, but it can take a long time. With Kensho, we’ve been able to build algorithms that link the information automatically, which means the analysts can then validate those links and move them along. We’ve been able to generate huge time savings that have allowed us to take on more information,” says Cafferillo.
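The workflow Cafferillo describes—algorithms propose links between records that lack a shared identifier, analysts validate them—can be sketched with simple name normalization and string similarity. The suffix list, the 0.85 threshold, and the auto-link/review split are all invented; Kensho’s actual linking algorithms are not public.

```python
# Hypothetical sketch of cross-dataset company linking: normalize names,
# score similarity, and route low-confidence matches to analyst review.
# Normalization rules and threshold are invented for illustration.
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Lowercase and strip common corporate suffixes before comparison."""
    name = name.lower()
    for suffix in (" inc.", " inc", " ltd.", " ltd", " corp.", " corp"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name.strip()

def link(record_a: str, record_b: str, threshold: float = 0.85):
    """Return a routing decision and similarity score for two name records."""
    score = SequenceMatcher(None, normalize(record_a), normalize(record_b)).ratio()
    return ("auto-link" if score >= threshold else "analyst-review", round(score, 2))

print(link("Acme Widgets Inc.", "Acme Widgets Ltd"))  # ('auto-link', 1.0)
print(link("Acme Widgets Inc.", "Acme Holdings"))
```

Even in this toy, the time saving comes from the routing: analysts only touch the pairs the algorithm is unsure about.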
For example, using Kensho’s machine-learning algorithms, S&P Global can now link to information on privately held companies licensed from Crunchbase, which allows it to provide expedited insights for organizations interested in acquiring or investing in private companies. The linking project was expected to take several months, but took just days.
“What this has allowed us to do is use our resources to collect other private company information from other country registries around the globe—projects that would have been conducted much further down the line,” Cafferillo says. “This reinforces how machine learning can have an impact on the way we work. It’s helping us to unlock the power of our people, so we’re able to produce new products more rapidly than before.”
The Kensho linking capability also gives S&P Global the ability to expand quickly into new markets by allowing it to onboard new datasets, he adds.
In other areas, the company plans to bring natural-language search to its desktop platforms so users can ask questions and get quick results from across S&P Global’s vast content sets.
“This is a journey,” Broun says. “Each new dataset opens up new possibilities. As a new dataset comes in, the combinations of that data with existing data allows for an ever more sophisticated question and analysis to be done as the platform expands and develops.”
For Kensho, collaborating with S&P Global, first in a commercial relationship and now as part of the organization (though operating as an independent division), has opened up a new world for its data scientists to develop solutions and solve problems with the datasets that S&P has.
Underground Movement
Behind the scenes, all the major information providers will continue to develop and deploy AI technologies. As Broun notes, “Where AI is being applied, and where it will continue to be applied over the next few years, is underground. It’s in the subtle processes that underpin what humans do, to make those humans more productive, and it will just become the background of how every knowledge worker does their work.”
Looking ahead, Kazantsev says he expects the pace of innovation to accelerate because the barrier to entry to using these tools continues to fall.
“What I would like to see, however, is more attention being paid to the ‘interpretability’ of machine-learning models. So, how do you build models that are understandable to human beings where their decisions can be explained? This is particularly pertinent in self-driving cars. If it crashes into a wall, you really would like to understand why that happened. But it is just as important in finance,” he says. “For regulatory reasons, for instance, a compliance officer might need to be able to explain to a securities regulator why an automated workflow made a particular trade. Only recently has there been a sufficient amount of attention paid to this issue in the academic literature, so I think there is a lot of work to be done there.”
And in a sense, this sums up the state of AI’s journey in capital markets: Though vendors have made great progress in scaling the mountain of applying AI to market-data processing, the summit remains a distant and ever-moving target, and market participants will need to harness a new generation of techniques, tools and talent to unlock AI’s full potential.
Copyright Infopro Digital Limited. All rights reserved.