What's Data Got To Do With IT?

IMD: One of the largest areas of data-related spend over the past year has been on lowering latency and increasing throughput. Can these issues be addressed separately, or must they be treated as one, and what are the most useful technologies for dealing with them?

Ralph Frankel, chief technology officer for financial services, Solace Systems: While they are unique goals, lowering latency and increasing throughput must be addressed in the context of one another, because there are few use cases where only one or the other matters: applications that are not latency-sensitive are steadily becoming more latency-sensitive, and latency-sensitive applications often produce most of their value during periods of peak activity and volatility.

The problem with most technologies that address one or the other is that increasing throughput puts such strain on underlying networks and software that latency goes out the window. The reverse is also usually true, as an intense focus on minimizing latency has a tendency to restrict throughput. This is especially challenging because the back- and front-office environments each have their own priorities. In the back office, throughput is key as banks struggle to process more and more trades, while in the front office firms strive to squeeze every microsecond out of data delivery despite an increasing amount of data. So latency and throughput are equally important business objectives.

The primary rate limiters remaining in market data delivery are in software and operating systems, so the next logical step is to move to an all-hardware solution.

Peter Lankford, director, STAC Research: In a latency-sensitive trading environment, they must be tackled together. There is typically a tradeoff between throughput and latency in any technology stack, though that tradeoff curve can differ dramatically from one stack to another. That's why the STAC-M1 Benchmark specifications include metrics like the maximum throughput a system can handle without exceeding a certain latency.
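To make that kind of metric concrete, here is a minimal sketch, assuming a stepped load test that records message rate against 99th-percentile latency; it illustrates the idea rather than the actual STAC-M1 methodology, and the figures and function name are invented.

```python
# Hypothetical sketch (not the STAC-M1 methodology): given measured
# (message_rate, p99_latency_us) pairs from a stepped load test, report the
# highest tested rate at which 99th-percentile latency stays under a cap.

measurements = [
    (50_000, 45.0),      # msgs/sec, p99 latency in microseconds (illustrative numbers)
    (100_000, 60.0),
    (200_000, 95.0),
    (400_000, 210.0),
    (800_000, 1_800.0),
]

def max_rate_under_cap(samples, latency_cap_us):
    """Highest tested rate whose p99 latency is within the cap."""
    ok = [rate for rate, p99 in samples if p99 <= latency_cap_us]
    return max(ok) if ok else None

print(max_rate_under_cap(measurements, latency_cap_us=100.0))   # -> 200000
```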

By contrast, in environments that are relatively insensitive to latency, such as distribution systems for human end users, throughput can be tackled without much concern for latency. Most systems today can handle hundreds of thousands of messages per second without getting close to latencies that a human could detect.

Jeff Wootton, vice president of product development, Aleri: They really do need to be addressed together-but not as a single issue. Insufficient throughput capacity means queuing, which drives latency through the roof. On the other hand, simply having sufficient throughput won't necessarily deliver low latency. As for which technologies are most useful, it really depends on which link in the chain you are addressing. A fast messaging platform is critical for distributing data, but won't help with application speed or capacity. Feed handlers and telecom lines can be critical bottlenecks if undersized or not designed for low latency. And application development technology can play a big role. Complex Event Processing can be a valuable tool for implementing real-time applications that scale and perform. But bottom line, just like a chain is only as strong as the weakest link, a trading system is only as fast as the slowest component.
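To see why queuing drives latency through the roof, consider a textbook M/M/1 model; this is an assumption of my own for illustration, not something the panel cites, but it shows how mean latency grows without bound as utilization approaches capacity.

```python
# Illustrative M/M/1 queueing sketch: as the arrival rate approaches service
# capacity, queueing delay (and hence end-to-end latency) grows without bound.

def mm1_mean_latency_us(arrival_rate, service_rate):
    """Mean time in system (service + queueing) for an M/M/1 queue, in microseconds."""
    if arrival_rate >= service_rate:
        return float("inf")              # overloaded: the queue grows forever
    return 1e6 / (service_rate - arrival_rate)

service_rate = 1_000_000                 # messages/sec the component can process
for utilization in (0.5, 0.9, 0.99, 0.999):
    latency = mm1_mean_latency_us(utilization * service_rate, service_rate)
    print(f"{utilization:.3f} utilization -> {latency:.1f} us mean latency")
```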

Henry Young, director, TS-Associates: There is always a tradeoff between throughput and latency when dealing with a fixed underlying technology. The engineering tricks used to improve throughput sacrifice latency, and vice versa... so both performance characteristics must be factored into any systems design exercise. However, upgrading the underlying technology, for example moving from one- to 10-gigabit Ethernet, can achieve both increased throughput and decreased latency-although at great expense. So there are really three variables in the tradeoff equation: latency, throughput and cost.

John Coulter, vice president of marketing, Vhayu Technologies: They do frequently overlap, but each serves its own purpose. Lowering latency allows computers to act on information faster, while increasing throughput enables computers and human operators to see more. Do you always need to see more data faster? Not necessarily. The options markets are a prime example: There are on average over 4,500 quotes for every trade, but that figure is skewed by the overwhelming amount of quoting on frequently traded contracts such as Google and Microsoft. Servers trade those issues now, not traders. But do traders really care about autoquoting for less frequently traded small caps? No, so why even bother consuming that data across your pipes? You have to pick your battle based on your firm's proprietary trading strategies as well as the customers you serve.

IMD: Is hardware acceleration the solution to dealing with latency and volume? If so, how can it overcome obstacles such as initial cost of deployment, and relative unfamiliarity with the products among end-user firms?

Coulter: It's one of many, including data conflation, using software that handles multithreading across multiple cores, and leveraging in-memory database software. Hardware-only solutions are too hard to configure, and only a handful of programmers with the necessary knowledge are available, in select geographical locations. Pure software solutions are becoming too slow to keep up with rising volumes. I think we're going to see many hybrid hardware/software solutions coming to market that offer the advantages of both.
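For readers unfamiliar with data conflation, the sketch below shows the basic idea in minimal form; it illustrates the general technique rather than any vendor's implementation, and the class and field names are invented.

```python
# Minimal conflation sketch: within each publish interval, keep only the most
# recent quote per symbol, so downstream consumers see fewer, fresher updates.

from collections import OrderedDict

class Conflator:
    def __init__(self):
        self._latest = OrderedDict()        # symbol -> most recent (bid, ask)

    def on_quote(self, symbol, bid, ask):
        self._latest[symbol] = (bid, ask)
        self._latest.move_to_end(symbol)

    def flush(self):
        """Publish one update per symbol and clear the buffer."""
        batch, self._latest = self._latest, OrderedDict()
        return list(batch.items())

conflator = Conflator()
for sym, bid, ask in [("MSFT", 27.10, 27.11), ("GOOG", 530.0, 530.2), ("MSFT", 27.11, 27.12)]:
    conflator.on_quote(sym, bid, ask)
print(conflator.flush())                    # MSFT appears once, with the latest bid/ask
```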

Frank Piasecki, president, Activ Financial: A customer needs to know no more about hardware acceleration technology than they do about the embedded chips that run their car. Hardware acceleration, like most of the market data technology stack, needs to be abstracted from the customer experience. Including our hardware acceleration component in a client or hosted implementation of the MPU requires no changes to customer applications, third-party programs, or monitoring and reporting systems. All customers know is that it will delay the need for the additional hardware and hosting expenditures that have become commonplace of late. Further, with the market's drive toward exchange co-location, the smaller footprint of hardware-accelerated solutions like the ActivFeed MPU can help keep global exchange proximity deployment costs under control.

Frankel: Special-purpose hardware has been the end game for virtually all computing-intensive technology where performance matters: network gear is how the Internet works at such massive scale, video chips have created a gaming industry worth billions of dollars, and compression and encryption at scale are orders of magnitude better in hardware. We've reached that same point in financial markets: Millions of messages at microseconds of latency is a classic job for hardware.

Hardware is actually just a new, simpler-to-use form of financial application infrastructure. You can buy new servers, rev them to a specific operating system version and patch set, then install software-or you can buy an integrated hardware appliance that's certified and ready to go. Upgrading a software environment entails patching the OS, revving the software and testing to see what each dependency might have disturbed. Upgrading hardware is a single instruction to refresh firmware. Beyond that, interacting with hardware middleware is through APIs and client libraries, exactly as it works now. Any middleware team can adapt to hardware as easily as, and probably more easily than, it does to any new software middleware product.

For deployments of any size, hardware is a lower-cost solution than existing software and servers. It is not uncommon for large investment banks to have thousands of software servers just to handle horizontal scale and redundancy of various market data and back-office middleware systems. Hardware can reduce that server/software footprint by anywhere from 10:1 to 30:1, such that tens of hardware devices can do the work of a hundred software boxes. On any single cost dimension-hardware, software, power or cost of maintenance-hardware is a lower-cost solution. When all dimensions are factored, it is a runaway winner.

Reduced hardware infrastructure also means less support, better maintenance and fewer failure points. Overall, this translates to greatly reduced support costs with substantial reliability gains.

Guy Tagliavia, president and chief executive, Infodyne: Hardware acceleration may raise the threshold at which message volumes can be handled, but without the ability to scale, the solution and return on investment can be short-lived. Aside from the higher cost of entry, hardware acceleration may also limit functionality and make it difficult to adopt changes imposed by the feeds and/or customer requirements. Such a solution needs to be based on a proven product methodology and flexible deployment strategy.

Young: While hardware acceleration is a useful technique, it is valuable only when used "appropriately." So while higher-level application code, which typically requires more ongoing maintenance and enhancement, should be left in software, lower-level and well-characterized functionality can quite effectively be moved onto hardware, whether that be via FPGA, ASIC, or other custom silicon. Examples of functionality that is typically expensive in CPU terms, but which can be moved across the hardware boundary, include network I/O, compression, cache management and data filtering.

Wootton: Hardware acceleration is only appropriate for tasks that are well-defined and that rarely change. It's expensive and time-consuming to implement software to take advantage of specialized hardware. Therefore, it's only practical for certain tasks. We've seen some vendors using hardware acceleration for messaging and for feed handlers-or rather, certain sub-operations within feed handlers. I don't think we'll be seeing it anytime soon for custom applications.

IMD: With so many white papers and benchmark studies being released by vendors, how valuable-and credible-are these claims, and how can they be best applied in real-life usage?

Young: This really is caveat emptor territory. Vendor benchmarks can never be believed... and ultimately, customers have to do their own testing, in the context of their own use cases, systems architectures, datacenter policies, etc. While industry initiatives to standardize benchmarks are commendable, these are typically targeted at assisting the initial systems selection process, but are less helpful with ensuring continued optimal operation in the face of changing data loads, network technologies, business-driven use cases, etc. So a rigorous approach to operational monitoring of the performance of production systems should be put in place.

Frankel: As with any information, the source is as important as the substance. Even the most ethical vendor can't pretend to be truly objective about its own performance. That said, these numbers are helpful in setting the context of a discussion about what a technology can do. The great thing about the financial community is that it's a small and tight group, which acts as a self-policing vehicle for vetting performance claims. Any vendor whose claims exaggerate what its products can do in a customer lab risks a tarnished reputation. Remember-what is important is not raw performance claims, but how a product can be applied to business problems in real-world scenarios. That's the data that really matters.

We applaud the evolution of independent analysts and testing agencies like STAC that publish legitimately independent performance numbers based on publicly available platforms and benchmarks that are starting to be standardized. This allows the realities to be separated from the hype by standardizing test cases so vendor claims can be verified in the context of very specifically described test environments-for example, how many publishers, how many servers, what machine types and operating systems were used, what NICs were used, what the message rates were, and what the message size was. And the most critical point to define and standardize is how throughput and latency are measured. Additionally, performance at the edge is critical, as claims of average latency without standard deviations and outlier measurements are of little value.
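That point about averages hiding the tail is easy to demonstrate with made-up numbers: the sketch below summarizes a latency sample whose mean looks respectable while the standard deviation, high percentiles and maximum reveal a severe outlier tail.

```python
# Illustration with invented data: the mean alone hides a 1.5% tail of 5ms outliers.

import math
import statistics

def latency_summary(samples_us):
    ordered = sorted(samples_us)
    def pct(p):
        # nearest-rank percentile
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.pstdev(ordered),
        "p99": pct(99),
        "p99.9": pct(99.9),
        "max": ordered[-1],
    }

samples = [50.0] * 985 + [5_000.0] * 15     # mostly 50us, with a 1.5% tail at 5ms
print(latency_summary(samples))             # mean ~124us; percentiles and max expose the tail
```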

Wootton: I'd be highly skeptical of any benchmarks that aren't backed up by sufficient data to enable an independent observer to reproduce the results. That's why STAC has built a following so quickly-they bring a level of objectivity that wasn't there before. Latency is an excellent example, where it's not just a matter of test results having validity, but agreeing on what's being measured. Without defining how the measurement is done, the results are meaningless.

Lankford: One of the reasons STAC exists is that vendor-published studies have too little value to customers. The main problems are well-known: bias, lack of detail, and lack of standards. STAC's original value proposition attacked the first two problems by publishing research that was unbiased and detailed. Now we are in the process of solving the final problem by working with vendors and end-user firms to establish standards. It's hard work that takes time, but it's about to pay off for the industry. By making these standards-and the tools that implement them-available to trading firms, we'll enable those firms to compare their own systems to a wide range of available technologies in an exact, apples-to-apples way.

IMD: Is content still king, or is the speed at which firms receive data now more important than the data itself-i.e., if you don't receive a price at least as fast as your rivals, you may as well not receive it at all? As more asset classes move to high-frequency program trading, will we see this attitude increase? Will this balance shift as it becomes harder to make incremental latency gains?

Young: The old days of a specific venue's or broker's data being king are over. What is becoming far more important is the context of that data. The driver behind this is the increasing fragmentation of liquidity and multiplication of trading venues, initially in equities, but now sweeping across other asset classes. It is the larger institutions, with the resources to reach out to more sources of liquidity and to fully contextualize the data received from each source, that will win the proprietary trading race. An increasingly important part of this equation is to compensate correctly for latency in building a consolidated context across multiple venues. So knowing the speed or delay inherent in all the prices you receive, and the relationship of those delays to the transactional side of the trade cycle, is perhaps even more important than simply being first.

Tagliavia: There will always be varying levels of latency tolerance, depending on the client's application requirements. It's not always latency that differentiates one program trading application from another-quality of data can be equally important, while consolidated views of the market, complex event processing, and statistical analysis all require quality data. It is also equally important to have a system that can deliver high-quality data to latency-sensitive applications and, in parallel, support applications that are not so latency-sensitive, all from the same data source. This provides economies of scale and allows customers to leverage their investment in direct feeds throughout their data enterprise.

Wootton: I don't think you can say that the only thing that matters is speed. It's still more important to be right than be fast. To be right, you have to make decisions on good information-even if those decisions are being made by a computer. We're already coming up against the limits of speed. Until someone revokes the laws of physics, there is a limit on how low latency can go. So the real gains to be made going forward aren't going to be getting the information even faster, but rather, once you get it, how fast you can analyze, understand, and act on it.
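The physical floor Wootton refers to can be estimated with a back-of-the-envelope calculation; the figures below, light traveling through fiber at roughly two-thirds of its vacuum speed and an approximate straight-line New York-Chicago distance, are assumptions for illustration only.

```python
# Rough lower bound on round-trip latency imposed by distance alone.

SPEED_OF_LIGHT_KM_S = 299_792                      # in vacuum
FIBER_SPEED_KM_S = SPEED_OF_LIGHT_KM_S * 2 / 3     # refractive index of fiber ~1.5

def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber of the given one-way distance."""
    return 2 * distance_km / FIBER_SPEED_KM_S * 1000

# ~1,150 km is an approximate straight-line New York-Chicago distance; real fiber
# routes are longer, so actual round trips sit above this floor.
print(f"{min_round_trip_ms(1150):.1f} ms")         # roughly 11.5 ms round trip
```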

Piasecki: Market data has always presented a tension between content and technology value propositions. Firms need to ensure their platform and applications support both, and the Activ service model embraces this dual responsibility in every one of our offerings. Technology itself doesn't address who will be responsible for sourcing and managing the content it produces. If you buy content, how well integrated is it into the platforms that require it? Only a service that places equal value on the creation and provision of both content and technology offers a long-lasting, viable solution. As a result, either the customer spends time integrating technology and content, and operating that integration on an ongoing basis, or it engages a firm that combines both in a service offering that can stand on its own feet. Time-price priority markets, as most open exchanges are, will continue to remain elemental to capital markets. Latency is key in this model, and will dominate the interaction of OTC and other instruments as they become sufficiently securitized to be traded through exchange-like venues. However, latency is not the only component firms must consider.

Frankel: Trading definitely goes through phases, and this performance arms race will be key for the next several years... with program trading requiring the most extreme speed for pricing and news. But in the end, content is still king. Taken on its own, acting intelligently always beats acting quickly. Acting intelligently and quickly is today's killer app. In a decade, when everyone is down to shaving off single microseconds, speed will be table stakes, and strategy and content will be king.

The second wave of performance anxiety will be in order execution and order routing to the best available source of liquidity. It does you no good to make a trading decision in 10 microseconds if your order can't be routed in less than 10 milliseconds. There are plenty of legs left in infrastructure performance overhauls.

Coulter: Every tick of data is incredibly valuable, whether the intent is to work an existing order, to evaluate the outcome immediately, or to store it for subsequent study. Quantitative trading strategies account for 12 percent of total market volume, and that number is rapidly rising. We have many clients who build their own tick databases to store years' worth of data instead of buying end-of-day data, so there can also be a cost benefit in processing every tick and storing it. Having access to every tick in real time matters for adaptive algorithms as well: even if they were outflanked and lost an opportunity because of speed, they can still learn from it.

IMD: Will this trend see a return to an emphasis on desktop analytical products, or increased adoption of technologies such as complex event processing software to mitigate and sift "nuggets" from vast "seams" of data "ore"?

Lankford: Purpose-built trading systems have done this kind of sifting for years. What seems to be changing is that as they build new systems, trading firms are more open to using off-the-shelf containers to host the sifting logic. They like the idea of shrinking time to market by using a vendor CEP product-or what STAC calls EPP (Event Processing Platforms). But the concern is always whether these general-purpose EPP products perform well enough under the specific workloads that a firm needs to handle. The STAC Benchmark Council is currently drafting benchmark specifications (STAC-A1) aimed at exactly this question. Our goal is for those specs to be available in time for us to run projects against them later this year.

Frankel: CEP is very much the analogy of the desktop analytical tool for the program trading age. As a relatively new technology, more evolution is required to get CEP into the maturity range of trading infrastructure, but it has as much potential to be disruptive as advanced analytics on the desktop had for human traders 10 to 15 years ago. CEP will add far more real-time intelligence to trading decisions, and will be valuable in order routing-for example, a bank could route orders over a secondary ECN when some level of unacceptable latency is discovered in the primary ECN-and in the real-time interpretation of market direction and sentiment, something that should open up exciting new trading opportunities and algorithms for very short-term arbitrage profits.
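That ECN failover idea can be expressed as a very small event-processing rule; the sketch below is written in plain Python rather than any vendor's CEP language, and the threshold, window size and class name are invented for illustration.

```python
# Toy routing rule: fail over order flow to a secondary ECN when a rolling
# latency measure on the primary ECN breaches a threshold.

from collections import deque

class EcnLatencyRouter:
    def __init__(self, threshold_us=500.0, window=100):
        self.threshold_us = threshold_us
        self.samples = deque(maxlen=window)     # recent round-trip samples, primary ECN

    def on_primary_latency(self, latency_us):
        self.samples.append(latency_us)

    def route(self):
        if self.samples and sum(self.samples) / len(self.samples) > self.threshold_us:
            return "SECONDARY_ECN"
        return "PRIMARY_ECN"

router = EcnLatencyRouter(threshold_us=500.0)
for sample_us in (120.0, 150.0, 2_400.0, 2_600.0, 2_500.0):
    router.on_primary_latency(sample_us)
print(router.route())    # -> SECONDARY_ECN once the rolling average breaches the threshold
```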

Piasecki: There is too much automation occurring in capital markets to go back to desktop analytics. The growth customer is a computer, and services need to be tailored to the computer consumer's speed, with lots of content consumption and creation as a result. You can think of the modern trader as the pilot of a high-tech fighter jet adjusting the computer guidance and targeting systems. Desktop analytical products may be a thing of the past, but certainly the complexity of the server-based trading algorithms, and the ways in which traders interoperate with them, are just in their infancy.

Coulter: Desktop analytic products will never diminish in importance. It comes down to where the calculations are generated. Because of continuing volume increases, there has been a dramatic paradigm shift away from having the desktop application process the analytics and toward moving that job to servers with high-speed analytics engines and time-series databases that can handle the heavy lifting. The derived data is then published to numerous applications across the firm. It's much faster, and allows everyone across an enterprise to see the same data.

Tagliavia: Most customers see a need to support both analytical and desktop applications. Desktop applications are also becoming much more advanced and capable of summarizing the firm's strategy for the trader, rather than delivering the raw data, leaving the real number crunching to high-powered application engines.

IMD: Does the recent trend of partnerships between low-latency providers and CEP vendors indicate functionalities that these platforms need to incorporate, or just a need to bring CEP capabilities ever-closer to the data? Might we even see CEP built onto FPGA cards and incorporated directly into low-latency platforms in future?

Coulter: Many of the recent partnerships between CEP vendors and low-latency providers will give users the benefit of speed by running CEP engines in a co-location facility. The other important aspect of these partnerships is that most CEP vendors either don't have a good data storage solution or are integrated with relational databases that hinder query performance. The market has dictated that it's equally important to have fast access to both real-time and historical data in the same engine, to make sense of current market conditions by scanning historical data patterns in real time. Most algorithmic trading strategies rely on some amount of historical data. If a CEP platform is integrated with a relational database, it's going to miss out on trading opportunities due to the amount of data-it will take too long to query that many rows and columns and bring the results back into the real-time strategy. Brokers are pushing a lot of these partnerships solely for that reason.

Tagliavia: CEP vendors are now finding the same low-latency demands placed on them as the platform providers. Therefore, it is a natural extension to incorporate CEP within the market data platforms themselves. Depending on the strength of the partners and their relationships, customers may find varying levels of value in these partnerships. An ideal scenario is that a big enough player has all the software, hardware, and support resources necessary to deliver a coherent, holistic, and fully integrated solution to customers.

Frankel: The never-ending push to squeeze every bit of latency out of these systems makes any opportunity to incorporate value-added functionality directly into the platform inevitable. As the only vendor in the market that has already taken the step of embedding infrastructure like messaging, content routing, transformation and more into FPGAs and network processors, we frequently engage with potential partners-and even banks-on their wish lists of logic they would like to see running in hardware. This is without question the new performance frontier, and the rule of thumb will be: if lower latency equals a meaningful amount of incremental profit potential, it will find its way into hardware.

Wootton: The partnerships stem from the realization that getting data fast isn't enough-the speed at which you analyze and act on it is just as important. Right now the integration makes it easy for the customer to deploy CEP and feed it with low-latency market data. Will further collaboration between providers of feeds, market data systems, messaging and CEP lead to further performance improvements? Probably. CEP embedded in the market data platforms is an interesting idea. CEP on FPGA? That's where I'm skeptical. The value of CEP is in its programmability by the customer: you can implement your business logic and deploy it in days, rather than the weeks or months it would take to implement it in Java. FPGA technology delivers performance gains by going in the other direction-hard-coding the logic at the expense of that flexibility.

IMD: Where are the biggest remaining areas of untapped potential for reducing latency-at source, networks and physical location, feed handlers and data platforms, internal client systems, or other areas?

Wootton: Humans. After that, I'd have to say applications. There has been so much focus on getting low-latency market data in, and getting low latency on order execution, but not enough effort into that bridge where market data is used to trigger an order. It's that piece-whether an application or a person-that is all too often the bottleneck.

Piasecki: We see opportunities in all of the above. High-quality data services must address the integration of technology, content, service management, testing, compliance, third-party integration, deep R&D in IT innovation, QA, etc. Activ's strategy is to address everything about collecting, normalizing, entitling, distributing, and integrating or displaying real-time financial information. Whether it is exchange pricing, derived data, internal IP, OTC information, reference data, news, or company fundamental data, someone has to manage the system.

Tagliavia: Location is a big factor, given the laws of physics, but sometimes there is no choice but to centralize data collection and usage, given some firms' broad market data sourcing requirements. Location aside, the most critical aspect of latency tolerance is not so much shaving microseconds, but ensuring a consistent, jitter-free latency profile, regardless of message rate fluctuation. Also, since latency can arise for a multitude of reasons, and rarely as the result of any one consistent issue, it is critical to be able to detect, report, pinpoint and resolve latency breaches when they occur. How well this is done differentiates vendors in this market.
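As a minimal illustration of that kind of breach detection, the sketch below flags samples that wander from a rolling baseline rather than simply exceeding a fixed limit; the window and tolerance values are arbitrary assumptions.

```python
# Flag latency samples that exceed the rolling baseline by more than a tolerance,
# which surfaces jitter and breaches along with where in the stream they began.

from collections import deque

def find_breaches(samples_us, window=50, tolerance_us=25.0):
    """Yield (index, latency) where latency exceeds the rolling mean by the tolerance."""
    recent = deque(maxlen=window)
    for i, latency in enumerate(samples_us):
        if len(recent) == recent.maxlen:
            baseline = sum(recent) / len(recent)
            if latency - baseline > tolerance_us:
                yield i, latency
        recent.append(latency)

stream = [40.0] * 60 + [120.0, 130.0] + [41.0] * 20     # a brief latency excursion
print(list(find_breaches(stream)))                      # flags the two 120-130us samples
```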

Young: Co-location is the answer only for a subset of business activities that are venue-specific-for example, market making. But even there, changes to market structures are breaking that down, as a market maker's activities must become more global in their coherence. The real breakthroughs are yet to come in the contextualization of data, in acknowledging the physical limitations of data delivery, and in compensating for differential venue latency on both the pricing and execution sides of the trade cycle. So the focus is turning back to customers' internal infrastructures. The key developments here effectively amount to wholesale reimplementation of systems architectures with a focus on determinism and latency. Important emerging techniques include highly deterministic "fabrics" such as InfiniBand, zero-copy "stackless" middleware, hardware offload or acceleration of network I/O, and building in "performance self-awareness" from the ground up. We see these trends just getting off the ground with the largest institutions, and expect the trickle-down commoditization process to run for the next three to five years, by which time other new technologies will be at the bleeding edge.
