Filter Market Data Messages in an FPGA – part 3

Note: Skip directly to GitHub.com to download the source code by following this link:

This post will cover the next iteration of implementing an OrderBook inside an FPGA that is based on a NASDAQ ITCH 4.1 market data feed.

Some time has passed and I have finally found enough time to finish all the code changes required for the two (2) components listed below, along with the requisite test harnesses to validate.

Starting off, here are the components of an FPGA-based OrderBook:

  • ITCH Parser
    • FPGA loop that listens for incoming data from a Network Interface Card, then parses, filters, and translates each incoming message and sends the appropriate message/command to the OrderBook loop.
  • OrderBook
    • FPGA loop that reads and writes Orders to memory using an insertion sort algorithm.  The OrderBook currently supports only one instrument and one side.  Its capacity is 1,000 elements, which can easily be adjusted in LabVIEW for FPGA, but that is not important right now.  The OrderBook currently supports two commands: add order and get all orders.  The get all orders command is meant to be used by a user or client application for trading and other purposes.
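
To make the insertion-sort idea concrete, here is a minimal Python sketch of a single-instrument, single-side book with a fixed capacity. The names (`Order`, `OrderBook`, `add_order`, `get_all_orders`) and the highest-price-first ordering for a buy side are assumptions for illustration only; the LabVIEW FPGA implementation may be organized differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Order:
    order_ref: int
    price: int   # fixed-point price as carried in the ITCH message
    shares: int


class OrderBook:
    """One instrument, one side, fixed capacity (hypothetical sketch)."""

    def __init__(self, capacity: int = 1000) -> None:
        self.capacity = capacity
        self.orders: List[Order] = []

    def add_order(self, order: Order) -> bool:
        """Insert so that prices stay sorted, highest first (buy side)."""
        if len(self.orders) >= self.capacity:
            return False  # book is full; the order is dropped
        # Shift lower-priced orders back one slot, like one insertion-sort pass.
        i = len(self.orders)
        self.orders.append(order)
        while i > 0 and self.orders[i - 1].price < order.price:
            self.orders[i] = self.orders[i - 1]
            i -= 1
        self.orders[i] = order
        return True

    def get_all_orders(self) -> List[Order]:
        return list(self.orders)


book = OrderBook(capacity=3)
for ref, price in [(1, 169900), (2, 171100), (3, 170000)]:
    book.add_order(Order(ref, price, 100))
```

Orders at the same price keep their arrival order, because the shift loop only moves entries with a strictly lower price.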

Using Test Driven Development, Here Are the Test Harnesses

  • ITCH Parser Test Harness
    • Input: A file containing raw ITCH 4.1 market data messages (generated using createItch.py)
    • Output: Array of OrderBook operations
  • OrderBook Test Harness
    • Input: An array of OrderBook operations
    • Output: A sorted array of Orders

What Does a Test Harness Look Like?

Here is a screenshot of the Front Panel diagram for the ITCH Parser Test Harness:

and here is a flow chart of what is going on:

What exactly is going on? The host VI, Host-ItchParser-TestHarness.vi, reads the file containing raw NASDAQ ITCH market data messages and sends them to the FPGA test harness via the Host-to-Target DMA FIFO “HT-TEST_IN”.  The FPGA test harness is located in the “Tests” folder and is named “Fpga-ItchParser-TestHarness.vi”.  It passes the raw ITCH data as is to Fpga-ItchParser.vi, which parses, normalizes, and filters the messages, keeping only Add Order messages for the symbol AAPL.  The parser then sends an OrderBook command for each matching message back to the FPGA test harness, which sends the results up to the host.
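
The parse-and-filter step can be sketched in Python as follows. This is not a translation of the LabVIEW code. It assumes a two-byte big-endian length prefix in front of each message and the Add Order field layout from the ITCH 4.1 specification (type, 4-byte nanoseconds, 8-byte order reference, side, 4-byte shares, 8-byte stock, 4-byte price). The `("ADD", ...)` command tuple and the helper `_add_order` are hypothetical.

```python
import struct
from typing import Iterator, Tuple

# type, nanoseconds, order ref, side, shares, stock, price (30 bytes)
ADD_ORDER = struct.Struct(">cIQcI8sI")


def filter_add_orders(feed: bytes, symbol: str = "AAPL") -> Iterator[Tuple[str, int, str, int, int]]:
    """Yield ("ADD", order_ref, side, shares, price) for matching Add Order messages."""
    wanted = symbol.ljust(8).encode("ascii")  # ITCH pads symbols with spaces
    pos = 0
    while pos + 2 <= len(feed):
        (length,) = struct.unpack_from(">H", feed, pos)
        body = feed[pos + 2 : pos + 2 + length]
        pos += 2 + length
        # 'A' = Add Order, 'F' = Add Order with MPID (same leading fields)
        if len(body) >= ADD_ORDER.size and body[:1] in (b"A", b"F"):
            _, _, ref, side, shares, stock, price = ADD_ORDER.unpack_from(body)
            if stock == wanted:
                yield ("ADD", ref, side.decode("ascii"), shares, price)


def _add_order(ref: int, stock: str, price: int) -> bytes:
    """Build a length-prefixed Add Order message for testing (hypothetical helper)."""
    body = ADD_ORDER.pack(b"A", 0, ref, b"B", 100, stock.ljust(8).encode("ascii"), price)
    return struct.pack(">H", len(body)) + body


sample = _add_order(1, "AAPL", 1500000) + _add_order(2, "ZVZZT", 169900)
commands = list(filter_add_orders(sample))
```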

And for the OrderBook Test Harness

Here is what it would look like in a production system

Why is this so important?

Well, normally to create an FPGA-based anything, one needs to use Verilog, VHDL, or one of the numerous “high-level” design languages.  Here you can accomplish the same thing with a programming interface that matches the Verilog programming model, except that it is graphical.

This means you can create a custom FPGA-based solution, reduce your datacenter power usage, increase your application’s performance, and reap the rest of the great benefits of FPGA-based computing.

I encourage you to download the source code for this and to see for yourself what LabVIEW for FPGA can do for you and to then try it in one of your own applications.

Stay Tuned… What is next?

  1. Hook the ITCH Parser up to an actual Network Interface Card, preferably a 10 Gigabit, since I already own the hardware to do so.
  2. Hook up either a MicroBlaze processor or the host computer to the OrderBook so that something can be done with the OrderBook data itself.

References:

  1. http://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/nqtv-itch-v4_1.pdf

 

Filter Market Data Messages in an FPGA – part 2

Skip directly to the source code on Github.com here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/MarketData/MarketData_01


So what now.  We know what a NASDAQ ITCH 4.1 Market Data Message looks like.  The format is very simple, there is some – yes – ASCII data in the message format, and all messages are preceded by the message length.  Message length preceding the message makes it very easy to interpret a feed from inside an FPGA.

What to do first? Well, what does eXtreme Programming say to do? It says keep it simple.

So, I am working on a LabVIEW 2016 program that does the following:

  1. Open a NASDAQ ITCH file
  2. Send it one byte at a time into the FPGA over the PCIe bus
  3. FPGA reads the feed, skips over message types that it does not know how to handle, and parses only messages of a specific type.
  4. Send back to the host computer a statistic – any statistic for now.

If you are unfamiliar with National Instruments products and LabVIEW, go here to learn more:

Step 1 – Open a NASDAQ ITCH file

The first step is to create a LabVIEW for Windows program (as opposed to a LabVIEW for FPGA program) that reads in an entire NASDAQ ITCH 4.1 file.  This file is quite big, so I wrote a quick Python script (using Ryan Day’s code as a guide: https://github.com/rday/ITCH41) and generated three smaller files containing the following:

  • 1 Timestamp message
  • 5 Timestamp messages
  • 50 Timestamp messages

The Python script can also output the seconds field from each message read in.  See the python script here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/blob/master/MarketData/MarketData_01/parseItch.py

Now I have three files, named: T.itch, T.5.itch, and T.50.itch, and I will write a LabVIEW program to send all data from one of the files above in to the FPGA.
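
For readers who want similar test inputs without the full feed, here is a rough Python sketch that builds feeds containing only Timestamp messages. It is not the script linked above. It assumes each message is framed with a two-byte big-endian length, followed by the type byte ‘T’ and a 4-byte big-endian seconds value, and the function names and the starting seconds value are made up for illustration.

```python
import struct


def timestamp_message(seconds: int) -> bytes:
    """Two-byte length prefix + 'T' + 4-byte big-endian seconds."""
    body = b"T" + struct.pack(">I", seconds)
    return struct.pack(">H", len(body)) + body


def make_feed(count: int, start_seconds: int = 22711) -> bytes:
    """Concatenate `count` Timestamp messages with increasing seconds."""
    return b"".join(timestamp_message(start_seconds + i) for i in range(count))


feeds = {name: make_feed(n) for name, n in [("T.itch", 1), ("T.5.itch", 5), ("T.50.itch", 50)]}
# To write them to disk, open each name in "wb" mode and write the bytes.
```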

Step 2 – Send Data From One File to FPGA via DMA FIFO

This requires knowledge of LabVIEW for Windows and some knowledge of LabVIEW for FPGA.  I wrote a simple User Interface that allows you to select a file.  That file is then sent to the FPGA using a Host-to-Target FIFO 1 byte at a time.  Since viewing a LabVIEW vi requires that you have LabVIEW installed on your computer, I took some screen shots of the LabVIEW code and placed them here:

Step 3 – Interpret the Feed Inside the FPGA

Right now I have generated files that contain 1, 5, and 50 Timestamp messages.  That means that for each message encountered, I will extract the timestamp, which is in seconds, save it into a local FPGA variable, and send the value back up to the host.

Step 4 – Send Data Back to Host

The statistic will be the seconds portion of each message that was passed in.  The seconds field is a 32-bit integer, so the Target-to-Host DMA FIFO will be a 32-bit integer.  Here is a quick screen shot of the FPGA top-level loop, which reads the input data stream one byte at a time:

The purple colored box is how the FPGA receives the data from the host.  In a live application, the purple colored box, also known as the “Read (FIFO Method)” can have this data come directly from a 10 gigabit connection, or from another loop inside the FPGA.

(Read more about this method on National Instruments website: https://zone.ni.com/reference/en-XX/help/371599H-01/lvfpgahost/fpga_method_fifo_read/)

As the data comes in, a counter is started at 0, and depending on the element count, the data is stored in different output variables.  The first 2 bytes are the message length, the 3rd is the message type, the 4th, 5th, 6th, and 7th elements are the Seconds portion of the message.

The counter is then compared to the message length variable, and when they are equal, the output variable “Message Done” is set to true.  Here is the bottom portion of the screen shot from above:
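
The counter logic described above can be modeled in Python as a small state machine that consumes one byte per call, much like one iteration of the FPGA loop. This is a sketch under assumptions: the class name `TimestampParser` is invented, and I assume the length field does not count its own two bytes, so the message is done when the counter reaches length + 2.

```python
from typing import List


class TimestampParser:
    """Consumes one byte per call, like one iteration of the FPGA loop."""

    def __init__(self) -> None:
        self.count = 0      # bytes seen in the current message
        self.length = 0     # message length from the 2-byte prefix
        self.msg_type = 0
        self.seconds = 0

    def push(self, byte: int) -> bool:
        """Feed one byte; return True when the message is complete."""
        if self.count == 0:
            self.length = byte << 8          # length, high byte
            self.seconds = 0
        elif self.count == 1:
            self.length |= byte              # length, low byte
        elif self.count == 2:
            self.msg_type = byte
        elif 3 <= self.count <= 6:
            self.seconds = (self.seconds << 8) | byte  # big-endian
        self.count += 1
        done = self.count == self.length + 2  # prefix is not counted in length
        if done:
            self.count = 0
        return done


def seconds_from_feed(feed: bytes) -> List[int]:
    """Return the seconds value of every Timestamp ('T') message in the feed."""
    parser, out = TimestampParser(), []
    for b in feed:
        if parser.push(b) and parser.msg_type == ord("T"):
            out.append(parser.seconds)
    return out


feed = bytes([0x00, 0x05, 0x54, 0x00, 0x00, 0x58, 0xB7,          # Timestamp
              0x00, 0x06, 0x53, 0x11, 0xCD, 0x6C, 0xC9, 0x4F])   # System Event, skipped
result = seconds_from_feed(feed)
```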

After the end of the message is read, we have a case structure, which is similar to an if statement but for FPGAs, which will read the appropriate variables and send them back up to the host via a DMA-FIFO.  Now this DMA-FIFO can be configured to send data up to the host computer, or to another DMA-FIFO inside the FPGA.  For now we are going to send this up to the host for analysis.

Take a look at the right-half of the original FPGA vi screenshot.  This element executes once, and reads the Seconds variable and sends it up to the host.

In part 3, I will add another feature – Add Order with MPID, so we can now know in the FPGA when a new order is entered for a particular security, what side that order is on, and how many shares/price.  This is more meaningful information that can be used to trade the markets, especially during a Donald Trump speech!

Filter Market Data Messages in an FPGA – part 1

So I went to NASDAQ’s FTP site and downloaded the entire ITCH feed for November 9th, 2013.  The file was large (319 MB compressed), and you can download it yourself from here:

ftp://emi.nasdaq.com/ITCH/11092013.NASDAQ_ITCH41.gz

NASDAQ has a very simple document describing the specification here:

http://nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTV-ITCH-V4_1.pdf

I skimmed over the specification to get an idea of how Market Data works.  What I basically understand is that at the start of the trading day, NASDAQ sends a list of all securities that will be available to trade that day, followed by a stream of messages indicating changes to the prices being offered to buy or sell each security, as well as actual trades.

The basic format of an ITCH 4.1 Market Data message is the size of the message, followed by the data, where the first byte of the data is the message type.  So using this information we can easily decode an entire ITCH feed, paying attention only to messages that interest us.
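
As a rough illustration of walking a feed this way, the Python sketch below steps from one message to the next using the length prefix and counts how many messages of each type it sees. It assumes the length is stored as a two-byte big-endian value in front of each message, and the function names are hypothetical.

```python
import struct
from collections import Counter
from typing import Iterator


def iter_messages(feed: bytes) -> Iterator[bytes]:
    """Yield each message body (type byte first) from a length-prefixed feed."""
    pos = 0
    while pos + 2 <= len(feed):
        (length,) = struct.unpack_from(">H", feed, pos)
        yield feed[pos + 2 : pos + 2 + length]
        pos += 2 + length


def count_types(feed: bytes) -> Counter:
    """Count messages by their one-character type code."""
    return Counter(chr(body[0]) for body in iter_messages(feed) if body)


# One Timestamp message followed by one System Event message.
feed = bytes.fromhex("0005" "54000058b7" "0006" "5311cd6cc94f")
counts = count_types(feed)
```

Because the length is known up front, a parser can skip any message type it does not care about without understanding its fields.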

Timestamp Message

0x05 Length
0x54 Message Type ‘T’
0x00 Second – byte 1
0x00 Second – byte 2
0x58 Second – byte 3
0xb7 Second – byte 4

The ITCH standard says that all Integer fields are in Big-Endian format, so the timestamp in the message above is interpreted as 0x00 00 58 b7, or 22,711. See a nice online hex to dec converter here:

http://www.binaryhexconverter.com/hex-to-decimal-converter

Now 22,711 is the number of seconds since midnight, here is another online tool to convert this to a normal time in hours, minutes, and seconds:

https://www.tools4noobs.com/online_tools/seconds_to_hh_mm_ss/

So 22,711 is 06:18:31.  That is pretty early in the morning, so it looks like this particular NASDAQ ITCH feed starts with pre-market trading.
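
The same conversion can be done in a few lines of Python instead of with the online tools. This is just a worked check of the arithmetic above:

```python
raw = bytes([0x00, 0x00, 0x58, 0xB7])          # the four Seconds bytes
seconds = int.from_bytes(raw, byteorder="big")  # ITCH integers are big-endian

hours, rem = divmod(seconds, 3600)
minutes, secs = divmod(rem, 60)
clock = f"{hours:02d}:{minutes:02d}:{secs:02d}"
```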

I also used this online tool to convert ASCII to Hex:

http://www.asciitohex.com/

The next message in this feed:

System Event Message

0x06 Length
0x53 Message Type ‘S’
0x11 Timestamp byte 1 (nanoseconds since last Timestamp Seconds)
0xcd Timestamp byte 2
0x6c Timestamp byte 3
0xc9 Timestamp byte 4
0x4f Event Code

The possible event codes are:

  • Daily
    • ‘O’ – (0x4f) – Start of Messages
    • ‘S’ – (0x53) – Start of System Hours
    • ‘Q’ – (0x51) – Start of Market Hours
    • ‘M’ – (0x4d) – End of Market Hours
    • ‘C’ – (0x43) – End of Messages
  • As Needed – In the event of an emergency market condition
    • ‘A’ – (0x41) – Emergency Market Condition – Halt
    • ‘R’ – (0x52) – Emergency Market Condition – Quote Only Period
    • ‘B’ – (0x42) – Emergency Market Condition – Resumption
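
A lookup table is enough to decode the event code. Here is a small Python sketch using the bytes of the System Event message shown above; the function name `decode_system_event` and the dictionary are my own, not part of any library.

```python
import struct

EVENT_CODES = {
    "O": "Start of Messages",
    "S": "Start of System Hours",
    "Q": "Start of Market Hours",
    "M": "End of Market Hours",
    "C": "End of Messages",
    "A": "Emergency Market Condition - Halt",
    "R": "Emergency Market Condition - Quote Only Period",
    "B": "Emergency Market Condition - Resumption",
}


def decode_system_event(body: bytes) -> tuple:
    """Return (nanoseconds, event description) for a System Event body."""
    msg_type, nanos, code = struct.unpack(">cIc", body)
    if msg_type != b"S":
        raise ValueError("not a System Event message")
    return nanos, EVENT_CODES.get(code.decode("ascii"), "Unknown")


nanos, event = decode_system_event(bytes.fromhex("5311cd6cc94f"))
```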

I included the “As Needed” Event Codes, because that is when things will get real bad, and you will probably want your FPGA to get ready to liquidate everything in your portfolio… More on that in the future, but for now we must stay focused on trading Trump and his tweets.

So 0x4f means start of messages.  Okay, continuing, I see a few more System Event messages as well as some Timestamp messages.  I skip these for now and come to the next message, which sounds interesting: a Stock Directory Message.

Stock Directory Message

byte # length data description
0 1 0x14 (decimal: 20) Length
1 1 0x52 Message Type ‘R’
2-5 4 1d  48 bd c7 Timestamp Nanoseconds
6-13 8 41 20 20 20 20 20 20 20 Stock (0x20 is a space, 0x41 is A) So this is for Agilent (http://finance.yahoo.com/quote/A?p=A)
14 1 4e (‘N’) Market Category – simply the exchange:

    • N – NYSE
    • A – NYSE Amex
    • P – NYSE Arca
    • Q – NASDAQ Global Select Market
    • G – NASDAQ Global Market
    • S – NASDAQ Capital Market
    • Z – BATS BZX Exchange
15 1 20 (space) Financial Status Indicator – Indicates when a firm is not in compliance with NASDAQ continued listing requirements.  This sounds like a way to find distressed stocks that are about to be delisted – lots of volatility with low volume.

  • D – Deficient
  • E – Delinquent
  • Q – Bankrupt
  • S – Suspended
  • G – Deficient and Bankrupt
  • H – Deficient and Delinquent
  • J – Delinquent and Bankrupt
  • K – Deficient, Delinquent and Bankrupt
  • Space – Company is in compliance
16-19 4 00 00 00 64 (decimal: 100) Round lot size
20 1 4e (N) Round lots only – indicates if NASDAQ only accepts orders in round lot size

  • Y – only round lots are accepted in this stock
  • N – odd/mixed lots are allowed
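
The table above maps directly onto a fixed-size record. As a sketch, here is how the same 20 bytes could be unpacked in Python with the `struct` module. The field names in `record` are my own labels, and the length byte is left out of the body.

```python
import struct

# type, nanoseconds, stock, market category, financial status,
# round lot size, round lots only (20 bytes, big-endian, no padding)
STOCK_DIRECTORY = struct.Struct(">cI8sccIc")

body = bytes.fromhex(
    "52" "1d48bdc7" "4120202020202020" "4e" "20" "00000064" "4e"
)
msg_type, nanos, stock, market, fin_status, lot_size, lots_only = STOCK_DIRECTORY.unpack(body)

record = {
    "stock": stock.decode("ascii").rstrip(),      # strip the space padding
    "market_category": market.decode("ascii"),
    "in_compliance": fin_status == b" ",
    "round_lot_size": lot_size,
    "round_lots_only": lots_only == b"Y",
}
```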

Okay, so we can now decode a couple of message types.  But this is not important and is only distracting us from the goal.  We can decode all of these messages inside a MicroBlaze Soft Core Processor running embedded C++ inside the FPGA, and then send a message to the rest of the FPGA telling it how to deal with the rest.  So let’s keep scanning this Market Data file and find a message that helps us with trading.

I encounter a message of type “Reg SHO Short Sale Price Test Restricted Indicator”.  I did some googling and found that Reg SHO is yet another loophole-filled attempt by regulators to prevent naked shorting.  My favorite sentence from this Investopedia page http://www.investopedia.com/terms/r/regsho.asp is:

“a broker has reasonable belief that the equity to be short sold can be borrowed and delivered to a short seller on a specific date before short selling can occur”

Anyway, let’s keep scanning this file.  I find a message of type ‘H’ – Stock Trading Action Message, this sounds pretty good…

Stock Trading Action Message

byte # length data description
0 1 0x13 (decimal: 19) Length
1 1 0x48 Message Type ‘H’
2-5 4 1d 50 bd 0d Timestamp Nanoseconds
6-13 8 41 42 2d 20 20 20 20 20 Stock Symbol: AB-
14 1 0x54 (T) Trading State

  • H – Halted across all U.S. equity markets / SROs
  • P – Paused across all U.S. equity markets / SROs (NASDAQ-listed securities only)
  • Q – Quotation only period for cross-SRO halt or pause
  • T – Trading on NASDAQ
15 1 20 (space) Reserved
16-19 4 20 20 20 20 Trading Action Reason – I guess all spaces means “no reason” or “nothing to worry about”

Okay, so it looks like the Trading Action Message just tells us when trading is officially open…etc.  Nothing of use here for the FPGA plan.  Anyway, on to the next message type that I find.

I continue running my analysis script and I find the following:

  • Add Order with MPID
  • Add Order Message
  • Order Delete Message

This time I will not analyze the “Add Order with MPID” message unless it appears to have something of use.  The Message Type is ‘F’, and it looks like this message will be useful to us.  Additionally, MPID stands for “Market Participant Identifier” and appears to simply be your broker.  See:

https://www.interactivebrokers.com/en/index.php?f=705

So, here is the analysis:

Add Order with MPID Attribution

byte # length data description
0 1 0x22 (decimal: 34) Length
1 1 0x46 Message Type ‘F’
2-5 4 02 43 d1 46 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 16 b2 Order number – unique and is assigned during order entry
14 1 0x42 Type of order

  • ‘B’ – (0x42) – Buy
  • ‘S’ – (0x53) – Sell
15-18 4 00 00 00 64 (decimal: 100) Number of shares
19-26 8 5a 56 5a 5a 54 20 20 20 The stock symbol: ZVZZT.  Funny… this looks like a test symbol, take a look at the Bloomberg quote: https://www.bloomberg.com/quote/ZVZZT:US
27-30 4 00 02 97 ac Display price, converts to: $16.99
31-34 4  4c 45 48 4d NASDAQ MPID: LEHM (Is that Lehman Brothers?)
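
As a cross-check of the table, the 34 bytes of this message can be unpacked in Python with a single `struct` format. This is only a sketch; the variable names are mine, and the length byte is not part of the body.

```python
import struct

# type, nanoseconds, order number, side, shares, stock, price, MPID
ADD_ORDER_MPID = struct.Struct(">cIQcI8sI4s")

body = bytes.fromhex(
    "46" "0243d146" "00000000000016b2" "42" "00000064"
    "5a565a5a54202020" "000297ac" "4c45484d"
)
msg_type, nanos, order_ref, side, shares, stock, price, mpid = ADD_ORDER_MPID.unpack(body)

symbol = stock.decode("ascii").rstrip()   # drop the trailing space padding
broker = mpid.decode("ascii")
```

Dropping the trailing `4s` from the format gives the 30-byte layout of the Add Order message without MPID.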

So I am looking at the NASDAQ ITCH 4.1 specification (as you have obviously realized by now) and I see that the message immediately before “Add Order with MPID Attribution” is “Add Order – No MPID Attribution”, which is exactly the same as the message above, but without the last field.  The only other difference is that the message type is ‘A’.

A word about Display Price

So the specification says to treat all Integer price values as fixed-point numbers, with the first 6 digits representing the integer portion and the remaining 4 representing decimal digits, which leads to a maximum price of 200,000.0000.

So in the example message above:

000297ac is equal to 169900, which is 16.9900, or simply $16.99.

(see http://www.binaryhexconverter.com/hex-to-decimal-converter)
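
In code, the conversion is an integer division by 10,000. Here is a small Python sketch (the function name `format_price` is made up) that stays in integer arithmetic so no floating-point rounding creeps in:

```python
def format_price(raw: int) -> str:
    """Format a 4-byte ITCH price (4 implied decimal places) as text."""
    whole, frac = divmod(raw, 10_000)
    return f"{whole}.{frac:04d}"


display_price = format_price(0x000297AC)
```

An FPGA can compare the raw integers directly; the decimal form is only needed when showing the price to a person.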

Okay, so we know when the market opens and when trading is started, stopped, etc.  We know when a new order is added to the NASDAQ order book.  What is next? Order Delete!

Order Delete is different from Order Cancel.  Order Delete means that the entire order is removed or deleted from the Order Book; Order Cancel means that only a portion of an order is cancelled.
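
The difference is easy to see in a toy model. In this Python sketch the book is just a dictionary from order reference number to remaining shares; the function names and the choice to drop an order once its share count reaches zero are my assumptions, not something taken from the specification.

```python
from typing import Dict


def apply_delete(book: Dict[int, int], order_ref: int) -> None:
    """Order Delete: the whole order leaves the book."""
    book.pop(order_ref, None)


def apply_cancel(book: Dict[int, int], order_ref: int, cancelled_shares: int) -> None:
    """Order Cancel: only part of the order is removed."""
    if order_ref not in book:
        return  # unknown order, nothing to do
    remaining = book[order_ref] - cancelled_shares
    if remaining > 0:
        book[order_ref] = remaining
    else:
        del book[order_ref]  # assumption: nothing left means the order is gone


book = {1: 100, 2: 500}     # order reference -> shares still open
apply_cancel(book, 1, 10)   # order 1 shrinks from 100 to 90 shares
apply_delete(book, 2)       # order 2 disappears entirely
```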

Order Delete Message

byte # length data description
0 1 0x0D (decimal: 13) Length
1 1 0x44 Message Type ‘D’
2-5 4 21 95 1b cf Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 031 6b Order reference number

Order Cancel Message

byte # length data description
0 1 0x (decimal: ) Length
1 1 0x58 Message Type ‘X’
2-5 4 21 dd db e1 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 25 42 b7 Order reference number
14-17 4 00 00 00 0a Number of shares being removed from order.  Hmmm… is someone backing out of a position? Scared!!! that means sell! Eh, maybe not, they only removed 10 shares.  But what if they are trading a high-priced stock like Amazon or Google? Oops, I meant Alphabet… and what about Berkshire Hathaway?

Okay, so we have Add Order, Remove Order, Shorten Order.  What is left? Order Execute and Order Modify.  I will go with Order Executed.

Order Executed Message

byte # length data description
0 1 0x19 (decimal: 25) Length
1 1 0x45 Message Type ‘E’
2-5 4 18 45 cd 07 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 e9 ca Order reference number
14-17 4 00 00 03 e8 Number of shares executed
18-25 8 00 00 00 00 00 00 00 01 NASDAQ generated day-unique match number.

Order Executed with Price Message

byte # length data description
0 1 0x1e (decimal: 30) Length
1 1 0x43 Message Type ‘C’
2-5 4 01 74 a8 ca Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 31 9b Order reference number
14-17 4 00 00 00 64 (decimal: 100) Number of shares executed.
18-25 8 00 00 00 00 00 00 03 b0 NASDAQ generated day-unique match number.
26 1 4e Printable

  • ‘Y’ – (0x59)
  • ‘N’ – (0x4e)
27-30 4 00 02 9c 5c Execution Price: 17.1100, or $17.11

I think that I have enough information to get started with the FPGA portion of this.  So I will skip analyzing the Order Replace Message for now.  So….

We have analyzed the following message types:

  • Timestamp Message
  • System Event Message
  • Stock Directory Message
  • Stock Trading Action Message
  • Add Order – No MPID Attribution (well, kind of)
  • Add Order with MPID Attribution
  • Order Delete Message
  • Order Cancel Message
  • Order Executed Message
  • Order Executed with Price Message

What is missing? I am looking at the ITCH 4.1 specification and I see the following message types:

  • Order Replace
  • Trade Message (Non-Cross)
  • Cross Trade Message
  • Broken Trade / Order Execution Message
  • Net Order Imbalance Indicator (NOII) Message
  • Retail Price Improvement Indicator (RPII)

So I can go ahead and analyze the rest of the messages, but I want to get to some coding.  Coding as in drawing some LabVIEW program (no, not writing, but drawing).  Anyway, I will take a look at how to write (aka draw) some LabVIEW for FPGA code that can handle the parsing and interpreting of the necessary information to react or trade to a Trump Event.  Stay tuned.