Some Time with the Arty Artix-7 35T Digilent Board

So I wanted to implement a simple, stripped-down version of the open-source lightweight IP stack “lwIP” inside my LabVIEW FPGA project so that I can handle TCP and UDP data streams.

I do not have a lot of experience with this, and I found that building such a project inside Vivado would take around 3 hours to simulate with all of the source code of the lwIP project embedded in the ELF file.

I ended up purchasing a $99 board from Digilent that uses an Artix-7 35T FPGA:

On this board I was able to run and debug the lwIP source code so that I could figure out how to use it with my configuration.  I created a public GitHub repository with this source code, so if you happen to be trying to learn how to use the MicroBlaze processor with this board, check out:

Enjoy!  I will be working on integrating this lwIP source code into my LabVIEW FPGA project now.

A Diversion for CryptoCurrencies

I spent some time analyzing the Monero cryptocurrency source code to understand the algorithm and how it works, and to see if it is doable on an FPGA via LabVIEW for FPGA, our secret weapon.
I learned that there are 4 steps to the Monero “CryptoNight” algorithm and that step 3 is the part that does the heavy lifting, with around 500k reads and writes to a small section of memory that is 2 megabytes in size.  This section of memory was specifically selected to be a size that coincides with the size of most processor Level 3 caches.  This is supposed to be what makes the algorithm “memory-hard”.
Locks are meant to be broken, codes cracked… and secrets revealed.
I am thinking – what if I put step 3 inside an FPGA and have it use Block RAM?
  • Block RAM is limited on an FPGA, so this may not be worthwhile

Okay, what about DRAM?

  • My FPGA may have DDR3 RAM, but other FPGAs have faster RAM.  If my implementation works well on DDR3 RAM, then I can move it to another FPGA with faster RAM.
  • Will an FPGA using DRAM be faster than a CPU using its L3 cache, taking into account that the FPGA is the only user of its DRAM controller? What about an FPGA with multiple DRAM controllers?
Well, I know that DRAM is “slow” compared to other types of memory, but the difference here is that the FPGA is the only user of the DRAM controller.  On any operating system there are many users: programs, processes, and kernel threads.  So would doing this from an FPGA make the cut?  Would it make that much of a difference?
Well, there is only one way to find out.  Try it out!
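Before building anything, a back-of-the-envelope estimate helps frame the question.  The sketch below is only that: the iteration count is the rough “500k” figure quoted above, while the accesses per iteration and both latency values are placeholder assumptions, not measurements from any CPU or board.  It also assumes every access has to wait for the previous one to finish.

```python
def hashes_per_second(iterations, accesses_per_iteration, latency_ns):
    """Upper bound on hash rate when each memory access waits for the previous one."""
    total_ns = iterations * accesses_per_iteration * latency_ns
    return 1e9 / total_ns


ITERATIONS = 500_000         # "around 500k" from the text above
ACCESSES_PER_ITERATION = 2   # assumption: one read and one write per pass

# Placeholder latencies in nanoseconds; replace them with measured numbers.
for name, latency_ns in [("cache-like (assumed 10 ns)", 10.0),
                         ("DRAM-like (assumed 50 ns)", 50.0)]:
    rate = hashes_per_second(ITERATIONS, ACCESSES_PER_ITERATION, latency_ns)
    print(f"{name}: at most {rate:.1f} hashes/s per pipeline")
```

Plugging in real latencies for a given memory controller shows quickly whether the idea is worth a prototype, and how many parallel pipelines it would take to catch up.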
I have created a github repository with my work so far here:
I went into the Monero C++ source code and saved the following variables to a binary file just before the loop with 500k iterations starts (as of this date, lines 591 and 600):
  • uint64_t a[2]
  • uint64_t b[2]
  • uint8_t *hp_state (<= this is the scratch pad of 2 megabytes of data)
  • uint8_t *hp_state_out (same scratch pad after CryptoNight Step 3 has run)
I implemented a sandboxed C++ version of this code that does CryptoNight Step 3 in an isolated program that runs with the same values each time.
This C++ program works on OSX and Windows (and probably Linux).  It uses Gradle as its build tool, and you can see the source code here:
I then implemented the same algorithm, based on the same source file, in LabVIEW for Windows.  The values match, so we have a working C++ version and a working LabVIEW for Windows version, and now we can determine if an FPGA version will be worth it.
Please note that the LabVIEW version is not optimized code, and I am not a LabVIEW for Windows developer, which is probably why it runs so slowly… for now.  Yes, it takes over an hour to create one hash.  However, I have consulted with some LabVIEW experts, and they have told me what I should do to make it faster.  I will start working on that, and in the meantime, you can take a look at the ever-changing source code to see what the algorithm involves.  Remember, LabVIEW code is very easy to understand, so this may be the “flow-chart” explanation of what a cryptocurrency miner looks like.
See the LabVIEW code here:
(Requires LabVIEW 2017 to view…)  I will add some png versions of the code soon, but first I want to do some cleaning…

Issues with LabVIEW and Lack of Relative Directory References

So I wanted to mention that I have all of my LabVIEW (and Vivado) code saved on a RAID-1 mirrored location on my network.  From each of my workstations, I map the same network location to my Z drive.  This way any and all issues of LabVIEW referring to absolute paths go away.  I do not develop in “offline” mode; I am always connected to one of my machines, whether by sitting directly in front of the machine or via a Remote Desktop Connection.  If you use a laptop, you could always split a piece off of your normal root partition and make a Z drive for yourself.

To do this yourself, create a network share and open it from Windows Explorer, and then select “Map Network Drive”.  This option will either be an icon or a menu option, and this all depends on the version of Windows that you are using.

So in my case, I have:

\\192.168.0.x\RAID-1 mapped to Z:\

So I work from:


More Code Posted to Github

So I have figured out how to use the MicroBlaze Core with an AXI-Stream FIFO, and I have also figured out how to export a project from Vivado by using the Vivado “Write Project TCL” option.

See the following project:

You have to re-generate the Vivado Project and create a new SDK workspace in order to get this to work on your machine.

How to regenerate a Vivado project from a TCL script:

Step 1 – Start Vivado

Step 2 – Change directory to where the Tcl script is located

Make sure you escape every Windows backslash with another backslash:

cd "Z:\\work\\git\\LabVIEW_Fpga\\06_MicroBlaze\\04_lwIP_Ex\\lwIP_Ex"

Step 3 – Source the tcl script

source init.tcl

That’s it!
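If you regenerate projects often, the escaping in Step 2 is easy to get wrong by hand.  Here is a small helper sketch that builds the cd line for you; the function name and its flag are my own invention for illustration.  It also offers forward slashes, which Tcl accepts as path separators on Windows.

```python
from pathlib import PureWindowsPath


def tcl_cd_command(windows_path: str, use_forward_slashes: bool = False) -> str:
    """Build a `cd` line that can be pasted into the Vivado Tcl console."""
    if use_forward_slashes:
        # Z:\work\git becomes Z:/work/git, which needs no escaping at all
        path = PureWindowsPath(windows_path).as_posix()
    else:
        # Double every backslash so Tcl does not treat it as an escape character
        path = windows_path.replace("\\", "\\\\")
    return f'cd "{path}"'


print(tcl_cd_command(r"Z:\work\git\LabVIEW_Fpga"))
print(tcl_cd_command(r"Z:\work\git\LabVIEW_Fpga", use_forward_slashes=True))
```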

Note: I am still in the process of converting all of my projects to use this method, if you want a quick taste, check out the project here:

New Code Added to GitHub – MicroBlaze MCS, IO Bus and LabVIEW

I just uploaded some code to GitHub that is a full demonstration on how to use LabVIEW FPGA 2017, the MicroBlaze MCS core and the IO Bus that is attached to the MicroBlaze MCS.

Clone the following repository:

and open the LabVIEW project:

Look at the Vivado 2015.4 project:

And finally, using the Xilinx SDK set your workspace to:

Now if you do not have access to LabVIEW from your current machine, I have included a screen shot for each VI with the words “Front” or “Back” added to the filename, and in the case where there are many case structures, I have added the case structure element number.

The example has three features:

  • Send a packet of data over the IO Bus to the MicroBlaze MCS and read the same packet back over the IO Bus
  • Write a value to GPI channel #1 and read the value multiplied by 2 over GPO channel #2
  • Read the values of GPO channels 1, 2, and 3

Now I am continuing to work on integrating the 10 gigabit ports with the MicroBlaze MCS and on getting the lwIP TCP/IP stack working on this board – the NI PXIe-6592R.


How to Use the MicroBlaze Micro Controller System from LabVIEW

The MicroBlaze Micro Controller System (MCS) is a soft-core processor that can be customized and placed inside the fabric of your FPGA.  The uses of this are limitless.


Source Code

Browse the source code online via github by visiting the following link:

To download the source code, clone the entire repository with:

  • git clone

You can also download a zip file with the entire repository:

What this Guide Accomplishes

This guide shows how to use the version of Xilinx Vivado that is bundled with the “LabVIEW 2017 FPGA Module Xilinx Compilation Tools” to create a Vivado FPGA design that uses a MicroBlaze MCS core, to create and overlay an executable on top of that core using Xilinx SDK 2015.4, and finally how to import this design and to run it on the National Instruments PXIe-6592R High-Speed Serial Instrument.

References for Further Reading

Slide Share version


Video Demonstration of Run

This guide is broken down into the following sections:

  • Section 1 – Vivado – Create MicroBlaze MCS design
  • Section 2 – Xilinx SDK – Write a C Program for the MicroBlaze MCS
  • Section 3 – Vivado – Add Binary (ELF) to MicroBlaze MCS design
  • Section 4 – LabVIEW 2017 – Import design to LabVIEW and run on FPGA

Section 1 – Vivado – Create MicroBlaze MCS Design

Step 1 – Start Vivado 2015.4

If you haven’t set up a shortcut, just run the following batch file:

  • C:\NIFPGA\programs\Vivado2015_4\bin\vivado.bat

Step 2 – Create a New Project

Step 3 – Select Project Location and Project Name

I created my project in the following location: (screenshot is out of date)

  • G:/work/git/LabVIEW_Fpga/05_MicroBlaze_Mcs/01_MicroBlaze_MCS_GPIO

I named my project “MicroBlaze_Mcs_GPIO”

Step 4 – Just click next, I did not set anything here

Step 5 – Make sure the “Target language” is VHDL.

Step 6 – Just click next in the Add Existing IP page

Step 7 – Click next, we are not adding any constraints, nor do we have to in this project.

Step 8 – Select the appropriate FPGA part

We are using the PXIe-6592R board for this example and the FPGA has the following specifications:

  • Family: Kintex-7
  • Speed Grade: -2
  • Package: ffg900
  • Part #: xc7k410t

Step 9 – Click finish to create your new project

Step 10 – Here is what the project looks like after creation.

Step 11 – Click “Create Block Design”

Vivado makes it easy to create designs.  By clicking on “Create Block Design”, you can make a design that uses several cores and is easy to synthesize, package and export to an application such as LabVIEW.

Step 12 – Name the design

I like using the d_ prefix followed by a short description of my design.  Since we are using the MicroBlaze MCS, I name my design “d_mcs”

Step 13 – Here is the blank design

Step 14 – Click the “Add IP” button to get a list of all Xilinx available cores

Step 15 – Make sure you select the “MicroBlaze MCS” and not the “MicroBlaze”

The MicroBlaze core is more customizable and supports more features.  I will cover it in a future article.

Step 16 – Here is the design after adding the MicroBlaze MCS core.  Notice that no peripherals have been added, nor has the core been configured.

Step 17 – Right click (away from any terminals) and customize the block.

Step 18 – Configure the MicroBlaze MCS core

Step 19 – Set the memory size to 64 KB and enable the IO Bus if you like.  I will use the IO Bus in a future article

Step 20 – Enable the General Purpose Output (GPO) channel 1, 32 bits is fine.

Step 21 – Enable the General Purpose Input (GPI) channel 1, 32 bits is fine.

Step 22 – Click on “Run Connection Automation” at the top of the window containing the design. Check the Clk box.

Step 23 – The defaults for GPIO2 should be fine as well.

Step 24 – Same for Reset.  Note how the Reset Polarity is set to ACTIVE_HIGH.  This will not matter for our design, but it will in other cases.  For example, when I was working with the Arty Artix-7 board, I had to flip the reset polarity for that board to work.

Step 25 – Now if you enabled the IO Bus, you have to manually make it external.  Make sure you right-click and that the cursor becomes pencil-like as shown below.

Step 26 – Click “Make External”

Step 27 – Here is what it looks like after clicking “Make External”

Step 28 – Now go back to the block design and right-click on the design containing the MicroBlaze MCS and select “Generate Output Products”

Step 29 – Global should be fine.

Step 30 – When Vivado is finished, you should see the following.

Step 31 – Now right-click on the design and select “Create HDL Wrapper”

Step 32 – Either option should be fine here, but I like having Vivado manage this for me automatically in case I make changes to my block design.

Step 33 – Notice how there is a new VHDL file, that contains an instantiation of everything in our Block Design.

Step 34 – A preview of the contents of the VHDL design wrapper.  From LabVIEW we will be importing this wrapper.

Section 2 – Xilinx SDK – Write a C Program for the MicroBlaze MCS

Step 1 – Now we take a break from Vivado and launch the Xilinx SDK.  You can normally export the hardware from Vivado and ask it to launch the SDK, but a bug in this version of Vivado (2015.4) prevents us from doing so when the design uses the MicroBlaze MCS.  Note that this does not apply to designs using the MicroBlaze core.

Step 2 – Create a directory named “MicroBlaze_Mcs_GPIO.sdk” as a sub-directory inside the Vivado project directory and set this to be your workspace.

Normally, if you select “File->Export->Hardware”, this directory will automatically be created for you, but remember that the hdf file that will exist in the root directory will not work due to a bug in Vivado 2015.4, so I typically create the directory myself.

Step 3 – Click “File->New->Other” to get to the New Project Wizard.

Step 4 – Select Hardware Specification.

Step 5 – Click the “Browse” button and add the following file.  Make sure you select the file ending with “_sdk.xml”.

The full path from the root of the Vivado project is as follows:


Step 6 – The default name is okay, just click Finish after selecting the XML file from the step above.

Step 7 – What the new project looks like.  Notice the Target FPGA Device and the address map.

Step 8 – Now create a new Application Project.

Step 9 – Call this project “gpio_rw”, our project will read from GPI-1 and write out of GPO-1.

Step 10 – Select Empty Application, I will provide the code that you should insert.

Step 11 – Right click on the “src” directory to create a new C source.

Step 12 – After right-clicking, select “New->Source File”

Step 13 – Name this file main.c

Step 14 – Copy and Paste the following source code:

/*
 * main.c
 * Created on: Jun 17, 2017
 * Author: John
 */
#include "xparameters.h"
#include "xil_cache.h"
#include "xiomodule.h"

int main()
{
    Xil_ICacheEnable();
    Xil_DCacheEnable();

    print("---Entering main---\n\r");

    {
        /* Both handles are initialized against the same IO Module instance. */
        XIOModule gpioIn_1;
        XIOModule_Initialize(&gpioIn_1, XPAR_IOMODULE_0_DEVICE_ID);

        XIOModule gpioOut_1;
        XIOModule_Initialize(&gpioOut_1, XPAR_IOMODULE_0_DEVICE_ID);

        u32 gpi_1;

        /* Read GPI channel 1 and fall back to 50 if it reads zero. */
        gpi_1 = XIOModule_DiscreteRead(&gpioIn_1, 1);
        if (gpi_1 == 0)
            gpi_1 = 50;

        /* Write twice the input value to GPO channel 1. */
        XIOModule_DiscreteWrite(&gpioOut_1, 1, gpi_1 + gpi_1);
    }

    print("---Exiting main---\n\r");

    return 0;
}

Step 15 – Building should automatically take place when you save the source file.  But click “Project->Build All” to confirm this.

Section 3 – Vivado Part 2

Step 1 – Go back to your Vivado project and select “Tools->Associate ELF Files…”

Step 2 – Click the “…” button next to the elf file, which should be “mb_bootloop_le.elf” for both Design Sources and Simulation Sources.

Step 3 – Add your binary, which should be located in the following directory:


Step 4 – Click ok, the new elf file should automatically be selected.

Step 5 – Now run the Synthesis.

Step 6 – You can verify that synthesis is running by looking at the top-right of the Vivado Window.

Step 7 – After a couple of minutes when Synthesis finishes, click Cancel, because we do not want to run the implementation.  LabVIEW will handle that!

Step 8 – Write a checkpoint by clicking “File->Write Checkpoint”

Note that you must write a Synthesized Checkpoint, which means that you have to have followed the steps above and not have run Implementation.  If you run implementation, the checkpoint file will be larger than if you only ran synthesis. In case you made a mistake and implemented your design, simply open the Synthesized design and write a new checkpoint.

Section 4 – LabVIEW 2017

Step 1 – LabVIEW 2017 splash screen.  I like the new look.

Step 2 – Create a new Project.

Step 3 – Blank Project is fine.

Step 4 – Here is what an empty project looks like before you save it.

Step 5 – Add a new target, for this tutorial we are going to be using the PXIe-6592R High-Speed Serial Instrument.

Step 6 – If you do not have live hardware plugged in, select “New target or device”.  If you do, it should show up automatically.

Step 7 – After adding the PXIe-6592R

Step 8 – Click File->Save and save the project.

Step 9 – Add a new FPGA-scoped VI.  Make sure you right-click on the FPGA-target for this.

Step 10 – I opened the VI and cleaned up the windows and show the block diagram here.

Step 11 – Now click “File->Save” to save the FPGA-scoped VI

Step 12 – I usually save FPGA-scoped VIs in a sub-directory named after the FPGA target.  In this case “Fpga-6592”

Step 13 – Right-click anywhere in the blank white space and select “Timed Loop” to add a Single-Cycle Timed Loop.

Step 14 – Add a new FPGA clock since the default of 40 MHz is not suitable for our 100 MHz MicroBlaze MCS.

Step 15 – Just type 100 in the “Desired Derived Frequency” box and click OK.

Step 16 – Now right-click on the clock input to the Single-Cycle Timed Loop and select Create->Constant

Step 17 – Here is what the loop looks like before selecting the clock type.

Step 18 – The dropdown should reveal 2 clocks.  Select 100 MHz.

Step 19 – After selecting 100 MHz

Step 20 – Now to configure the CLIP node.  CLIP stands for Component Level IP.

Step 21 – Click Component-Level IP in the left, and then click on the Create File icon on the right.

Step 22 – Add the checkpoint dcp file, and the wrapper vhdl file.

The wrapper file can be found here:

I also have a sample checkpoint file, but you ideally want to create this yourself:

Step 23 – Depending on your target, you may have to limit the device families. 

Step 24 – Click Check Syntax.  This requires the Vivado Compilation Tools to be installed in order to work.

Step 25 – Set the reset_rtl Signal type to be reset, the clock_rtl signal type to be clock, and set the data type for the gpio_rtl_tri_i and gpio_rtl_tri_o to be U32.

Step 26 – Nothing to do here, just click next.

Step 27 – Nothing to do here, just click next.

Step 28 – Use the shift key and the mouse to select all signals on the left, assign them all to the clock_rtl clock domain, and require them to be used inside a Single-Cycle Timed Loop.

Step 29 – Click Finish

Step 30 – And now you have a CLIP available to your project.

Step 31 – Now create an instance of this CLIP by clicking New->Component Level IP

Step 32 – Select the IP from the drop-down.  I usually name the instance to match what is in the wrapper VHDL file.  In older versions of LabVIEW this was required, but I am not sure if that is still the case.

Step 33 – Select the appropriate clock

Step 34 – What the project looks like with the added CLIP after expanding it.

Step 35 – Inside the Single-Cycle Timed Loop, right-click and select an “I/O Node”

Step 36 – From here select the gpio_rtl_i, and gpio_rtl_o signals.

Step 37 – Add a control to the input, and an indicator to the output.

Step 38 – Create a Build Specification

Step 39 – And build it! 

You can now run the top-level VI.  Video demonstration to come shortly.

Filter Market Data Messages in an FPGA – part 3

Note: Skip directly to the source code download by following this link:

This post will cover the next iteration of implementing an OrderBook inside an FPGA that is based on a NASDAQ ITCH 4.1 market data feed.

Some time has passed and I have finally found enough time to finish all the code changes required for the two (2) components listed below, along with the requisite test harnesses to validate.

Starting off, here are the components of an FPGA-based OrderBook

  • ITCH Parser
    • FPGA loop that listens to incoming data from a Network Interface Card that parses, filters and translates each incoming message and sends the appropriate message/command to the OrderBook loop.
  • OrderBook
    • FPGA loop that reads and writes Orders to memory using an insertion sort algorithm.  The OrderBook is currently able to support only one instrument and one side.  Its capacity is 1,000 elements, which through the power of LabVIEW for FPGA can be easily adjusted, but that is not important right now.  The OrderBook currently supports two commands: add order and get all orders.  The get all orders command is meant to be used by a user or client application for trading and other purposes.
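To make the OrderBook behaviour concrete, here is a minimal software model of the two commands.  It is a sketch and not a translation of the LabVIEW code: the Order fields, the ascending sort by price, and the choice to reject orders once the book is full are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int   # hypothetical field names, not taken from the LabVIEW code
    price: int
    shares: int


class OrderBook:
    """One instrument, one side, fixed capacity, kept sorted by insertion."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self._orders = []

    def add_order(self, order: Order) -> bool:
        if len(self._orders) >= self.capacity:
            return False  # assumption: a full book rejects new orders
        self._orders.append(order)
        i = len(self._orders) - 1
        # Shift higher-priced entries up one slot until the new order fits
        while i > 0 and self._orders[i - 1].price > order.price:
            self._orders[i] = self._orders[i - 1]
            i -= 1
        self._orders[i] = order
        return True

    def get_all_orders(self):
        return list(self._orders)


book = OrderBook()
for oid, price in [(1, 300), (2, 100), (3, 200)]:
    book.add_order(Order(oid, price, 100))
print([o.price for o in book.get_all_orders()])
```

Each insert walks at most the current length of the book, which is the same linear shifting an insertion sort over a fixed block of memory performs.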

Using Test Driven Development, Here Are the Test Harnesses

  • ITCH Parser Test Harness
    • Input: A file containing raw ITCH 4.1 market data messages (generated using
    • Output: Array of OrderBook operations
  • OrderBook Test Harness
    • Input: An array of OrderBook operations
    • Output: A sorted array of Orders

What Does a Test Harness Look Like?

Here is a screenshot of the Front Panel diagram for the ITCH Parser Test Harness:

and here is a flow chart of what is going on:

What exactly is going on? The VI reads the file containing raw NASDAQ ITCH market data messages and sends them into the FPGA test harness via the Host-to-Target DMA FIFO “HT-TEST_IN”.  The FPGA test harness is located in the “Tests” folder.  It passes the raw ITCH data as is to the ITCH Parser, which parses, normalizes and filters each message, keeping Add Order message types for symbol AAPL only.  The parser then sends an OrderBook command for each appropriate message back out to the FPGA test harness, which sends the results up to the host.
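The filtering step can be sketched on the host side as follows.  This is an illustration only: it assumes the ITCH 4.1 “Add Order – No MPID Attribution” layout (type ‘A’, 4-byte nanoseconds, 8-byte order number, side, 4-byte shares, 8-byte space-padded symbol, 4-byte price), and the shape of the returned command is a made-up stand-in for whatever the FPGA loop actually emits.

```python
import struct

# type, nanoseconds, order number, side, shares, stock, price (big-endian, 30 bytes)
ADD_ORDER = struct.Struct(">cIQcI8sI")


def to_orderbook_command(message: bytes, symbol: str = "AAPL"):
    """Return an add-order command for matching messages, otherwise None."""
    if len(message) != ADD_ORDER.size or message[:1] != b"A":
        return None  # not an Add Order message, so the parser skips it
    _, _, order_no, side, shares, stock, price = ADD_ORDER.unpack(message)
    if stock.decode("ascii").rstrip() != symbol:
        return None  # a different instrument
    return {
        "command": "add_order",   # hypothetical command name
        "order_no": order_no,
        "side": side.decode("ascii"),
        "shares": shares,
        "price": price,           # raw integer price field
    }


sample = ADD_ORDER.pack(b"A", 0, 42, b"B", 100, b"AAPL    ", 1234500)
print(to_orderbook_command(sample))
```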

And for the OrderBook Test Harness

Here is what it would look like in a production system

Why is this so important?

Well, normally to create an FPGA-based anything, one needs to use Verilog, VHDL or one of the numerous “high-level” design languages.  Here you can accomplish the same thing with a really great programming interface that matches the Verilog programming model, only with a graphical interface.

This means you can create a custom FPGA-based solution, reduce your datacenter power usage, increase your application’s performance, and reap the rest of the great benefits of FPGA-based computing.

I encourage you to download the source code for this and to see for yourself what LabVIEW for FPGA can do for you and to then try it in one of your own applications.

Stay Tuned… What is next?

  1. Hook the ITCH Parser up to an actual Network Interface Card, preferably a 10 Gigabit, since I already own the hardware to do so.
  2. Hook up either a MicroBlaze processor or the host computer to the OrderBook so that something can be done with the OrderBook data itself.




Filter Market Data Messages in an FPGA – part 2

Skip directly to the source code here:

So what now?  We know what a NASDAQ ITCH 4.1 market data message looks like.  The format is very simple: there is some – yes – ASCII data in the message format, and all messages are preceded by the message length.  Having the length precede the message makes it very easy to interpret a feed from inside an FPGA.

What to do first? Well, what does eXtreme Programming say to do? It says keep it simple.

So, I am working on a LabVIEW 2016 program that does the following:

  1. Open a NASDAQ ITCH file
  2. Send it one byte at a time into the FPGA over the PCIe bus
  3. FPGA reads the feed, skips over message types that it does not know how to handle, and parses only messages of a specific type.
  4. Send back to the host computer a statistic – any statistic for now.

If you are unfamiliar with National Instruments products and LabVIEW, go here to learn more:

Step 1 – Open a NASDAQ ITCH file

The first step is to create a LabVIEW for Windows program – as opposed to LabVIEW for FPGA – that reads in an entire NASDAQ ITCH 4.1 file.  This file is quite big, so I wrote a quick Python script (using Ryan Day’s code as a guide) and generated files with the following:

  • 1 Timestamp message
  • 5 Timestamp messages
  • 50 Timestamp messages

The Python script can also output the seconds field from each message read in.  See the python script here:

Now I have three files, named: T.itch, T.5.itch, and T.50.itch, and I will write a LabVIEW program to send all data from one of the files above in to the FPGA.
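For readers without the script at hand, here is a minimal sketch of how such test files could be produced.  It is not the script referenced above: it assumes each message in the capture is preceded by a two-byte big-endian length, and the function names are made up for this example.

```python
import struct


def iter_messages(data: bytes):
    """Yield each message payload from a length-prefixed ITCH byte string."""
    offset = 0
    while offset + 2 <= len(data):
        (length,) = struct.unpack_from(">H", data, offset)
        payload = data[offset + 2: offset + 2 + length]
        if len(payload) < length:
            break  # truncated trailing message
        yield payload
        offset += 2 + length


def first_timestamps(data: bytes, count: int) -> bytes:
    """Keep only the first `count` Timestamp ('T') messages, still length-prefixed."""
    out = bytearray()
    kept = 0
    for payload in iter_messages(data):
        if kept == count:
            break
        if payload[:1] == b"T":
            out += struct.pack(">H", len(payload)) + payload
            kept += 1
    return bytes(out)


# Usage sketch; the input file name is a placeholder:
# with open("full_feed.itch", "rb") as f:
#     data = f.read()
# for n, name in [(1, "T.itch"), (5, "T.5.itch"), (50, "T.50.itch")]:
#     with open(name, "wb") as out:
#         out.write(first_timestamps(data, n))
```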

Step 2 – Send Data From One File to FPGA via DMA FIFO

This requires knowledge of LabVIEW for Windows and some knowledge of LabVIEW for FPGA.  I wrote a simple User Interface that allows you to select a file.  That file is then sent to the FPGA using a Host-to-Target FIFO 1 byte at a time.  Since viewing a LabVIEW vi requires that you have LabVIEW installed on your computer, I took some screen shots of the LabVIEW code and placed them here:

Step 3 – Interpret the Feed Inside the FPGA

Right now I have generated files that contain 1, 5, and 50 Timestamp messages.  That means, for each message that is encountered, I will extract the timestamp, which will be in seconds, save it into a local FPGA variable, and send the value back up to the host.

Step 4 – Send Data Back to Host

The statistic will be the seconds portion of each message that was passed in.  The seconds field is a 32-bit integer, so the Target-to-Host DMA FIFO will be a 32-bit integer.  Here is a quick screen shot of the FPGA top-level loop, which reads the input data stream one byte at a time:

The purple colored box is how the FPGA receives the data from the host.  In a live application, the purple colored box, also known as the “Read (FIFO Method)” can have this data come directly from a 10 gigabit connection, or from another loop inside the FPGA.

(Read more about this method on National Instruments website:

As the data comes in, a counter is started at 0, and depending on the element count, the data is stored in different output variables.  The first 2 bytes are the message length, the 3rd is the message type, the 4th, 5th, 6th, and 7th elements are the Seconds portion of the message.

The counter is then compared to the message length variable, and when they are equal, the output variable “Message Done” is set to true.  Here is the bottom portion of the screen shot from above:
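The counting logic described above can be modelled in software one byte per call, the same way the FPGA loop sees one byte per iteration.  This is a sketch of the idea rather than a translation of the LabVIEW diagram; the class and attribute names are invented, and the comparison against the message length here excludes the two length bytes themselves.

```python
class TimestampFramer:
    """Byte-at-a-time model of the framing counter for Timestamp messages."""

    def __init__(self):
        self.count = 0      # bytes seen in the current message, including the length
        self.length = 0
        self.msg_type = 0
        self.seconds = 0

    def push(self, byte: int):
        """Consume one byte; return seconds when a 'T' message completes, else None."""
        i = self.count
        if i == 0:
            self.length = byte << 8          # first length byte (high)
            self.msg_type = 0
            self.seconds = 0
        elif i == 1:
            self.length |= byte              # second length byte (low)
        elif i == 2:
            self.msg_type = byte             # 3rd element: message type
        elif 3 <= i <= 6:
            self.seconds = (self.seconds << 8) | byte   # 4th to 7th elements
        self.count += 1
        # "Message Done": the payload byte count has reached the message length
        if self.count >= 2 and self.count - 2 == self.length:
            done_type, done_seconds = self.msg_type, self.seconds
            self.count = 0
            return done_seconds if done_type == ord("T") else None
        return None


framer = TimestampFramer()
stream = bytes.fromhex("000554000058b7")   # length 5, type 'T', seconds 0x000058b7
for b in stream:
    result = framer.push(b)
    if result is not None:
        print("seconds:", result)
```

Messages of other types simply run the counter to the end and produce nothing, which is how the loop skips what it does not understand.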

After the end of the message is read, we have a case structure, which is similar to an if statement but for FPGAs, which will read the appropriate variables and send them back up to the host via a DMA-FIFO.  Now this DMA-FIFO can be configured to send data up to the host computer, or to another DMA-FIFO inside the FPGA.  For now we are going to send this up to the host for analysis.

Take a look at the right-half of the original FPGA vi screenshot.  This element executes once, and reads the Seconds variable and sends it up to the host.

In part 3, I will add another feature – Add Order with MPID, so we can now know in the FPGA when a new order is entered for a particular security, what side that order is on, and how many shares/price.  This is more meaningful information that can be used to trade the markets, especially during a Donald Trump speech!

Filter Market Data Messages in an FPGA – part 1

So I went to NASDAQ’s FTP site and downloaded the entire ITCH feed for November 9th, 2013.  The file was large – 319MB compressed, and you can download it yourself from here:

NASDAQ has a very simple document describing the specification here:

I skimmed over the specification to get an idea of how market data works.  What I basically understand is that at the start of the trading day, NASDAQ sends a list of all securities that will be available to trade that day, followed by a bunch of messages indicating changes to the prices being offered to buy or sell each security, as well as actual trades.

The basic format of an ITCH 4.1 Market Data message is the size of the message, followed by the data, where the first byte of the data is the message type.  So using this information we can easily decode an entire ITCH feed, paying attention only to messages that interest us.
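A quick way to see which message types a feed contains is to walk the length prefixes and tally the first byte of each message.  The sketch below assumes a two-byte big-endian length prefix by default; the listings in this post show a single length byte, so the width is left as a parameter.  The function name is made up for this example.

```python
from collections import Counter


def count_message_types(data: bytes, length_bytes: int = 2) -> Counter:
    """Tally message types in a length-prefixed ITCH byte string."""
    counts = Counter()
    offset = 0
    while offset + length_bytes <= len(data):
        length = int.from_bytes(data[offset:offset + length_bytes], "big")
        start = offset + length_bytes
        if length == 0 or start + length > len(data):
            break  # empty or truncated message: stop rather than guess
        counts[chr(data[start])] += 1   # first byte of the data is the type
        offset = start + length
    return counts


# One Timestamp message followed by one System Event message,
# written with single-byte lengths as in the listings in this post
sample = bytes.fromhex("0554000058b7" + "065311cd6cc94f")
print(count_message_types(sample, length_bytes=1))
```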

Timestamp Message

0x05 Length
0x54 Message Type ‘T’
0x00 Second – byte 1
0x00 Second – byte 2
0x58 Second – byte 3
0xb7 Second – byte 4

The ITCH standard says that all Integer fields are in Big-Endian format, so the timestamp in the message above is interpreted as 0x00 00 58 b7, or 22,711. See a nice online hex to dec converter here:

Now 22,711 is the number of seconds since midnight, here is another online tool to convert this to a normal time in hours, minutes, and seconds:

So 22,711 is 06:18:31.  That is pretty early in the morning, so it looks like this particular NASDAQ ITCH feed starts with pre-market trading.
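The same conversion can be done in a few lines of Python instead of the online tools.  This sketch takes the message without its length byte; the function name is my own.

```python
def decode_timestamp(payload: bytes) -> str:
    """Decode a Timestamp message ('T' + 4-byte big-endian seconds) to HH:MM:SS."""
    if len(payload) != 5 or payload[:1] != b"T":
        raise ValueError("not a Timestamp message")
    seconds = int.from_bytes(payload[1:5], "big")   # big-endian, as the standard says
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


print(decode_timestamp(bytes.fromhex("54000058b7")))  # 06:18:31
```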

I also used this online tool to convert ASCII to Hex:

The next message in this feed:

System Event Message

0x06 Length
0x53 Message Type ‘S’
0x11 Timestamp byte 1(nanoseconds since last Timestamp Seconds)
0xcd Timestamp byte 2
0x6c Timestamp byte 3
0xc9 Timestamp byte 4
0x4f Event Code

The possible event codes are:

  • Daily
    • ‘O’ – (0x4f) – Start of Messages
    • ‘S’ – (0x53) – Start of System Hours
    • ‘Q’ – (0x51) – Start of Market Hours
    • ‘M’ – (0x4d) – End of Market Hours
    • ‘C’ – (0x43) – End of Messages
  • As Needed – In the event of an emergency market condition
    • ‘A’ – (0x41) – Emergency Market Condition – Halt
    • ‘R’ – (0x52) – Emergency Market Condition – Quote Only Period
    • ‘B’ – (0x42) – Emergency Market Condition – Resumption
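Decoding the System Event message above follows the same pattern.  A small sketch, again without the length byte, with the event code table copied from the list above and a function name of my own choosing:

```python
EVENT_CODES = {
    "O": "Start of Messages",
    "S": "Start of System Hours",
    "Q": "Start of Market Hours",
    "M": "End of Market Hours",
    "C": "End of Messages",
    "A": "Emergency Market Condition - Halt",
    "R": "Emergency Market Condition - Quote Only Period",
    "B": "Emergency Market Condition - Resumption",
}


def decode_system_event(payload: bytes):
    """Return (nanoseconds, event code, description) for a System Event message."""
    if len(payload) != 6 or payload[:1] != b"S":
        raise ValueError("not a System Event message")
    nanoseconds = int.from_bytes(payload[1:5], "big")
    code = chr(payload[5])
    return nanoseconds, code, EVENT_CODES.get(code, "unknown")


print(decode_system_event(bytes.fromhex("5311cd6cc94f")))
```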

I included the “As Needed” Event Codes, because that is when things will get real bad, and you will probably want your FPGA to get ready to liquidate everything in your portfolio… More on that in the future, but for now we must stay focused on trading Trump and his tweets.

So 0x4f means start of messages.  Okay, continuing, I see a few more System Event messages as well as some Timestamp messages.  I skip these for now and come to the next message, which sounds interesting and is a Stock Directory Message.

Stock Directory Message

byte # length data description
0 1 0x14 (decimal: 20) Length
1 1 0x52 Message Type ‘R’
2-5 4 1d 48 bd c7 Timestamp Nanoseconds
6-13 8 41 20 20 20 20 20 20 20 Stock (0x20 is a space, 0x41 is A) So this is for Agilent
14 1 4e (‘N’) Market Category – simply the exchange:

    • N – NYSE
    • A – NYSE Amex
    • P – NYSE Arca
    • Q – NASDAQ Global Select Market
    • G – NASDAQ Global MarketSM
    • S – NASDAQ Capital Market
    • Z – BATS BZX Exchange
15 1 20 (space) Financial Status Indicator – Indicates when a firm is not in compliance with NASDAQ continued listing requirements.  This sounds like a way to find distressed stocks that are about to be delisted – lots of volatility with low volume.

  • D – Deficient
  • E – Delinquent
  • Q – Bankrupt
  • S – Suspended
  • G – Deficient and Bankrupt
  • H – Deficient and Delinquent
  • J – Delinquent and Bankrupt
  • K – Deficient, Delinquent and Bankrupt
  • Space – Company is in compliance
16-19 4 00 00 00 64 (decimal: 100) Round lot size
20 1 4e (N) Round lots only – indicates if NASDAQ only accepts orders in round lot size

  • Y – only round lots are accepted in this stock
  • N – odd/mixed lots are allowed
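The whole Stock Directory table above maps directly onto a fixed-layout unpack.  The sketch below uses Python's struct module with the field widths from the table (1, 4, 8, 1, 1, 4, 1 bytes after the length byte); the dictionary keys are names I picked for readability.

```python
import struct

# type, nanoseconds, stock, market category, financial status, round lot size, round lots only
STOCK_DIRECTORY = struct.Struct(">cI8sccIc")   # big-endian, 20 bytes


def decode_stock_directory(payload: bytes) -> dict:
    (msg_type, nanoseconds, stock, market,
     status, round_lot, round_lots_only) = STOCK_DIRECTORY.unpack(payload)
    if msg_type != b"R":
        raise ValueError("not a Stock Directory message")
    return {
        "nanoseconds": nanoseconds,
        "stock": stock.decode("ascii").rstrip(),       # symbols are space-padded
        "market_category": market.decode("ascii"),
        "financial_status": status.decode("ascii"),    # a space means in compliance
        "round_lot_size": round_lot,
        "round_lots_only": round_lots_only == b"Y",
    }


sample = bytes.fromhex("52" "1d48bdc7" "4120202020202020" "4e" "20" "00000064" "4e")
print(decode_stock_directory(sample))
```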

Okay, so we can now decode a couple of message types.  But this is not important and is only distracting us from the goal.  We can decode all of these messages inside a MicroBlaze Soft Core Processor running embedded C++ inside the FPGA, and then send a message to the rest of the FPGA telling it how to deal with the rest.  So let’s keep scanning this Market Data file and find a message that helps us with trading.

I encounter a message of type “Reg SHO Short Sale Price Test Restricted Indicator”.  I did some googling and found that Reg SHO is yet another loophole-filled attempt by regulators to prevent naked shorting.  My favorite sentence from this Investopedia page is:

“a broker has reasonable belief that the equity to be short sold can be borrowed and delivered to a short seller on a specific date before short selling can occur”

Anyway, let’s keep scanning this file.  I find a message of type ‘H’ – Stock Trading Action Message, this sounds pretty good…

Stock Trading Action Message

byte # length data description
0 1 0x13 (decimal: 19) Length
1 1 0x48 Message Type ‘H’
2-5 4 1d 50 bd 0d Timestamp Nanoseconds
6-13 8 41 42 2d 20 20 20 20 20 Stock Symbol: AB-
14 1 0x54 (T) Trading State

  • H – Halted across all U.S. equity markets / SROs
  • P – Paused across all U.S. equity markets / SROs (NASDAQ-listed securities only)
  • Q – Quotation only period for cross-SRO halt or pause
  • T – Trading on NASDAQ
15 1 20 (space) Reserved
16-19 4 20 20 20 20 Trading Action Reason – I guess all spaces means “no reason” or “nothing to worry about”
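
The table above maps directly onto a fixed binary layout, so decoding it is a one-liner with a struct format. Here is a minimal sketch in Python, assuming big-endian fields exactly as listed in the table; the helper name `parse_trading_action` and the dictionary keys are made up for illustration and are not from the ITCH specification.

```python
import struct

# Field layout copied from the table above (big-endian, no padding):
# length, message type, timestamp ns, 8-byte symbol, trading state,
# reserved byte, 4-byte reason.
TRADING_ACTION = struct.Struct(">B c I 8s c c 4s")

TRADING_STATES = {
    "H": "Halted",
    "P": "Paused",
    "Q": "Quotation only",
    "T": "Trading",
}


def parse_trading_action(raw: bytes) -> dict:
    """Decode one 20-byte Stock Trading Action message (hypothetical helper)."""
    length, msg_type, ns, stock, state, _reserved, reason = TRADING_ACTION.unpack(raw)
    if msg_type != b"H":
        raise ValueError("not a Stock Trading Action message")
    return {
        "length": length,
        "timestamp_ns": ns,
        "stock": stock.decode("ascii").rstrip(),
        "state": TRADING_STATES.get(state.decode("ascii"), "Unknown"),
        "reason": reason.decode("ascii").strip(),
    }


# The sample bytes from the table above.
sample = bytes.fromhex("13 48 1d50bd0d 41422d2020202020 54 20 20202020")
print(parse_trading_action(sample))
```

The symbol and reason fields are space-padded, so the sketch strips the padding before returning them.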

Okay, so it looks like the Trading Action Message just tells us when trading is officially open…etc.  Nothing of use here for the FPGA plan.  Anyway, on to the next message type that I find.

I continue running my analysis script and I find the following:

  • Add Order with MPID
  • Add Order Message
  • Order Delete Message

This time I will only analyze a message if it appears to have something of use.  For “Add Order with MPID”, the Message Type is ‘F’, and it looks like this message will be useful to us.  Additionally, MPID stands for “Market Participant Identifier” and appears to simply identify your broker.  See:

So, here is the analysis:

Add Order with MPID Attribution

byte # length data description
0 1 0x22 (decimal: 34) Length
1 1 0x46 Message Type ‘F’
2-5 4 02 43 d1 46 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 16 b2 Order number – unique and is assigned during order entry
14 1 0x42 Type of order

  • ‘B’ – (0x42) – Buy
  • ‘S’ – (0x53) – Sell
15-18 4 00 00 00 64 (decimal: 100) Number of shares
19-26 8 5a 56 5a 5a 54 20 20 20 The stock symbol: ZVZZT (0x5a is ‘Z’, 0x56 is ‘V’, 0x54 is ‘T’, 0x20 is a space)

Funny… this looks like a test symbol, take a look at the Bloomberg quote:

27-30 4 00 02 97 ac Display price, converts to: $16.99
31-34 4  4c 45 48 4d NASDAQ MPID: LEHM (Is that Lehman Brothers?)
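
As a rough sketch of how this 35-byte message could be decoded in software, here is a Python struct layout that follows the table above field by field. The helper name `parse_add_order_mpid` and the dictionary keys are hypothetical; the byte values are the ones from the table.

```python
import struct

# Big-endian layout from the table above: length, type, timestamp ns,
# order number, buy/sell, shares, 8-byte symbol, price, 4-byte MPID.
ADD_ORDER_MPID = struct.Struct(">B c I Q c I 8s I 4s")


def parse_add_order_mpid(raw: bytes) -> dict:
    """Decode one Add Order with MPID Attribution message (hypothetical helper)."""
    (length, msg_type, ns, order_ref, side,
     shares, stock, price_raw, mpid) = ADD_ORDER_MPID.unpack(raw)
    if msg_type != b"F":
        raise ValueError("not an Add Order with MPID message")
    return {
        "length": length,
        "timestamp_ns": ns,
        "order_ref": order_ref,
        "side": side.decode("ascii"),          # 'B' = buy, 'S' = sell
        "shares": shares,
        "stock": stock.decode("ascii").rstrip(),
        "price_raw": price_raw,                # fixed point, 4 decimal places
        "mpid": mpid.decode("ascii"),
    }


# The sample bytes from the table above.
sample = bytes.fromhex(
    "22 46 0243d146 00000000000016b2 42 00000064 "
    "5a565a5a54202020 000297ac 4c45484d"
)
print(parse_add_order_mpid(sample))
```

Dropping the trailing `4s` from the format and checking for type ‘A’ instead of ‘F’ would give the layout of the message without the MPID field.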

So I am looking at the NASDAQ ITCH 4.1 specification (as you have obviously realized by now) and I see that the message immediately before “Add Order with MPID Attribution” is “Add Order – No MPID Attribution”, which is exactly the same as the message above, but without the last field.  The only other difference is that the message type is ‘A’.

A word about Display Price

So the specification says that you take all integer price values and treat them as fixed-point numbers, with the first 6 places representing the integer portion and the remaining 4 representing decimal digits, and this leads to a maximum price of 200,000.0000.

So in the example message above:

000297ac is equal to 169900, which is 16.9900, or simply $16.99.
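
The conversion above needs no floating point at all, which is handy if it later has to run inside an FPGA or a soft-core processor. A minimal sketch in Python, where `price_to_str` is a made-up helper name:

```python
PRICE_SCALE = 10_000  # four implied decimal places


def price_to_str(raw: int) -> str:
    """Format a raw fixed-point price using integer arithmetic only."""
    dollars, fraction = divmod(raw, PRICE_SCALE)
    return f"{dollars}.{fraction:04d}"


print(price_to_str(0x000297AC))
```

Keeping the raw integer around for comparisons and only formatting it for display avoids any rounding surprises.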


Okay, so we know when the market opens and when trading is started, stopped, etc.  We know when a new order is added to the NASDAQ order book.  What is next? Order Delete!

Order Delete is different from Order Cancel.  Order Delete means that the entire order is removed from the Order Book, while Order Cancel means that only a portion of an order is cancelled.

Order Delete Message

byte # length data description
0 1 0x0D (decimal: 13) Length
1 1 0x44 Message Type ‘D’
2-5 4 21 95 1b cf Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 31 6b Order reference number

Order Cancel Message

byte # length data description
0 1 0x11 (decimal: 17) Length
1 1 0x58 Message Type ‘X’
2-5 4 21 dd db e1 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 25 42 b7 Order reference number
14-17 4 00 00 00 0a Number of shares being removed from order.  Hmmm… is someone backing out of a position? Scared!!! that means sell! Eh, maybe not, they only removed 10 shares.  But what if they are trading a high-priced stock like Amazon or Google? Oops, I meant Alphabet… and what about Berkshire Hathaway?
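
The difference between Order Delete and Order Cancel is easiest to see in how each one would update a book of open orders. Here is a toy sketch in Python that tracks only the remaining share count per order reference number; the class and the order numbers in the usage lines are invented for illustration.

```python
class OrderBook:
    """Toy book: maps order reference number to remaining shares."""

    def __init__(self):
        self.shares_by_ref = {}

    def add(self, ref: int, shares: int) -> None:
        # Add Order ('A' or 'F'): a new order enters the book.
        self.shares_by_ref[ref] = shares

    def delete(self, ref: int) -> None:
        # Order Delete ('D'): the whole order leaves the book.
        self.shares_by_ref.pop(ref, None)

    def cancel(self, ref: int, shares: int) -> None:
        # Order Cancel ('X'): only part of the order is removed.
        remaining = self.shares_by_ref.get(ref, 0) - shares
        if remaining > 0:
            self.shares_by_ref[ref] = remaining
        else:
            self.shares_by_ref.pop(ref, None)


book = OrderBook()
book.add(1, 100)      # hypothetical order numbers
book.add(2, 100)
book.cancel(1, 10)    # order 1 shrinks to 90 shares
book.delete(2)        # order 2 disappears entirely
print(book.shares_by_ref)
```

A real book would also track side, symbol and price per order; this sketch leaves those out to keep the two operations in focus.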

Okay, so we have Add Order, Remove Order, Shorten Order.  What is left? Order Execute and Order Modify.  I will go with Order Executed.

Order Executed Message

byte # length data description
0 1 0x19 (decimal: 25) Length
1 1 0x45 Message Type ‘E’
2-5 4 18 45 cd 07 Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 e9 ca Order reference number
14-17 4 00 00 03 e8 Number of shares executed
18-25 8 00 00 00 00 00 00 00 01 NASDAQ generated day-unique match number.

Order Executed with Price Message

byte # length data description
0 1 0x1e (decimal: 30) Length
1 1 0x43 Message Type ‘C’
2-5 4 01 74 a8 ca Nanoseconds portion of timestamp
6-13 8 00 00 00 00 00 00 31 9b Order reference number
14-17 4 00 00 00 64 (decimal: 100) Number of shares executed.
18-25 8 00 00 00 00 00 00 03 b0 NASDAQ generated day-unique match number.
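26 1 Printable – indicates if the execution should be reflected on time and sale displays and volume calculations

  • ‘Y’ – (0x59) – Printable
  • ‘N’ – (0x4e) – Non-printable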
27-30 4 00 02 9c 5c Execution Price: 17.1100, or $17.11

I think that I have enough information to get started with the FPGA portion of this.  So I will skip analyzing the Order Replace Message for now.  So….

We have analyzed the following message types:

  • Timestamp Message
  • System Event Message
  • Stock Directory Message
  • Stock Trading Action Message
  • Add Order – No MPID Attribution (well, kind of)
  • Add Order with MPID Attribution
  • Order Delete Message
  • Order Cancel Message
  • Order Executed Message
  • Order Executed with Price Message
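
The analysis script itself is not shown in this post, so here is only a minimal sketch of how such a scan could work in Python. It assumes what the tables above show: every message starts with a one-byte length, followed by that many bytes, the first of which is the message type. The function names are made up, and the name table only lists the type codes that appear in the tables above.

```python
from collections import Counter

# Only the type codes spelled out in the tables above.
MESSAGE_NAMES = {
    "H": "Stock Trading Action",
    "A": "Add Order - No MPID Attribution",
    "F": "Add Order with MPID Attribution",
    "D": "Order Delete",
    "X": "Order Cancel",
    "E": "Order Executed",
    "C": "Order Executed with Price",
}


def iter_messages(buf: bytes):
    """Yield each message body (type byte onwards), assuming a 1-byte length prefix."""
    pos = 0
    while pos < len(buf):
        length = buf[pos]
        body = buf[pos + 1 : pos + 1 + length]
        if length == 0 or len(body) < length:
            break  # empty or truncated message: stop scanning
        yield body
        pos += 1 + length


def count_types(buf: bytes) -> Counter:
    """Count how many messages of each type appear in the buffer."""
    return Counter(chr(body[0]) for body in iter_messages(buf))


# Two sample messages from the tables above, back to back.
stream = bytes.fromhex(
    "13 48 1d50bd0d 41422d2020202020 54 20 20202020"
    "22 46 0243d146 00000000000016b2 42 00000064"
    "5a565a5a54202020 000297ac 4c45484d"
)
for code, n in count_types(stream).items():
    print(code, MESSAGE_NAMES.get(code, "unknown"), n)
```

Because the scan only needs the length byte to jump to the next message, unknown message types can be counted and skipped without decoding them.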

What is missing? I am looking at the ITCH 4.1 specification and I see the following message types:

  • Order Replace
  • Trade Message (Non-Cross)
  • Cross Trade Message
  • Broken Trade / Order Execution Message
  • Net Order Imbalance Indicator (NOII) Message
  • Retail Price Improvement Indicator (RPII)

So I can go ahead and analyze the rest of the messages, but I want to get to some coding.  Coding as in drawing some LabVIEW program (no, not writing, but drawing).  Anyway, I will take a look at how to write (aka draw) some LabVIEW for FPGA code that can handle the parsing and interpreting of the necessary information to react or trade to a Trump Event.  Stay tuned.

From Trump to Profit

I am sure that every single trader gets nervous whenever Trump speaks.  If he says “wall” they have to go short the Mexican Peso, if he talks about Obamacare or the “Unaffordable Care Act”, they have to short Healthcare stocks.  During his first news conference on Wednesday, January 11th, 2017, he went a step further and said he wants bidding and competition for Government purchases of Pharmaceutical drugs.  So not only did the Mexican Peso suffer, but so did Pharmaceutical companies such as GlaxoSmithKline, Merck and Pfizer, among others!

His news conference was from 11:00 am to 12:15 pm.

Here is a chart of the Mexican Peso during this time:

USDMXN – 1/11/2017 from 11:00am to 12:15pm

and a chart showing Pfizer:

PFE US Equity – 1/11/2017 from 11:00am to 12:15pm

and GlaxoSmithKline:

GSK US Equity – 1/11/2017 from 11:00am to 12:15pm

See the CNN timeline of Trump’s first news conference here:

Now how do you use your custom trading algorithm to make money during such an event as a Trump News Conference?  Simple, use FPGAs!

You have your algorithm, which may be as simple as “If PFE drops by x percent in y seconds” then “sell z amount of PFE”.  Then after PFE drops a further amount, you may choose to buy it all back.  The algorithm is up to you of course, and my job is to tell you how to use an FPGA to speed it up, because the first to sell at the top make the most profit.  Let’s take a closer look at the chart for GlaxoSmithKline:

GSK US Equity – 1/11/2017 from 11:22am to 11:25am – right before Trump said that he wants bidding on Pharmaceuticals

The major premise is that at the left of this chart trading volume is very high for GlaxoSmithKline, for other Pharmaceuticals and for the markets in general, and that the few trading firms that are able to send sell orders first will make the most money, with those that follow making less and less at an accelerated pace.  Here is a chart of the volume for GSK during the entire trading day:

GSK US Equity – Volume and Price Jan 11, 2017 11:21-11:26am

On January 11th, 2017, the average trading volume for the entire day for GSK was 13,003 shares per minute, and here is a table of the volume for the chart above:

Minute Volume Price
11:20 8,145 39.1599
11:21 7,266 39.16
11:22 147,329 38.97
11:23 64,563 38.95
11:24 60,768 38.821
11:25 38,823 38.8554
11:26 23,603 38.8368
11:27 34,205 38.8001

Look at the spike in volume at 11:22, and look at the price difference!
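
To put numbers on that spike, here is a small Python sketch that compares each minute in the table above against the day's average of 13,003 shares per minute. The 3x threshold is an arbitrary value picked for illustration, not something from a real trading rule.

```python
AVERAGE_PER_MINUTE = 13_003   # day's average for GSK, from the text above
SPIKE_THRESHOLD = 3.0         # hypothetical: flag minutes above 3x average

# (minute, volume, price) rows copied from the table above.
rows = [
    ("11:20", 8_145, 39.1599),
    ("11:21", 7_266, 39.16),
    ("11:22", 147_329, 38.97),
    ("11:23", 64_563, 38.95),
    ("11:24", 60_768, 38.821),
    ("11:25", 38_823, 38.8554),
    ("11:26", 23_603, 38.8368),
    ("11:27", 34_205, 38.8001),
]

# Minutes whose volume exceeds the threshold multiple of the average.
spikes = [
    (minute, volume / AVERAGE_PER_MINUTE)
    for minute, volume, _price in rows
    if volume > SPIKE_THRESHOLD * AVERAGE_PER_MINUTE
]

# Percentage price change from 11:21 to 11:22.
prev_price, spike_price = rows[1][2], rows[2][2]
change_pct = (spike_price - prev_price) / prev_price * 100.0

for minute, ratio in spikes:
    print(f"{minute}: {ratio:.1f}x average volume")
print(f"11:21 -> 11:22 price change: {change_pct:.2f}%")
```

With these figures the 11:22 minute comes out at a bit over 11 times the average volume, while the price moves by roughly half a percent.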

So a trader or algorithm looking to exploit this opportunity faces some challenges.

  • More Market Data for security of interest
    • Your algorithm is now being bombarded with thousands, if not hundreds of thousands of messages about the price of the security which you want to trade.
  • More Market Data in General
    • Your market data feeds are being overloaded and the time it takes to process each message increases.
    • The time that it takes to process a message never grows linearly with the number of messages coming in.
  • Slower response time from your algo
    • Your algo is slower at generating messages because your system is overloaded.

An FPGA can help in all 3 areas.

1 – Filter out all securities and keep only those of interest

2 – Do the analysis portion to detect whether or not to sell GSK in the FPGA

3 – Generate the orders directly in the FPGA
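
The three steps above can be modelled in software before any of it is drawn in LabVIEW for FPGA. The following Python sketch is only a behavioural model under stated assumptions: the watch list, the class name `DropTrigger`, the order dictionary and the prices fed in are all hypothetical, and the rule is the simple “drops by x percent in y seconds, sell z” example from earlier.

```python
from collections import deque
from dataclasses import dataclass, field

WATCHLIST = {"GSK"}  # hypothetical list of securities of interest


@dataclass
class DropTrigger:
    drop_pct: float   # x: percentage drop that fires the trigger
    window_s: float   # y: look-back window in seconds
    sell_qty: int     # z: number of shares to sell
    prices: deque = field(default_factory=deque, init=False)

    def update(self, t: float, price: float) -> bool:
        """Record a price and report whether it fell drop_pct from the window peak."""
        self.prices.append((t, price))
        # Forget prices older than the look-back window.
        while t - self.prices[0][0] > self.window_s:
            self.prices.popleft()
        peak = max(p for _, p in self.prices)
        return (peak - price) / peak * 100.0 >= self.drop_pct


def on_market_data(symbol: str, t: float, price: float, trigger: DropTrigger):
    # Step 1: filter out everything that is not on the watch list.
    if symbol not in WATCHLIST:
        return None
    # Step 2: analysis, here the simple drop rule.
    if not trigger.update(t, price):
        return None
    # Step 3: emit the parameters an order generator would need.
    return {"side": "S", "stock": symbol, "shares": trigger.sell_qty, "price": price}


trigger = DropTrigger(drop_pct=0.4, window_s=120.0, sell_qty=100)
print(on_market_data("PFE", 0.0, 33.0, trigger))     # filtered out (made-up price)
print(on_market_data("GSK", 0.0, 39.16, trigger))    # no drop yet
print(on_market_data("GSK", 60.0, 38.97, trigger))   # drop detected
```

In hardware each step would be its own pipeline stage, so the filter, the rule and the order generator run concurrently instead of one after another.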


The advantage of an FPGA, especially in a Network Card configuration, is that all of the above happens inside hardware, completely outside of your host computer.  So imagine that your host computer gets Market Data only for a list of securities instead of all Market Data for all securities.  Your host no longer needs to filter market data before sending the information to your algo.

Now the FPGA can also pre-process the data and send it in an easier format to the host computer.  Imagine your host computer no longer has to perform the simple calculations that it uses to determine whether or not to trade, because that has been done inside the FPGA.

And finally, why generate and send orders to buy or sell a security inside the host computer when the FPGA can do that for you as well? Instead of sending an entire order down your operating system stack, just send a bunch of parameters to the FPGA and let it handle creating the appropriate messages, calculating CRCs and checksums, and putting them on the wire.
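
For a feel of what “calculating CRCs and checksums” involves, here is a small Python reference model of the kind an FPGA implementation could be checked against. It shows the 16-bit ones' complement Internet checksum (RFC 1071) used in IP, UDP and TCP headers, plus the standard CRC-32 from Python's `zlib` module, which uses the same polynomial as the Ethernet frame check sequence. The function name `internet_checksum` is made up, and the sample header is just an illustrative IPv4 header with its checksum field set to zero.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"              # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Illustrative IPv4 header with the checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
checksum = internet_checksum(header)
print(f"IP header checksum: 0x{checksum:04x}")

# Standard CRC-32 over a well-known test string.
print(f"CRC-32: 0x{zlib.crc32(b'123456789'):08x}")
```

Both calculations are simple adds, shifts and XORs over a byte stream, which is why they fit so naturally into FPGA logic that processes the packet as it goes out on the wire.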