Update

So it appears to me that Monero/CryptoNote mining has gained a lot of popularity lately, and this has led many people to this website.  Let me note that LabVIEW FPGA is a proprietary tool that comes with a 30-day Evaluation.  After that, you have to spin up a new virtual machine and reinstall LabVIEW to keep your installation alive.  However, when it comes to FPGA development, LabVIEW is the best tool out there.

Sure, it may not support the latest board from Xilinx or Digilent or whoever, but what it does support it supports well.

Take the PXIe-6592 board.  I was able to take this board and implement stage 3 of the CryptoNight algorithm while I was on a “diversion” from my normal FPGA development side projects.

Now I am hearing that Monero is profitable, but only via an FPGA.  Well, let me go back to my source code and see if I can provide some benchmarks.

Network Connectivity Established via MicroBlaze and PXIe-6592

So after some serious debugging, editing, and regenerating of the bitstream, I was able to send out an ARP response from the FPGA to my Linux server, and to have my Linux server send a UDP packet in to the MicroBlaze.  This was all verified via UART debug statements.

Now while I work on cleaning this all up, you can actually use this code in your project, but only if you have enough knowledge of LabVIEW and Xilinx.  My job is to help bridge that gap, but for now:

The source code is located in:

https://github.com/fpganow/MicroBlaze_lwIP

The application that is to be compiled and to run on the MicroBlaze is called ‘mb_lwip’ and its source code is here:

https://github.com/fpganow/MicroBlaze_lwIP/tree/master/xilinx_mb_lwip/mb_lwip

To make it work, you will need to add a reference to the lwip directory, which is currently located at:

https://github.com/fpganow/lwip/tree/master/lwip

Note that the ‘fpganow/lwip’ repository is currently referenced as a sub-module of ‘fpganow/MicroBlaze_lwIP’.  I will move it of course… but first I will try to re-create this project, and while I re-create it I will move and clean things up.

Now on to the LabVIEW part.

https://github.com/fpganow/MicroBlaze_lwIP

You want to open the ‘Tests/FPGANic/FPGANic-Tester.vi’ program as your entry.  Start the vi, wait for the FPGA bitfile to be downloaded and then click on the ‘polling’ button to start receiving debug information about incoming and outgoing packets.

LabVIEW FPGA, MicroBlaze, and UART – Full Guide

Working from scratch, I created a LabVIEW FPGA project that imports a MicroBlaze design that communicates with LabVIEW via a UART, and has the ability to change the elf file in a much shorter time frame than before.

I did this by adding the MicroBlaze to the project after it had been exported to Vivado, and not from within the imported CLIP as before.  The only bad news is that I have to synthesize the FPGA project from Vivado, which currently is not connected to the NI FPGA Compile Cloud.  This may be a feature that is coming soon, but it will only come if users start using the Project Export to Vivado feature in the first place.  So please write me any comments if anything is confusing or hard to follow below!

YouTube Video Demonstration

Source Code

See my github repository here:

https://github.com/JohnStratoudakis/MicroBlaze_UART

Part 1 – Create and Export a MicroBlaze Design

We are creating a MicroBlaze design, setting all of our processor options, including adding an instance of the UARTlite IP core, and exporting this Block Design to a tcl script that we will later import in to our LabVIEW FPGA generated Vivado Project.  We will not export the hardware or create any elf files in this part.

  • Create a new project from within Vivado 2015.4 (this is important, it will likely not work from other versions of Vivado)
  • The first step is not that important, you can just click next
  • I selected the project location to be on my E drive, “E:/git/MicroBlaze_UART/xilinx_mb”, and the name of the project to be “mb_uart”.
  • This is an RTL project type, and we do not want to specify any sources at this time.
  • The PXIe-6592R board contains a Kintex-7 FPGA chip with the following parameters:
    • Part #: xc7k410tffg900-2
    • Family: Kintex-7
    • Package: ffg900
    • Speed Grade: -2
  • New Project Summary, just click finish
  • What an empty project looks like:
  • Now create a block design
  • I have been naming my block designs with a “d_” (d underscore) prefix, so this one is called “d_microblaze”
  • Now look at the empty Block Design
  • Click on the “Add IP” button
  • Start typing in MicroBlaze, and make sure you select “MicroBlaze” and not “MicroBlaze MCS”.  The MicroBlaze MCS is a stripped-down version of the MicroBlaze which is very easy to use, but hard to bring in to LabVIEW.  Well, it is not hard to bring in to LabVIEW, I just have not figured out how to bring it in and for the UART to work!
  • Look at the MicroBlaze IP.  I hear that in older versions of the Xilinx Tools – namely ISE – there was no such picture, but instead you were given a list of signals and ports…
  • Now click on “Run Block Automation”, this will bring up a wizard where you can set a bunch of parameters, such as how much memory should be used and what peripherals it can support
  • This is what the defaults look like:
  • I set the following parameters:
    • Local Memory: 128KB
    • Cache Configuration: 64KB
    • Debug Module: None (Can’t debug from LabVIEW at the moment)
  • Here is what it looks like after block automation.  Notice the local memory block, the Processor System Reset icon and the Clocking Wizard
  • I want to remove the reset from the Clocking Wizard and to convert the input clock to a single-ended clock from a differential clock.  A differential clock just means that there are 2 clock signals and they always have to be opposites of each other.
  • Here I switch the clock from “Differential Clock” to “Single-Ended Clock”
  • Here I get rid of the Reset signal, notice how the Reset Type gets grayed out automatically.
  • Now I add the “AXI Uartlite” IP.  There is another UART IP that is available, but I have arbitrarily chosen to learn by using this one.
  • Now I want to customize the Uartlite IP, so just as before with the Clocking Wizard, I right-click (away from any terminals) and select “Customize Block”.
  • I set the Baud Rate to 128,000, I leave the number of data bits at 8, and I set even parity.  Note that I chose to add a parity bit because I want my UART connection to be more exact and to receive less (if any) garbled text.
  • Now for the fun part… Click on “Run Connection Automation” and watch as Vivado wires up all of our IP and components together!
  • All of the default options are fine, but I have included screen shots so that you can see all the details yourself:
  • Now after this completes, the block design will look pretty messy, so click on the “Refresh” looking icon below to regenerate the layout:
  • Here is a cleaned up version of the Block Design.  Notice that the “Run Block Automation” option is still there.  Nothing has gone wrong, this text is there because we now have to wire up the Data and Instruction Caches.
  • Again, the default options are fine
  • And finally… our block design is ready.
  • Now we will generate an HDL Wrapper file.  This is not required for the Block Design, but it will help us with the importing of this design in to LabVIEW later.
  • Now we will click on “Export->Export Block Design”, this will generate a tcl script that we can run or “source” from another Vivado project and this block design will be regenerated for us.
  • Note the location of the wrapper VHDL.  Copy this file to your clipboard
  • Place it in to the root directory of your project, a location that you will commit to source control.
  • The Tcl file should also be in the same directory

Part 2 – (Optional) How to Recreate MicroBlaze Design from Source TCL Script

So Vivado is not like other development environments, where you create your gitignore file and commit the rest to source control.  In Vivado, you generate a TCL script that will re-generate your entire project – or in our case – a specific Block Design.  This script will also import any other files such as VHDL files or constraints files that are required.  In our case we have a very simple design that does not require any such helper files.

  • Click on Create New Project:
  • Same as before, first couple of steps just click “Next”
  • This time I will call the project “mb_uart.imp”, to differentiate it from the project that I created.
  • Again, RTL project, and do not specify any sources at this time
  • Same part as before.  Not sure if you can import a block design to other FPGAs, perhaps if they have the same family or series, but I have not tried this out yet.
  • Project Summary
  • Click on “Tcl Console”
  • Change to the directory where the Tcl export script is located
  • Type dir if you like.  Notice how commands that Tcl does not recognize are sent to the underlying OS for execution
  • And finally, “source” the tcl script
  • And here is the imported Block Design
  • And that’s it! Create an HDL Wrapper if you like

Part 3 – Bring Design in to LabVIEW FPGA

Now we have to create a CLIP (Component Level IP) Node in LabVIEW FPGA that will import this MicroBlaze Block Design.  A CLIP node contains a top-level vhdl wrapper that usually instantiates the IP that we want to bring in to LabVIEW FPGA, but in this case I am creating a CLIP node that contains an empty wrapper for the MicroBlaze Block Design.

  • Launch LabVIEW 2017 (32-bit) from the start menu
  • Here is the screen that appears after you start LabVIEW
  • Click on “Create Project”, Blank Project should be fine.
  • Right-click on the “My Computer” icon and select “New->Targets and Devices…”
  • Select “New target or device” and select the PXIe-6592R FPGA board
  • Here is what the project looks like after adding the FPGA device/target:
  • Create a FIFO for communicating from the Host to the FPGA Target, aka “Host to Target – DMA” by clicking New->FIFO:
  • I follow a naming standard that ALE System Integration follows which is to prepend “HT” or “TH” to the name of each FIFO, where HT stands for Host to Target, and TH stands for Target to Host, so I name this FIFO “HT-UART_TX”:
  • I also set the Data Type of this FIFO to U8, because it will be used to receive characters
  • The same thing for the Receive FIFO, “TH” for Target to Host, and RX for Receive.
  • Data type, again is U8.
  • Now do you remember where the microblaze wrapper vhdl was located? Find it and copy it to the root of the LabVIEW project
  • The LabVIEW project is located in the MicroBlaze_UART/labview_fpga_uart directory:
  • Rename the file by prepending “UserRTL_” to the name of the file.
  • Edit the file, here is what it looks like before: (Sorry for the screen shot, I will provide source code links soon):
  • Change the name of the entity to match the file right now
  • Now we will create a CLIP, right-click on the FPGA target and select “Properties”
  • Add the VHDL file that we edited before – “UserRTL_d_microblaze_wrapper.vhd”
  • Then click Next to go to step 2 of 8.  Here select the “Limited to families selected below” option:
  • Then click next to go to step 3, here you have to click on “Check Syntax” and a Xilinx application is run to check the syntax
  • Now for step 4, click on the “clock_rtl” signal and make sure it is set to “Signal type” of clock.  Don’t worry about the reset signal, we want to control this ourselves.
  • For step 5, nothing is required, so skip and go to step 6
  • Same thing for Step 6, just click next
  • For step 7, just make sure that all signals are “Allowed” inside a Single-Cycle Timed Loop.  You can probably also require this, but we will not be doing that today.
  • Step 8 – Click Finish. 
  • We are finished, now a CLIP is available for us to use in our FPGA design, but it has not been instantiated.  I believe you can have multiple instances of the same CLIP.
  • Before we go any further, we have to add a clock for the MicroBlaze.  Our design will run at 100 MHz, so right-click on the 40 MHz clock and click on “New FPGA Derived Clock”
  • Type 100 in to the “Desired Derived Frequency” box, and everything else should automatically update.  Then click OK
  • Now we also have to set our top-level clock to be 100MHz.  So right-click on the FPGA Target and select properties, and in the following dialog select “Top-Level Clock” and select the new clock.
  • Now we will add an instance of the CLIP.  Right-click on the FPGA Target and select “New->Component-Level IP”
  • Select the UserRTL_d_microblaze_wrapper
  • And on the “Clock Selection” page, select the 100MHz clock.
  • Now we will add some existing vi’s
  • And here is what our final project looks like:

Part 4 – Create LabVIEW Host Wrapper

LabVIEW Host applications run as native Windows processes

  • Now we will add the Host LabVIEW application vis.  A LabVIEW Host application is a native Windows executable and thanks to a bunch of libraries written by National Instruments will handle all of our communications with the FPGA.
  • Add the existing files
  • Here is what the project looks like with all of the Host VIs.

Part 5 – Export Project to Vivado

LabVIEW has introduced a new feature, the ability to export an entire FPGA design to Vivado, which allows you to import any existing IP.  All you have to do is define one or more CLIP IPs that define the interface with your design.

  • Right click on the Build Specification inside the FPGA Target and select “New->Project Export for Vivado”
  • Give the build specification a name, I recommend keeping it short and sweet.  So change it from “My Project Export for Vivado Design Suite” to something like “FpgaUart”
  • I always set “Auto increment” to true.
  • Then go to the “Source Files” tab and select the “Fpga-Uart-Exercisor.vi” to be the top level
  • Now there is a new Build Specification.  Click on Build
  • After it completes, you can launch Vivado by clicking on the button below.
  • Here is what the Vivado project looks like.  All of the IP files are encrypted, except for the files we added for our CLIP.  Since we added only one file, that is all we will see.
  • Now we will import our Block Design.  Locate the “Tcl Console” in the bottom window.
  • This console supports Tcl commands as well as regular operating system commands.  Since we are on Windows, we would like to change our current working directory to be where our d_microblaze.tcl file is located.  Remember, Tcl (and therefore the Vivado console) uses the backslash ‘\’ as its escape character, so you will have to enter the backslash twice each time you would like to use it.
  • Now we have to source our file by issuing the “source d_microblaze.tcl” command:
  • This will take a couple of seconds, depending on the speed of your computer.
  • After it finishes importing/re-creating the design, this is what you should see:
  • Now we go back to our “UserRTL_d_microblaze_wrapper.vhd” vhdl wrapper file and remove the comments enabling the code that uses the MicroBlaze.
  • Now you will see that this new design appears under the VHDL wrapper file
  • Now we have to export this entire design to Xilinx SDK so we can generate an executable to run.  Click “File->Export->Export Hardware”
  • We have not run “Generate Output Products” for the MicroBlaze, so we will be prompted to do so.  Make sure you click “Generate Output Products”.  If you are more experienced than me in Vivado, perhaps you know if this step is required.
  • The default directory is fine.
  • The directory will be named “FpgaUart.sdk”

Part 6 – Building a C Executable for Running on the MicroBlaze Soft-Core Processor

Run the executable by referencing the lvbitx file

  • Now open Xilinx SDK and select the “FpgaUart.sdk” directory as the workspace
  • Create a new “Hardware Platform Specification” project by clicking on “File->New->Other”
  • Select “Xilinx->Hardware Platform Specification”
  • Select the only file in the sdk directory, the .hdf file.
  • The Project name will be automatically populated
  • Here is what everything looks like after creating the Hardware Platform Specification
  • Now create an Application Project by selecting “File->New->Application Project”
  • Name the project “mb_uart_1” and click next.
  • Select “Empty Application”
  • Now we will add a new source file
  • We will call it main.c
  • Here is the application with an empty main.c file.  Note that there are errors listed in the bottom window because this cannot build without a main function!
  • After I cut and paste the source code for a simple UART application, the errors go away.
  • Now I am creating a second project with the same name but with the number 2 instead.
  • I am going to use the same Board Support Package, or “BSP”, as before, and I am creating another Empty Application.
  • I add a new main.c as before.
  • I paste the same source code, but this time I replace all instances of “1.0” with “2.0”
  • I set the active configuration to Release for both projects.  This isn’t really necessary for a small design such as this one, but it is a good habit to have.
  • Now we go back to Vivado and click “Tools->Associate ELF Files…”
  • We do not have to select an elf file for simulation, but you can if you wish to create a test bench for this project.  I have done this in the past and it takes me about 3 hours to simulate 100 milliseconds.  With the method described on this page, it becomes much easier to just swap out the elf file and to regenerate the bitstream.
  • We select the first elf file – “mb_uart_1.elf”
  • And finally, we click on “Generate Bitstream”
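The main.c that gets pasted into the two SDK projects above is not reproduced on this page, so the following is only a rough sketch of what such a minimal UART test program could contain.  It assumes that the standalone BSP routes stdout to the AXI Uartlite (a BSP setting, not something shown above), and the names `APP_VERSION`, `format_banner` and `print_banner` are made up for illustration.  The only difference between the two projects would be the version string.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical version tag: "1.0" in mb_uart_1, "2.0" in mb_uart_2. */
#define APP_VERSION "1.0"

/* Format the banner into buf.  Returns the number of characters written
 * (not counting the terminating NUL), or a negative value on error. */
int format_banner(char *buf, size_t len, const char *version)
{
    return snprintf(buf, len, "MicroBlaze UART test, version %s\r\n", version);
}

/* Print the banner on stdout.  On the MicroBlaze, stdout is assumed to be
 * mapped to the UART by the BSP, so the text shows up on the LabVIEW side. */
void print_banner(void)
{
    char line[64];

    if (format_banner(line, sizeof line, APP_VERSION) > 0) {
        fputs(line, stdout);
    }
}
```

Calling `print_banner()` from `main` in each project makes it easy to tell from the received text which elf file ended up in the bitstream.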

 

I Need Uart

Before I go any further and start wiring up my code to use the lwIP on the embedded MicroBlaze, I will need some sort of better method of debugging.  All code that I have implemented that uses a MicroBlaze processor via the Xilinx tools makes use of the UART to send and receive standard input and output.

Right now I have been getting around this by writing values to the GPIO and to the host via a FIFO, but I do not want to have to recompile all of my code and move where I have wired the FIFO to for each test.  This would result in very long compile-build-run-test cycles.  Or to be more accurate, synthesize-build-… okay, you get the point.

Anyway, I have just finished implementing a UART project that is able to receive the contents of a printf statement from the MicroBlaze in LabVIEW.  I am using even parity, even though they say it is not required, 1 stop bit and 8 data bits, all at the 128,000 baud rate.
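To make those UART settings a little more concrete, here is a small C sketch of what even parity means for one byte and how many characters per second fit through the link.  The function names are made up for this example, and the calculation assumes the usual single start bit in front of every character.

```c
#include <stdint.h>

/* Even parity: the parity bit is chosen so that the total number of 1 bits
 * (data bits plus parity bit) is even. */
unsigned even_parity_bit(uint8_t byte)
{
    unsigned ones = 0;

    for (int i = 0; i < 8; i++) {
        ones += (byte >> i) & 1u;
    }
    return ones & 1u; /* 1 only when the data holds an odd number of 1 bits */
}

/* Bits on the wire per character: one start bit, then data, parity, stop. */
unsigned bits_per_frame(unsigned data_bits, unsigned parity_bits,
                        unsigned stop_bits)
{
    return 1u + data_bits + parity_bits + stop_bits;
}

/* Whole characters per second at a given baud rate. */
unsigned chars_per_second(unsigned baud, unsigned frame_bits)
{
    return baud / frame_bits;
}
```

With 8 data bits, one parity bit and one stop bit, each character costs 11 bits on the wire, so 128,000 baud works out to roughly 11,636 characters per second.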

I pushed my code to github, and am in the process of adding some printf statements to the main – large project – also known as “FPGANic”.

So for now, see:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/06_MicroBlaze/05_UART

and stay tuned as I will be cleaning up this code and coming up with a better way of sending and receiving data inside the MicroBlaze.  Perhaps I will make the entire thing “console” based, i.e. I send a command to the MicroBlaze via the UART, and the MicroBlaze will then create a UDP, send data, close a session, whatever I want!

If You Buy This Board, You Can Run This

If you purchase the National Instruments PXIe-6592R Board, retailing at $12,197.00 USD, I guarantee that you can run an FPGA accelerated 10 Gigabit network card in as much time as it takes for you to synthesize your code!  Call Now, the number is 1-900-XXX-YYYY.

Batteries not included, strings attached.  But seriously, I have just cleaned up the code and was able to run this from its new home, namely a brand new directory inside my kitchen sink of LabVIEW samples.

Download the source code and take a look at it here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/07_10_Gigabit/02_FPGANic

You can take a look at the MicroBlaze design, you can look at the MicroBlaze C++ code, or you can look at the LabVIEW code.  I removed all of the other vi’s that are not needed for this specific example, so browsing this code should be easy.

Please note however, that this project is what I will slowly morph to have a TCP/IP stack implemented inside the FPGA MicroBlaze and not a pseudo TCP/IP stack in software.

Additionally, only Windows drivers are available for use for this specific board developed (again) by National Instruments.  I have successfully (and legally) engineered my own device drivers for other boards in the past from NI.  You too can write your own drivers to port this to Linux or IBM or whatever hardware platform of your choice.  However, in order to do this you would have to spend a certain amount of $ on hardware, or be really good at convincing people to give you things…

Finally, to run this code, you would need the following installed on your system:

And that’s it! Oh, and you would also need a PXIe chassis to house this board, but if you order one of these boards your sales representative will recommend one for you.  I went real cheap and got a used PXIe-1062Q for around $750.  The Q stands for “quiet”, and believe me, it is not quiet at all, so imagine how loud the normal version is!  (Remember, this hardware is usually military hard and capable of running on things like airplanes, satellites, Humvees in the desert, so the noise it makes is most definitely acceptable, but don’t expect to meditate with this thing on.)

Addendum:

NI Week is coming up fast, this means a new version of LabVIEW and LabVIEW FPGA will be available for download sometime in May of 2018.  This means we may get a nice upgrade to be able to use this board with a later version of the Xilinx Vivado Tools.  So stay tuned and check out www.ni.com/niweek for more information!

30,000 Foot View

I normally avoid shop words or “corporate speak” because I feel it dehumanizes us, but sometimes these phrases are necessary.  So here is the “30,000 foot” view.  And please pardon the appearance of my flow charts and diagrams, I am not a graphic designer…

All images open in a new tab, so just click on them with ease until I figure out how to make this WordPress theme wider.

Take a look at this:

  • There are four (4) 10 Gigabit ports available on this device, but I am using only one of them for this design.
  • The FPGA design contains a MicroBlaze soft-core Processor
  • This MicroBlaze soft-core Processor is connected to a LabVIEW for Windows Executable, which is the same thing as a standard .exe file.

Now some more details:

  • Any type of Ethernet Frame can enter the 10 Gigabit PHY, as shown in the diagram:
    • An Ethernet Frame with an ARP Request
    • An Ethernet Frame with an IPv4 Packet containing an ICMP packet, otherwise known as a “Ping” message
    • An Ethernet Frame with an IPv4 Packet containing a UDP packet.  You may know UDP as the transport that carries “multicast” or “broadcast” messages.
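For readers who think in code, here is a rough C sketch of how those three frame types can be told apart from the raw bytes.  It is not taken from the project; it assumes an untagged Ethernet II frame (no VLAN tag) starting at the destination MAC address, and the names `frame_kind` and `classify_frame` are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    FRAME_ARP,
    FRAME_ICMP,
    FRAME_UDP,
    FRAME_OTHER
} frame_kind;

/* Classify a raw Ethernet II frame.  Bytes 12-13 hold the EtherType; for
 * IPv4 the protocol number sits 9 bytes into the IP header. */
frame_kind classify_frame(const uint8_t *frame, size_t len)
{
    if (len < 14) {
        return FRAME_OTHER; /* too short to contain an Ethernet header */
    }

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);

    if (ethertype == 0x0806) {
        return FRAME_ARP;
    }
    if (ethertype == 0x0800 && len >= 14 + 20) {
        uint8_t protocol = frame[14 + 9];

        if (protocol == 1) {
            return FRAME_ICMP; /* ping request/reply lives here */
        }
        if (protocol == 17) {
            return FRAME_UDP;
        }
    }
    return FRAME_OTHER;
}
```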

And finally, everything, the full shebang:

I have broken out the details of the hardware comprising the 10 Gigabit connection.  Technically it consists of 4 SFP+ connectors going to a Multi-Gigabit Transceiver.  Now normally I would think that there is some sort of chip in between the SFP+ and the FPGA, but I believe that National Instruments has used some Xilinx IP that handles this for us.  See, I don’t even know what hardware is being used, but I am able to use it and make an FPGA-based Network Card!

The data from the Multi-Gigabit Transceiver then goes to the 10 GE MAC Core, which sends all received packets

First, a Definition

CLIP – Stands for Component Level IP and is a method of bringing non-LabVIEW FPGA code in to LabVIEW.  Basically you take a synthesized design, wrap it up in some VHDL and import this VHDL file to LabVIEW.  In this case an instance of the OpenCores 10 GE MAC is being brought in to the design.

See the top-level wrapper VHDL file here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/blob/master/07_10_Gigabit_CLIP/CLIP/TenGbEClip.vhd

See the source code of this core here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/07_10_Gigabit_CLIP/CLIP/xge_mac_opencore/source/xge_mac_opencore

See the official project website for this core here:

https://opencores.org/project,xge_mac

Now the Descriptions

A – The reading of incoming frames is wrapped up in to a nice library by National Instruments, that even includes some IP to respond to ARP messages.  I have stripped all of this usage out for this design because I wanted simplicity for learning.  Anyway, on each clock cycle you have the following variables:

  • data valid [boolean]
  • data [64 bit WORD]
  • byte enables [array of 8 booleans]
  • End of Good Frame [boolean]
  • End of Bad Frame [boolean]

Here is a screenshot of the usage for this, it should be very easy to understand once you have an idea of what LabVIEW is and how it works.

B – Now the data coming from the 10 Gigabit PHY contains 64-bit WORDs, and 2 booleans, one for a good frame, and one for a bad frame.  Now I do not know how to configure and properly use a 64-bit AXI-Data Stream FIFO with a MicroBlaze processor, so I had to convert this data manually myself.  It did not take long, in fact I documented this in my log where it took me 1 hour and 15 minutes to do this following the LabVIEW FPGA State Machine paradigm.  Think of the LabVIEW FPGA State Machine paradigm or pattern as the absolute best of both worlds in terms of VHDL/Verilog and LabVIEW.

So, we have data coming in what I call “AXI-64bit format” and we have to convert it/write it to a LabVIEW FIFO.  Here is a close-up of this code:

The code above is running inside a loop clocked at 156.25 MHz, and on the left is how we get the data from the 10 GE MAC.  If “data valid”, or “End of Good Frame” or “End of Bad Frame” are true, we enter the self explanatory Case Structure, which is the same thing as an if statement.  Inside this case we package all the data in to a custom “Cluster type”, which is the same thing as a C structure and write it in to a LabVIEW FIFO.
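Since the Cluster is compared to a C structure, here is roughly what the equivalent could look like in C.  This is only an illustration: the field names mirror the five signals listed in step A, the eight byte-enable booleans are packed into one byte for convenience (my choice, not necessarily how the LabVIEW Cluster is laid out), and `should_enqueue` and `valid_byte_count` are made-up helper names.

```c
#include <stdbool.h>
#include <stdint.h>

/* C stand-in for the custom LabVIEW Cluster written to the FIFO. */
typedef struct {
    bool     data_valid;
    uint64_t data;              /* one 64-bit WORD from the 10 GE MAC */
    uint8_t  byte_enables;      /* bit n set = byte lane n carries data */
    bool     end_of_good_frame;
    bool     end_of_bad_frame;
} mac_rx_element;

/* The Case Structure condition: write to the FIFO only on cycles that
 * carry data or mark the end of a frame. */
bool should_enqueue(const mac_rx_element *e)
{
    return e->data_valid || e->end_of_good_frame || e->end_of_bad_frame;
}

/* Number of valid bytes in this WORD, according to the byte enables. */
unsigned valid_byte_count(const mac_rx_element *e)
{
    unsigned count = 0;

    for (int lane = 0; lane < 8; lane++) {
        count += (e->byte_enables >> lane) & 1u;
    }
    return count;
}
```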

C – Now we read one element on each clock cycle of the custom LabVIEW Cluster defined in step B, and convert this in to a 32-bit AXI Data Stream to be read by the MicroBlaze.

Here is a screenshot of the entire loop, which runs at 100MHz, because I clocked my MicroBlaze to that speed.  I could probably increase my MicroBlaze to 156.25MHz, but that will decrease my productivity in terms of longer synthesis times.

I zoomed out a bit further for this screenshot and included the clock specifier, which is 100 MHz.  Also notice how there is another “Case Structure” inside this loop, but the case is not “True”, but it is a State Machine with the “Read-Top” case showing as the default state.  This state checks if the incoming data is valid, and if so writes the upper 32 bits of the 64-bit data WORD in to the AXI-Data Stream FIFO that is connected to the MicroBlaze.

Here is a close-up of the “Read-Top” state:

Here is the other state:

The “Read-Bottom” state.  This state will write the lower half of the 64-bit WORD and will check if this is the final element in the Ethernet Frame.  If this is the final element, it will enter the “Append-Size” state, which is incorrectly named; I will fix that later (“TODO: Rename Append-Size”), haha.  Anyway, that state will append some metadata indicating if this frame should be dropped or kept.

The final state – “Append-Size”:

This code is very simple.  I set TKEEP to all ones, or 0b1111, and I set the first 2 bits of the 32-bit WORD to contain “End of Good Frame” and “End of Bad Frame”.  Now why am I setting TKEEP to all 1’s? Simple: I haven’t implemented this part yet.  However, by setting it to all 1’s my code will still work because most TCP/IP stacks just ignore padded 0’s.
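If you do not read LabVIEW diagrams, the three states can be modelled in C roughly as follows.  This is a software sketch of the behaviour described above, not a translation of the actual VI.  Two details are assumptions on my part: that bit 0 of the metadata WORD carries “End of Good Frame” and bit 1 carries “End of Bad Frame”, and that TLAST is raised on the metadata WORD.  All type and function names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { READ_TOP, READ_BOTTOM, APPEND_SIZE } rx_state;

/* One element taken from the LabVIEW FIFO filled in step B. */
typedef struct {
    uint64_t data;
    bool     end_of_good_frame;
    bool     end_of_bad_frame;
} rx_input;

/* One beat of the 32-bit stream going to the MicroBlaze. */
typedef struct {
    bool     valid;
    uint32_t tdata;
    uint8_t  tkeep;
    bool     tlast;
} axis_word;

typedef struct {
    rx_state state;
    rx_input held; /* element latched in READ_TOP */
} rx_converter;

/* Advance the state machine by one clock cycle.  `in` is the next FIFO
 * element or NULL if the FIFO is empty; it is only consumed in READ_TOP. */
axis_word rx_step(rx_converter *c, const rx_input *in)
{
    axis_word out = { false, 0, 0xF, false }; /* TKEEP is always 0b1111 */

    switch (c->state) {
    case READ_TOP:
        if (in != NULL) {
            c->held = *in;
            out.valid = true;
            out.tdata = (uint32_t)(in->data >> 32); /* upper 32 bits */
            c->state = READ_BOTTOM;
        }
        break;
    case READ_BOTTOM:
        out.valid = true;
        out.tdata = (uint32_t)(c->held.data & 0xFFFFFFFFu); /* lower 32 bits */
        c->state = (c->held.end_of_good_frame || c->held.end_of_bad_frame)
                       ? APPEND_SIZE
                       : READ_TOP;
        break;
    case APPEND_SIZE:
        out.valid = true;
        out.tdata = (c->held.end_of_good_frame ? 1u : 0u)   /* assumed bit 0 */
                  | (c->held.end_of_bad_frame ? 2u : 0u);   /* assumed bit 1 */
        out.tlast = true; /* assumption: metadata WORD closes the frame */
        c->state = READ_TOP;
        break;
    }
    return out;
}
```

A frame of two 64-bit elements therefore comes out as four data WORDs followed by one metadata WORD.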

D – Now the MicroBlaze has been programmed with a function that reads an incoming frame from AXI-Data Stream FIFO #0 and writes its contents to AXI-Data Stream FIFO #1.  It also reads an incoming frame from AXI-Data Stream FIFO #1 and writes its contents to AXI-Data Stream FIFO #0.  This is a simple passthrough that exercises my implementation of the FIFOs.
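The real code is linked just below.  To show the idea without any Xilinx headers, here is a host-side model of the passthrough in plain C.  It is a sketch under assumptions: each AXI-Data Stream FIFO is modelled as a separate receive queue and transmit queue, a frame is taken to end at the WORD with tlast set, and `stream_fifo`, `fifo_push`, `fifo_pop` and `forward_frame` are names invented for this example rather than anything from the Xilinx drivers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 64

typedef struct {
    uint32_t tdata;
    bool     tlast;
} stream_word;

/* Minimal ring buffer standing in for one direction of a stream FIFO. */
typedef struct {
    stream_word words[FIFO_DEPTH];
    size_t      head;
    size_t      count;
} stream_fifo;

bool fifo_push(stream_fifo *f, stream_word w)
{
    if (f->count == FIFO_DEPTH) {
        return false; /* full */
    }
    f->words[(f->head + f->count) % FIFO_DEPTH] = w;
    f->count++;
    return true;
}

bool fifo_pop(stream_fifo *f, stream_word *w)
{
    if (f->count == 0) {
        return false; /* empty */
    }
    *w = f->words[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return true;
}

/* Copy WORDs from src to dst until a WORD with tlast has been copied, src
 * runs empty, or dst is full.  Returns the number of WORDs copied. */
size_t forward_frame(stream_fifo *src, stream_fifo *dst)
{
    size_t      copied = 0;
    stream_word w;

    while (dst->count < FIFO_DEPTH && fifo_pop(src, &w)) {
        fifo_push(dst, w);
        copied++;
        if (w.tlast) {
            break;
        }
    }
    return copied;
}
```

The passthrough is then two calls in a loop: forward from the receive side of FIFO #0 to the transmit side of FIFO #1, and from the receive side of FIFO #1 to the transmit side of FIFO #0.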

A direct link to the source code of this C code that is running in the MicroBlaze:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/blob/master/07_10_Gigabit_CLIP/mb_lwip/mb_lwip.sdk/mb_lwip/src/helpers.c#L83

And a screenshot:

E – Now what happens after the source code above executes? Well, data is read from the 10 Gigabit PHY and read on the first AXI-Data Stream FIFO and written out back to the rest of the FPGA via the second AXI-Data Stream FIFO.  So we want to read this ethernet frame and write it up to the Host application running on normal/regular Windows.  This is very simple, read data, write data to a Target-to-Host LabVIEW FIFO, and if tlast is equal to true, include this in the metadata, which for now is simply the upper half of the 64-bit WORD.
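As a rough illustration of that element layout, here is a C sketch that packs one 32-bit data WORD and the tlast flag into a 64-bit FIFO element and unpacks it again on the other side.  The exact bit used for tlast is my assumption (the lowest bit of the upper half), and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower 32 bits: data WORD.  Upper 32 bits: metadata, where bit 32 is
 * assumed to carry tlast. */
uint64_t pack_th_element(uint32_t tdata, bool tlast)
{
    uint64_t metadata = tlast ? 1u : 0u;

    return (metadata << 32) | tdata;
}

/* Host side: recover the data WORD. */
uint32_t th_element_data(uint64_t element)
{
    return (uint32_t)(element & 0xFFFFFFFFu);
}

/* Host side: true when this element ends the Ethernet frame. */
bool th_element_is_last(uint64_t element)
{
    return ((element >> 32) & 1u) != 0;
}
```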

F – Now how do you read this data on the host? Well, if you are familiar with LabVIEW, the code would look like this:

The green box contains a reference to the running FPGA.  The first box on the left polls the FIFO to see if any elements are available, and if the number of elements available is greater than 0 it reads that number of elements.

If you wanted to do this from C++, you could use LabWindows CVI to read from the FPGA interface as such:

/* read the DMA FIFO */
NiFpga_MergeStatus(&status,
    NiFpga_ReadFifoI16(session,
                       NiFpga_fpga_TargetToHostFifoI16_AIFIFO,
                       data, numSamples, timeout, &r));

(See: http://www.ni.com/tutorial/8638/en/)

Please note that you can also link to the LabWindows CVI library and use it from your existing C++ applications.  Drivers for this specific board are only available for Windows, but if you are a big bank or financial firm with deep pockets, I’m sure you can set up some sort of agreement with National Instruments to port this code and drivers to <Operating System of your Choice>.

Okay, that is great, now what about writing data from the Host application back to the FPGA for sending out of the 10 Gigabit PHY? Well, you do the opposite, you enter the codes in reverse. (Spies Like Us).

Instead of a “Target-to-Host” FIFO, use a  “Host-to-Target” FIFO, and in my case, I prepend the size in WORDs to the packet to be sent.

Again, the green box is a reference to the running FPGA.  The square light green colored box is a function, also known as a “sub-VI” that I wrote that generates a UDP packet (or is it a UDP datagram? I forget).  The output of this UDP packet is converted in to 32-bit WORDS by the box with a white background, and then the size is prepended to this array and written in to the “HT_WRITE” LabVIEW Host-to-Target FIFO.
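In C, the same “convert to 32-bit WORDs and prepend the size” step could be sketched like this.  It is not generated from the LabVIEW code.  The byte order inside each WORD (first byte in the most significant position) and the zero padding of the last WORD are assumptions, and `build_ht_message` is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack `len` packet bytes into 32-bit WORDs and prepend the WORD count.
 * out[0] receives the size in WORDs, out[1..] the packed data.
 * Returns the number of elements written, or 0 if `out` is too small. */
size_t build_ht_message(const uint8_t *packet, size_t len,
                        uint32_t *out, size_t out_cap)
{
    size_t words = (len + 3) / 4; /* round up to whole WORDs */

    if (out_cap < words + 1) {
        return 0;
    }

    out[0] = (uint32_t)words;
    for (size_t i = 0; i < words; i++) {
        uint32_t w = 0;

        for (size_t b = 0; b < 4; b++) {
            size_t  idx  = i * 4 + b;
            uint8_t byte = (idx < len) ? packet[idx] : 0; /* zero padding */

            w = (w << 8) | byte;
        }
        out[i + 1] = w;
    }
    return words + 1;
}
```

A 5-byte packet, for example, becomes three FIFO elements: the size (2) followed by two data WORDs, the second of which is padded with zeros.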

G – So we are receiving data in the following format from the host:

<size>

<WORD1>

<WORD2>

<WORDN>

We read the size and then the rest of the elements from the FIFO and write them to the 2nd AXI-Data Stream FIFO that is connected to the MicroBlaze.  Again note that I have not yet fully implemented the proper usage of the TKEEP signals, so in this case the TKEEP signal can be dynamically set from the Host application for testing purposes.
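The receiving side of that format can be modelled in C as follows.  Again this is only a sketch with invented names: it reads the size, copies that many WORDs into stream beats, applies whatever TKEEP value the Host supplied, and (my assumption) raises tlast on the final WORD of the message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t tdata;
    uint8_t  tkeep;
    bool     tlast;
} stream_word;

/* Decode one <size><WORD1>...<WORDN> message from `in`.
 * Returns the number of input elements consumed (size + 1), or 0 if the
 * message is not yet complete or `out` is too small. */
size_t unpack_ht_message(const uint32_t *in, size_t avail, uint8_t tkeep,
                         stream_word *out, size_t out_cap)
{
    if (avail == 0) {
        return 0;
    }

    size_t words = in[0];

    if (avail < words + 1 || out_cap < words) {
        return 0;
    }
    for (size_t i = 0; i < words; i++) {
        out[i].tdata = in[i + 1];
        out[i].tkeep = tkeep;              /* set from the Host for testing */
        out[i].tlast = (i + 1 == words);   /* assumption: last WORD ends frame */
    }
    return words + 1;
}
```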

H – Now that the MicroBlaze has read our outgoing ethernet frame on FIFO #1 and has written the same outgoing frame on FIFO #0, we have to convert this 32-bit AXI Data Stream in to 64-bit words that are suitable for our 10 Gigabit Ethernet PHY.

This time however, I used a proper state machine and named all of the states correctly.

The left-most box connects the signals from the MicroBlaze to LabVIEW and wires them in to the state machine.  If the data is valid and it is not the last element, the top half is stored in to a shift-register and the next state is “Read-Bottom”.  Here is a close-up of the “Read-Top” state:

And here is a close-up of the “Read-Bottom” state:
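For readers without LabVIEW, the two states described above can be modelled in C along these lines.  This is a behavioural sketch, not the VI itself.  The text only says what happens when the top half is not the last element, so the handling of a frame that ends on a top half (emit it with a zero bottom half) is my assumption, and all names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { TX_READ_TOP, TX_READ_BOTTOM } tx_state;

typedef struct {
    tx_state state;
    uint32_t top; /* plays the role of the shift register */
} tx_combiner;

/* One 64-bit WORD for the LabVIEW FIFO feeding the 10 GE MAC. */
typedef struct {
    bool     valid;
    uint64_t data;
    bool     end_of_frame;
} mac_tx_word;

/* Feed one beat of the 32-bit stream coming from the MicroBlaze. */
mac_tx_word tx_step(tx_combiner *c, bool tvalid, uint32_t tdata, bool tlast)
{
    mac_tx_word out = { false, 0, false };

    if (!tvalid) {
        return out; /* nothing arrived this cycle */
    }

    if (c->state == TX_READ_TOP) {
        if (tlast) {
            /* Assumption: odd number of WORDs, pad the bottom half. */
            out.valid = true;
            out.data = (uint64_t)tdata << 32;
            out.end_of_frame = true;
        } else {
            c->top = tdata; /* remember the top half */
            c->state = TX_READ_BOTTOM;
        }
    } else {
        out.valid = true;
        out.data = ((uint64_t)c->top << 32) | tdata;
        out.end_of_frame = tlast;
        c->state = TX_READ_TOP;
    }
    return out;
}
```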

I – Now we have written the 32-bit data coming from the MicroBlaze in to a LabVIEW FIFO with 64-bit data WORDS and we have to write this out using the CLIP, via the National Instruments provided wrapper.

Look at how simple and beautiful this code is!

You can look at the source code here (it is easiest to view if you clone the repository to your local machine and open the 2nd file, the html file):

If you do not know git, you can also download a zip file of the entire repository:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/archive/master.zip

If you somewhat know git, you can:

git clone git@github.com:JohnStratoudakis/LabVIEW_Fpga.git

And finally, browse the documentation, which is probably outdated here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/07_10_Gigabit_CLIP

Up Next

So what now? Do I continue cleaning up the code and updating documentation? Do I make a YouTube video demonstrating this?  Do I modify the MicroBlaze code to no longer just be a “passthrough” but instead to send all data through the lwIP TCP/IP stack? If I do this, I will have to modify the elf file (compiled binary) that is embedded in to my design, breaking this design.  I can make multiple Xilinx checkpoints and it will work, but that will confuse all of my readers… Man, decisions, decisions.

How about this, I finalize this project, make a new sub-directory in the source code and make a brand new LabVIEW FPGA project and this time I will use the lwIP version of the source code and I will make sure that everything is reproducible.  It is raining now anyway and I want to stay inside and code…

 

Coding Standards Matter…

I have wired up the components of my 10 Gigabit FPGA Accelerated Network card with great care, and I decided to have my “tester” application skip the lwIP stack and to pass the received packet directly to the host for testing/verification purposes.

Everything was checking out fine, the LabVIEW code looked flawless, the interface to the 10 Gigabit Transceiver was perfect.  All looked fine, but for some reason I was not receiving the packets on the host.

I analyzed the code, inserted probes and whatnot.  And finally, while reading through the actual C++ code (MicroBlaze C++ that is), I found the bug.

A very simple bug hidden in plain sight!


// Now echo the data back out
if (XLlFifo_iTxVacancy(fifo_1)) {
    XGpio_DiscreteWrite(&gpio_2, 2, 0xF001);
    for (i = 0; i < recv_len_bytes; i++) {
        XGpio_DiscreteWrite(&gpio_2, 2, buffer[i]);

        XLlFifo_Write(fifo_1, buffer, recv_len_bytes);
    }

    XLlFifo_iTxSetLen(fifo_1, recv_len_bytes);
}


Do you see the error?  Well, neither did I, until I read the documentation for XLlFifo_Write again, for the umpteenth time… I was writing the whole packet to the FIFO (length of packet) times, so (length of packet) squared bytes went out! Why? Because each call to XLlFifo_Write writes the entire packet, and I was making that call once per byte inside the for loop.
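
To make the arithmetic concrete, here is a small sketch that can be compiled on any PC.  The Xilinx driver is not available off the board, so it substitutes a hypothetical fake_fifo_write that only counts bytes; the one property borrowed from XLlFifo_Write is that a single call queues the whole buffer.  Everything named fake_* or echo_* is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TX FIFO: it only counts queued bytes. */
typedef struct {
    size_t bytes_queued;
} fake_fifo_t;

/* Like XLlFifo_Write in spirit: ONE call queues the WHOLE buffer. */
static void fake_fifo_write(fake_fifo_t *f, const void *buf, size_t len)
{
    assert(f != NULL && buf != NULL);
    f->bytes_queued += len;
}

/* Buggy shape: the whole-buffer write sits inside the per-byte loop. */
size_t echo_buggy(fake_fifo_t *f, const uint8_t *buffer, size_t recv_len_bytes)
{
    for (size_t i = 0; i < recv_len_bytes; i++) {
        fake_fifo_write(f, buffer, recv_len_bytes);
    }
    return f->bytes_queued;
}

/* Fixed shape: the write is hoisted out of the loop and called once. */
size_t echo_fixed(fake_fifo_t *f, const uint8_t *buffer, size_t recv_len_bytes)
{
    fake_fifo_write(f, buffer, recv_len_bytes);
    return f->bytes_queued;
}
```

With a 60-byte packet the buggy version queues 3600 bytes and the fixed version queues 60.  On the real hardware, the equivalent change would be to move the XLlFifo_Write call out of the for loop and leave only the GPIO debug write inside it.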

Anyway, I am now re-synthesizing my code and we will see what happens when I run it in around 2 hours' time.

Also, I added the TKEEP signal to my AXI Stream FIFO, and it worked exactly as expected, meaning that:

  • If I send 12 bytes from the LabVIEW FPGA FIFO in to the MicroBlaze, it detects 12 bytes
  • If I send 13 bytes, with the TKEEP signal being 0b0001 for the last word only, and 0xF for the rest, I get 13 bytes in the MicroBlaze code.
  • If I send 14 bytes… and so on and so forth, MicroBlaze recognized only that many bytes.

However, everything was aligned to 32 bit words.
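
The relationship between byte count and TKEEP on a 32-bit stream can be sketched as below.  The helper names are hypothetical, and the sketch assumes the valid bytes of the last word sit in the low byte lanes, which matches the 0b0001 value used for 13 bytes above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* TKEEP for the LAST 32-bit word of a frame that is len_bytes long.
 * All earlier words use 0xF (all four byte lanes valid). */
uint8_t tkeep_for_last_word(size_t len_bytes)
{
    size_t rem;

    assert(len_bytes > 0);
    rem = len_bytes % 4;
    /* rem == 0 means the last word is full. */
    return rem ? (uint8_t)((1u << rem) - 1u) : (uint8_t)0xF;
}

/* Count the set bits in the low four bits of a TKEEP value. */
static unsigned popcount4(uint8_t keep)
{
    unsigned n = 0;
    for (unsigned bit = 0; bit < 4; bit++) {
        n += (keep >> bit) & 1u;
    }
    return n;
}

/* Bytes the receiver should count for n_words words, where only the
 * last word may be partial. */
size_t bytes_from_words(size_t n_words, uint8_t last_tkeep)
{
    assert(n_words > 0);
    return (n_words - 1) * 4 + popcount4(last_tkeep);
}
```

So 12 bytes is three full words, 13 bytes is four words with a last-word TKEEP of 0x1, and 14 bytes is four words with 0x3.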

Maybe I will work on cleaning up and pushing some of my code to github while I wait…

AXI4 + MicroBlaze != 64-bit

The 10 Gigabit MAC/transceiver gives me 64 bit data words.  I thought I was giving and getting 64 bit data words, but I am really only using 32 bits.  I came to this conclusion after I tried reading a 64 bit word and saw the data was simply two repeated 32 bit words.  Additionally, some random person on the internet said that the MicroBlaze data bus is 32-bit and you have to use some sort of data width converter IP.

Out of luck… I don’t know how to use the converter, but I am sure there is a way to properly convert this by using LabVIEW FPGA.  So for starters, this means I can remove my AXI4 Stream Data FIFOs and keep the two 32-bit versions.  I’ll also throw in support for TKEEP while I am at it.

So the “Receive Ethernet Frame” code from the 10 Gigabit transceiver/MAC looks like this:

I have to convert this 64-bit data stream in to a 32-bit data stream before I send it in to the MicroBlaze.  Here is the current/erroneous implementation:

So what do I have to do? I have to read one element from the LabVIEW FIFO (the FIFO on the left) and write the upper half of the 64 bit word in one cycle, then not read from the LabVIEW FIFO on the next clock cycle and write the lower half of the 64 bit word instead.  Want to see the power of LabVIEW? It is 7:22 AM right now… [elevator music/jeopardy music starts playing in the background]

Now it is 8:07 AM and I have finished re-factoring this loop.  I am writing the upper half of each 64 bit word in one cycle, and am writing the bottom half during the next clock cycle.  I am also keeping the logic that appends an extra word which contains the “EndOfGoodFrame”, and “EndOfBadFrame” boolean values.  Since I am writing 32-bit words now, I am only appending one word.
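
As a rough C equivalent of what this loop does, consider the sketch below.  The function name and the bit positions chosen for “EndOfGoodFrame” and “EndOfBadFrame” in the trailing word are assumptions made for illustration; the real layout lives in the LabVIEW code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL bit positions for the appended status word. */
#define STATUS_END_OF_GOOD_FRAME 0x1u
#define STATUS_END_OF_BAD_FRAME  0x2u

/*
 * Split n_in 64-bit words in to 32-bit words, upper half first, and
 * append one status word. out needs room for 2 * n_in + 1 elements.
 * Returns the number of 32-bit words written.
 */
size_t split_frame(const uint64_t *in, size_t n_in,
                   bool end_of_good_frame, bool end_of_bad_frame,
                   uint32_t *out, size_t out_cap)
{
    size_t n = 0;
    uint32_t status = 0;

    assert(in != NULL && out != NULL);
    assert(out_cap >= 2 * n_in + 1);

    for (size_t i = 0; i < n_in; i++) {
        out[n++] = (uint32_t)(in[i] >> 32);          /* cycle 1: upper half */
        out[n++] = (uint32_t)(in[i] & 0xFFFFFFFFu);  /* cycle 2: lower half */
    }

    if (end_of_good_frame) {
        status |= STATUS_END_OF_GOOD_FRAME;
    }
    if (end_of_bad_frame) {
        status |= STATUS_END_OF_BAD_FRAME;
    }
    out[n++] = status;                               /* the one extra word */

    return n;
}
```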

Here is the full loop:

And a close up of Case 0 of the innermost Case Structure:

And a close up of Case 1:

I now have to do this for the other direction: convert the 32-bit AXI stream coming out of the MicroBlaze in to 64-bit elements for the LabVIEW FIFO.  Here is the current implementation:

The signal on AXI_STR_TXD_data is a U32 and I have to collect 2 of these values and insert them in to the FIFO on the right side.  I am going to have to think about this for a bit, but I have to get ready and go to work.  So I may not finish this before leaving.

Thanks and have a nice day!

Update: Okay, this is not that pretty, but here is my first-cut “20 minutes” version:

Now I have to go and get ready! But I’ll be sure to set everything to synthesize before I leave…

IP Integration Node vs CLIP

I wired up the 10 Gigabit Ethernet MAC to my MicroBlaze instance to my host computer and compiled/synthesized everything.  I then turned on my “quiet” PXIe-1062Q and fired up my tester application, and it did not work…  I opened up an isolated tester, “Fpga-Mac-Top.vi”, and it worked.  I opened up the isolated MicroBlaze tester, “Fpga-MicroBlaze-Top.vi”, and nothing.  Not even a read from the GPIO.

This is quite strange… why is it not working? I spent some time looking over everything, re-generating output products, synthesizing from Vivado, and bringing the design back in to LabVIEW.  Long story short, I was not setting the MicroBlaze Reset to ACTIVE_LOW, whereas in all of my previous designs I was setting it to ACTIVE_HIGH.  Anyway, while I wait for it to compile, I have something to say.  Which do you prefer? Using an IP Integration Node or a CLIP (Component Level IP) for using a MicroBlaze Processor from LabVIEW?

Well, first off, let me link to some National Instruments documentation on both:

And now let me show you some screenshots.  Here is a close up of what using an IP Integration Node looks like: (right-click to open in a new window for a larger version until I figure out how to modify this WordPress theme to be wider)

Here is a zoomed out version of this same VI:

And finally, what it looks like without an IP Integration Node, but with a CLIP (Component Level IP):

 

Can you see the difference? I can… for starters, I can read the full name of each signal when using a CLIP.  Additionally, with a CLIP I can split up my nodes in to separate locations, so that I can organize my VI in a much cleaner way.  And finally, since I can read the full signal name when using a CLIP node, I no longer have to hover over each signal to get the signal name, thus removing any reason for having comments as in the IP Integration Node version.

Anyway, a CLIP node is my recommended method of importing Xilinx Vivado IP in to LabVIEW FPGA.

Also, this code was from a project that I implemented in order to learn how to use the AXI Stream FIFO inside of LabVIEW via a MicroBlaze.  In other words, how to communicate with a MicroBlaze processor via an AXI Stream FIFO from LabVIEW FPGA.

See the source code here:

https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/06_MicroBlaze/03_MicroBlaze_AXI_Stream

Pros and Cons of LabVIEW FPGA

Ever since I started developing this LabVIEW FPGA project that uses a MicroBlaze soft processor to process TCP streams, I have learned a lot and can comment on the pros and cons of using LabVIEW FPGA vs using a traditional Xilinx/Altera based FPGA development approach.

For starters, LabVIEW FPGA blows every single other FPGA development system out of the water when it comes to developing prototypes.  I made a prototype for implementing a Monero miner in record time.  I don’t remember how long it took, but you can see my commit history here: https://github.com/JohnStratoudakis/CryptoCurrencies

Then I was able to implement a UDP based orderbook proof of concept, again in record time, see my commit history here: https://github.com/JohnStratoudakis/LabVIEW_Fpga/tree/master/MarketData/MarketData_02/Fpga

Then I decided that I wanted to make my orderbook support TCP/IP, which is what most Market Data Feeds are using, so I embarked on learning how to make LabVIEW FPGA play well with Xilinx Vivado.  I did not realize it at the time, but the knowledge I have gained over the past year is enough that one no longer has to live with any of the cons that LabVIEW FPGA comes with.

  • I have learned how to integrate basic VHDL/Verilog IP in to a LabVIEW FPGA project.
  • I have learned how to integrate more complex Xilinx IP such as Adder/Subtractors, Fast Fourier Transforms, and AXI Stream FIFOs.
  • I have learned how to integrate an entire soft-core processor based system in as well.  Including both the simplified MicroBlaze MCS, and the more complex MicroBlaze processors developed by Xilinx.
  • Furthermore, I have been able to communicate between LabVIEW FPGA and the MicroBlaze processor via AXI Stream FIFOs, General Purpose Input/Output registers, and have implemented Interrupt handlers.

Using all of this together, I can very efficiently develop the perfect prototype that uses existing Xilinx IP, IP from opencores.org, or proprietary IP that can use a MicroBlaze soft-core processor, all from within LabVIEW FPGA.  This serves as a great risk-mitigating factor in that one can tell if an FPGA will be a viable solution for a particular type of problem.  Then, one can choose to keep the LabVIEW FPGA implementation and scale it out, or one can rewrite the portions written in LabVIEW in another language such as Verilog or VHDL.

Usually, the first product that works is what makes it to market and is successful, not because it is the best, but because it is the most adaptable to change. Think Evolution… think VHS, think DVDs, think about the iPod.  These products were market leading because they got the job done right now, not later when all of the features were fully implemented.  Additionally these products were easy to use.

Anyway, I have fully wired up the 10 Gigabit Transceiver in to my MicroBlaze, and have wired the MicroBlaze to my host application, and I am anxiously waiting for my FPGA synthesis to complete so I can test it out…