Computing Fixed-Point Square Roots and Their Reciprocals Using Goldschmidt Algorithm
IntroductionA well known algorithm for computing square roots by iteration is provided by the Newton-Raphson Algorithm. The algorithm determines the square root using iteration until the root has been determined to some user-defined level of accuracy. The method is easily derived. First, describe a number in terms of its square root:
$$ a = y ^ {2} ,$$
where $y = \sqrt{a}$. The value of the $\sqrt{a}$ can be written as $y = y_0 + \epsilon$, where $y_0$ is the value of the square root and...
Use DPLL to Lock Digital Oscillator to 1PPS Signal
IntroductionThere are occasions where it is desirable to lock a digital oscillator to an external time reference such as the 1PPS (One Pulse Per Second) signal output from a GPS receiver. One approach would be to synchronize a fixed frequency oscillator on the leading edge of the 1PPS signal. In many cases, this will result in adequate performance. However, in situations where simple synchronization does not provide adequate performance, digital phase-lock techniques can be applied to a...
Inside the Spartan-6: Using LUTs to optimize circuits
While building a small CPU on a Spartan-6 chip I came across the same old problem: my Verilog was mapping to a lot of slices . Way more then seems reasonable. So let's dig in and see what's really going on.
The J1 CPU (see Messing Around with a J1) is an amazingly streamlined design expressed in just over 100 lines of Verilog, and is reasonably compact at 150 Spartan-6 slices (half of that with the modifications described in the article). But the Picoblaze is...
Makefiles for Xilinx Tools
Building a bitstream from an HDL is a complicated process that requires the cooperation of a lot of tools. You can hide behind an IDE or grow a pair and use command line tools and a makefile to tie your build process together. I am not a huge fan of makefiles either (I believe a language should be expressive enough to automate the build process), but the alternatives are dismal.
Command-line driven workflow is easier on the hands and faster. The example...
Fit Sixteen (or more) Asynchronous Serial Receivers into the Area of a Standard UART Receiver
IntroductionThis article will describe a technique, available in many current FPGA architectures, to fit a large amount of logic into a small area. About ten years ago now (Feb/Mar 2005), I helped develop a multi-line Caller ID product. The Multi-Channel Asynchronous Receiver (MCAR) FPGA core developed for that product will be used to illustrate the technique(s) needed to fit a 16 channel MCAR into a single Spartan II XC2S30-5VQ100 FPGA.
To stay true to the original design, I...
I don’t often convert VHDL to Verilog but when I do ...
VHDL to VerilogI don’t often convert VHDL to Verilog but when I do it is not the most exciting task in the world (that is an understatement). For the most part I am HDL agnostic. Well that is not true, I have a strong preference for MyHDL, and an insubstantial preference for VHDL over Verilog. The choice of HDL for a project is often complicated, irrational, sometimes rational, but most often random. It is often not a choice of the developer - for...
MyHDL Interface Example
MyHDL Interfaces ExampleWith the next release of MyHDL, version 0.9, conversion of interfaces will be supported. In this context an interface is any object with a Signal attribute. This can be used to simplify connection between modules and port definitions. For example, if I want to define a simple memory-map bus, the Signals for the bus can be defined as follows:
class BareBoneBus: def __init__(self): self.wr = Signal(False) self.rd =...BGA and QFP at Home 1 - A Practical Guide.
It is almost universally accepted by the hobbyists that you can't work with high-density packages at home. That is entirely incorrect. I've been assembling and reflowing BGA circuit boards at home for a few years now. BGAs and 0.5mm-pitch QFPs are well within the realm of a determined amateur.
This series of articles presents practical information on designing and assembling boards with high-density packages at home. While the focus is on FPGA packages, most of...
Shared-multiplier polyphase FIR filter
Keywords: FPGA, interpolating decimating FIR filter, sample rate conversion, shared multiplexed pipelined multiplier
Discussion, working code (parametrized Verilog) and Matlab reference design for a FIR polyphase resampler with arbitrary interpolation and decimation ratio, mapped to one multiplier and RAM.
IntroductionA polyphase filter can be as straightforward as multirate DSP ever gets, if it doesn't turn into a semi-deterministic, three-legged little dance between input, output and...
VGA Output in 7 Slices. Really.
Ridiculous? Read on - I will show you how to generate VGA timing in seven XilinxR Spartan3R slices.Some time ago I needed to output video to a VGA monitor for my Apple ][ FPGA clone. Obviously (I thought), VGA's been done before and all I had to do was find some Verilog code and drop it into my design. As is often the case (with me anyway), the task proved to be very different from my imagined 'couple of hours to integrate the IP'.I found some example code for my board. I...
Computing Fixed-Point Square Roots and Their Reciprocals Using Goldschmidt Algorithm
IntroductionA well known algorithm for computing square roots by iteration is provided by the Newton-Raphson Algorithm. The algorithm determines the square root using iteration until the root has been determined to some user-defined level of accuracy. The method is easily derived. First, describe a number in terms of its square root:
$$ a = y ^ {2} ,$$
where $y = \sqrt{a}$. The value of the $\sqrt{a}$ can be written as $y = y_0 + \epsilon$, where $y_0$ is the value of the square root and...
Use DPLL to Lock Digital Oscillator to 1PPS Signal
IntroductionThere are occasions where it is desirable to lock a digital oscillator to an external time reference such as the 1PPS (One Pulse Per Second) signal output from a GPS receiver. One approach would be to synchronize a fixed frequency oscillator on the leading edge of the 1PPS signal. In many cases, this will result in adequate performance. However, in situations where simple synchronization does not provide adequate performance, digital phase-lock techniques can be applied to a...
BGA and QFP at Home 1 - A Practical Guide.
It is almost universally accepted by the hobbyists that you can't work with high-density packages at home. That is entirely incorrect. I've been assembling and reflowing BGA circuit boards at home for a few years now. BGAs and 0.5mm-pitch QFPs are well within the realm of a determined amateur.
This series of articles presents practical information on designing and assembling boards with high-density packages at home. While the focus is on FPGA packages, most of...
I don’t often convert VHDL to Verilog but when I do ...
VHDL to VerilogI don’t often convert VHDL to Verilog but when I do it is not the most exciting task in the world (that is an understatement). For the most part I am HDL agnostic. Well that is not true, I have a strong preference for MyHDL, and an insubstantial preference for VHDL over Verilog. The choice of HDL for a project is often complicated, irrational, sometimes rational, but most often random. It is often not a choice of the developer - for...
Inside the Spartan-6: Using LUTs to optimize circuits
While building a small CPU on a Spartan-6 chip I came across the same old problem: my Verilog was mapping to a lot of slices . Way more then seems reasonable. So let's dig in and see what's really going on.
The J1 CPU (see Messing Around with a J1) is an amazingly streamlined design expressed in just over 100 lines of Verilog, and is reasonably compact at 150 Spartan-6 slices (half of that with the modifications described in the article). But the Picoblaze is...
Shared-multiplier polyphase FIR filter
Keywords: FPGA, interpolating decimating FIR filter, sample rate conversion, shared multiplexed pipelined multiplier
Discussion, working code (parametrized Verilog) and Matlab reference design for a FIR polyphase resampler with arbitrary interpolation and decimation ratio, mapped to one multiplier and RAM.
IntroductionA polyphase filter can be as straightforward as multirate DSP ever gets, if it doesn't turn into a semi-deterministic, three-legged little dance between input, output and...
Makefiles for Xilinx Tools
Building a bitstream from an HDL is a complicated process that requires the cooperation of a lot of tools. You can hide behind an IDE or grow a pair and use command line tools and a makefile to tie your build process together. I am not a huge fan of makefiles either (I believe a language should be expressive enough to automate the build process), but the alternatives are dismal.
Command-line driven workflow is easier on the hands and faster. The example...
An Editor for HDLs
Unless you're still living in the '90s and using schematics, your FPGA designs are entered into text files as VHDL or Verilog source. Which, of course, implies you're using some form of text editor. Now, right after brace placement in C, the choice of an editor is the topic most likely to incite a nerd civil war (it's a bike-shed issue). I won't attempt to influence your choice because it really makes no difference to me. But if you are using the same editor I do, then maybe I can help you...
Developing FPGA-DSP IP with Python
This blog post was previously titled MyHDL ASIC Proven (How is this related to FPGAs?) but the blog post has been updated and mainly discusses developing FPGA-DSP IP with Python / MyHDL. The original content is still present but the post has been reorganized and expanded. Original post 16-Mar-2010.
Developing FPGA-DSP IP with Python / MyHDLUsing Python to develop DSP logic for an FPGA is very powerful. The Python ecosystem contains many packages including numerical and...
Fit Sixteen (or more) Asynchronous Serial Receivers into the Area of a Standard UART Receiver
IntroductionThis article will describe a technique, available in many current FPGA architectures, to fit a large amount of logic into a small area. About ten years ago now (Feb/Mar 2005), I helped develop a multi-line Caller ID product. The Multi-Channel Asynchronous Receiver (MCAR) FPGA core developed for that product will be used to illustrate the technique(s) needed to fit a 16 channel MCAR into a single Spartan II XC2S30-5VQ100 FPGA.
To stay true to the original design, I...
BGA and QFP at Home 1 - A Practical Guide.
It is almost universally accepted by the hobbyists that you can't work with high-density packages at home. That is entirely incorrect. I've been assembling and reflowing BGA circuit boards at home for a few years now. BGAs and 0.5mm-pitch QFPs are well within the realm of a determined amateur.
This series of articles presents practical information on designing and assembling boards with high-density packages at home. While the focus is on FPGA packages, most of...
I don’t often convert VHDL to Verilog but when I do ...
VHDL to VerilogI don’t often convert VHDL to Verilog but when I do it is not the most exciting task in the world (that is an understatement). For the most part I am HDL agnostic. Well that is not true, I have a strong preference for MyHDL, and an insubstantial preference for VHDL over Verilog. The choice of HDL for a project is often complicated, irrational, sometimes rational, but most often random. It is often not a choice of the developer - for...
An Editor for HDLs
Unless you're still living in the '90s and using schematics, your FPGA designs are entered into text files as VHDL or Verilog source. Which, of course, implies you're using some form of text editor. Now, right after brace placement in C, the choice of an editor is the topic most likely to incite a nerd civil war (it's a bike-shed issue). I won't attempt to influence your choice because it really makes no difference to me. But if you are using the same editor I do, then maybe I can help you...
Use DPLL to Lock Digital Oscillator to 1PPS Signal
IntroductionThere are occasions where it is desirable to lock a digital oscillator to an external time reference such as the 1PPS (One Pulse Per Second) signal output from a GPS receiver. One approach would be to synchronize a fixed frequency oscillator on the leading edge of the 1PPS signal. In many cases, this will result in adequate performance. However, in situations where simple synchronization does not provide adequate performance, digital phase-lock techniques can be applied to a...
Developing FPGA-DSP IP with Python
This blog post was previously titled MyHDL ASIC Proven (How is this related to FPGAs?) but the blog post has been updated and mainly discusses developing FPGA-DSP IP with Python / MyHDL. The original content is still present but the post has been reorganized and expanded. Original post 16-Mar-2010.
Developing FPGA-DSP IP with Python / MyHDLUsing Python to develop DSP logic for an FPGA is very powerful. The Python ecosystem contains many packages including numerical and...
Computing Fixed-Point Square Roots and Their Reciprocals Using Goldschmidt Algorithm
IntroductionA well known algorithm for computing square roots by iteration is provided by the Newton-Raphson Algorithm. The algorithm determines the square root using iteration until the root has been determined to some user-defined level of accuracy. The method is easily derived. First, describe a number in terms of its square root:
$$ a = y ^ {2} ,$$
where $y = \sqrt{a}$. The value of the $\sqrt{a}$ can be written as $y = y_0 + \epsilon$, where $y_0$ is the value of the square root and...
VGA Output in 7 Slices. Really.
Ridiculous? Read on - I will show you how to generate VGA timing in seven XilinxR Spartan3R slices.Some time ago I needed to output video to a VGA monitor for my Apple ][ FPGA clone. Obviously (I thought), VGA's been done before and all I had to do was find some Verilog code and drop it into my design. As is often the case (with me anyway), the task proved to be very different from my imagined 'couple of hours to integrate the IP'.I found some example code for my board. I...
Shared-multiplier polyphase FIR filter
Keywords: FPGA, interpolating decimating FIR filter, sample rate conversion, shared multiplexed pipelined multiplier
Discussion, working code (parametrized Verilog) and Matlab reference design for a FIR polyphase resampler with arbitrary interpolation and decimation ratio, mapped to one multiplier and RAM.
IntroductionA polyphase filter can be as straightforward as multirate DSP ever gets, if it doesn't turn into a semi-deterministic, three-legged little dance between input, output and...
Inside the Spartan-6: Using LUTs to optimize circuits
While building a small CPU on a Spartan-6 chip I came across the same old problem: my Verilog was mapping to a lot of slices . Way more then seems reasonable. So let's dig in and see what's really going on.
The J1 CPU (see Messing Around with a J1) is an amazingly streamlined design expressed in just over 100 lines of Verilog, and is reasonably compact at 150 Spartan-6 slices (half of that with the modifications described in the article). But the Picoblaze is...
Makefiles for Xilinx Tools
Building a bitstream from an HDL is a complicated process that requires the cooperation of a lot of tools. You can hide behind an IDE or grow a pair and use command line tools and a makefile to tie your build process together. I am not a huge fan of makefiles either (I believe a language should be expressive enough to automate the build process), but the alternatives are dismal.
Command-line driven workflow is easier on the hands and faster. The example...