NetworkSherpa

Hardware – Clock and Data Recovery

Clock and data recovery is an essential physical-layer function of modern switch and router hardware. Digging deep into the electronics of a router may not be your thing, but clock recovery is a fundamental building block for other network hardware functions. For example, serial to parallel data conversions require reliable clock and data recovery (CDR) to function effectively. It’s hard to understand serial to parallel conversions or signal conditioning without learning about CDR first.

Over-clocking to support 64b/66b

From the perspective of the transmitter’s MAC layer, the bit rate of 10GBase-R is 10 Gbps. (10GBase-R here refers to SR, LRM, LR, ER etc.) On the physical line, however, the bit rate is 10.3125 Gbps. The 10GBase-R link uses 64b/66b encoding, which divides the Layer-2 data received from the MAC layer into 64-bit blocks and inserts an additional 2 bits of header before each block. This overhead does not eat into the available bandwidth. Instead, the transmitter uses ‘over-clocking’ and transmits the encoded bit stream using a 10.3125 GHz clock. These additional bits are stripped by the receiving physical-layer circuits, so the line coding and over-clocking aren’t visible to the MAC layer. The rabbit hole on 64b/66b encoding goes a lot deeper, but for now it’s good to know that 64b/66b encoding exists.
In a 10GBase-R digital receiver, the received signal is interpreted by sampling the waveform right in the middle of the expected bit period. In order to align this sampling and properly decode the received signal as a one or zero, you need a receive clock which runs at precisely the same frequency and phase as the transmitter’s clock. Unfortunately, you can’t just install the same 10.3125 GHz oscillator in the receiver and be done with it. No….that would be too easy….
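If you want to check the over-clocking arithmetic yourself, here is a minimal sketch. The function and constant names are my own, purely for illustration; the only inputs are the 10 Gbps MAC rate and the 64-bit block plus 2-bit header described above.

```python
from fractions import Fraction

MAC_RATE_BPS = 10_000_000_000  # 10 Gbps as seen by the MAC layer
PAYLOAD_BITS = 64              # data bits per block
HEADER_BITS = 2                # header bits added in front of each block


def line_rate_bps(mac_rate_bps, payload_bits=PAYLOAD_BITS, header_bits=HEADER_BITS):
    """Serial line rate needed to carry mac_rate_bps once the header is added."""
    return Fraction(mac_rate_bps) * (payload_bits + header_bits) / payload_bits


rate = line_rate_bps(MAC_RATE_BPS)
overhead_pct = Fraction(HEADER_BITS, PAYLOAD_BITS) * 100

print(float(rate) / 1e9, "Gbps on the line")
print(float(overhead_pct), "percent line-coding overhead")
```

Exact fractions are used so the result is not blurred by floating-point rounding.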

Clock and Data Recovery

The snag is that no oscillator is perfect. The 802.3 10G standard allows for the clock to be accurate to within +/- 0.01 percent (100 parts per million). So it’s quite likely that the transmit and receive clocks will be out-of-sync. This is called clock or frequency drift. Mismatched clocks will cause lost bits due to bit-slip, or falsely interpreted bits where the receiver samples at the wrong point in the waveform, confusing a ‘one’ with a ‘zero’ or vice-versa.
The diagram below shows the Rx clock (in red) running faster than the Tx clock and data signal. Although the first 0-1 transitions of the two signals are in phase, within a few clock cycles they’re in complete anti-phase. Yes, this is an exaggeration, but it’s a real problem.
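To get a feel for how quickly a small mismatch bites, here is a rough back-of-the-envelope sketch. It assumes a free-running receiver with no correction at all, the two clocks sitting at opposite ends of the +/- 0.01 percent tolerance, and a sampling error once the sample point has wandered half a bit period. The half-bit threshold and the function name are my own simplifications, not anything from the standard.

```python
TOLERANCE_PPM = 100            # +/- 0.01 percent per clock
LINE_RATE_BPS = 10.3125e9      # 10GBase-R line rate


def bits_until_slip(offset_ppm, slip_fraction=0.5):
    """Bit periods before the sample point has moved slip_fraction of a bit.

    Each bit period the sample point moves by offset_ppm millionths of a bit.
    """
    return slip_fraction * 1_000_000 / offset_ppm


# Worst case: Tx is 100 ppm fast and Rx is 100 ppm slow (or vice-versa).
worst_case_ppm = 2 * TOLERANCE_PPM
bits = bits_until_slip(worst_case_ppm)
time_ns = bits / LINE_RATE_BPS * 1e9

print(f"{bits:.0f} bits, roughly {time_ns:.0f} ns, before sampling goes wrong")
```

Under those assumptions an uncorrected receiver is in trouble after a couple of thousand bits, which is a tiny fraction of a second at this line rate.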

Even if the Tx and Rx clocks are running at precisely the same frequency, they would still need to be aligned so that their phases match each other. You can see a diagram below showing the clocks running at the same frequency but out of phase with each other, again leading to mis-sampling of the received signal.

Phase locked loop

“He’s got ….. lock on us! – He’s engaging me. Goddamn it. Mustang, this bogey’s all over me”

Instead of trying to guess the precise frequency and phase of the transmitter, why not derive it from the incoming signal? This is clock and data recovery (CDR). The CDR function built into a receiver has its own oscillator and uses a clever component called a phase locked loop (PLL) to calibrate the oscillator and match it to the phase and frequency of the incoming signal. Within the PLL there is a ‘phase comparator’ which compares the phase of the received signal against the phase of the local oscillator. The difference between the two signals creates an output signal which drives the voltage controlled oscillator (VCO). This VCO control input adjusts the oscillator to better match the incoming signal, and the oscillator’s output is fed back into the phase comparator to start the process again.
I know, I’m losing you, but we’re done with the crazy electronics speak. The feedback circuit eventually locks the local oscillator onto the phase and frequency of the incoming signal. That’s it. Once your PLL clock is ‘locked’, the receiver can correctly time its sampling to recover the transmitted data. The transmitter’s clock can vary its frequency over time and temperature though, so the PLL has to maintain that lock by continuously monitoring the received signal.
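For the curious, the feedback loop can be sketched as a toy simulation. This is a phase-domain model, not a circuit: phases are measured in clock cycles, one loop step is one nominal bit period, and the loop gains `kp` and `ki` are values I picked so the loop settles, not figures from any real part. The loop filter here is a simple proportional-plus-integral stage, which is one common choice rather than the only one.

```python
def simulate_pll(freq_offset, phase_offset, kp=0.1, ki=0.01, steps=2000):
    """Toy phase-domain PLL. Returns the phase error seen at every step.

    freq_offset  -- how much faster the incoming signal runs (cycles per step)
    phase_offset -- initial phase of the incoming signal (cycles)
    """
    in_phase = phase_offset   # phase of the incoming signal
    vco_phase = 0.0           # phase of the local oscillator
    integrator = 0.0          # integral term of the loop filter
    errors = []
    for _ in range(steps):
        # Phase comparator: phase difference wrapped to [-0.5, 0.5) cycles.
        error = (in_phase - vco_phase + 0.5) % 1.0 - 0.5
        errors.append(error)
        # Loop filter turns the error into a VCO control value.
        integrator += ki * error
        control = kp * error + integrator
        # VCO: nominally one cycle per step, nudged by the control value.
        vco_phase += 1.0 + control
        # The incoming signal advances at its own, slightly different rate.
        in_phase += 1.0 + freq_offset
    return errors


# Incoming signal is 200 ppm fast and starts a quarter of a cycle ahead.
errors = simulate_pll(freq_offset=200e-6, phase_offset=0.25)
print("initial phase error:", errors[0])
print("final phase error:  ", errors[-1])
```

The phase error starts at a quarter of a cycle and decays towards zero, which is the ‘lock’ described above. The integral term is what lets the loop absorb a constant frequency offset rather than just a phase offset.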

Transitions and line codes

I have hand-waved over an important element of clock and data recovery – level transitions.  A level transition is a change from a one value to a zero value, or from a zero to a one. In the examples above the bit stream was an idealistic square wave, 1-0-1-0-1-0. In the real world, if there are no level transitions in a received data stream (e.g. a stream of all ones or all zeros) then the PLL loses its ability to detect the clock speed of the transmitter. This is called ‘loss-of-lock’. The ‘run-length’ of a PLL is a measure of how long it can tolerate a lack of transitions before it loses lock.
The transmitted signal needs to be modified in some way to help the receiver recover and maintain its clock lock. 10Base-T Ethernet used Manchester encoding, which guaranteed a level transition during every bit time. Unfortunately it did this by doubling the clock frequency on the line, using two clock pulses for every data bit transmitted. That’s a 100% line-coding overhead.
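Here is a small sketch of Manchester encoding to show both the guaranteed transitions and the cost. I’ve assumed the IEEE 802.3 convention (a one is sent as low-then-high, a zero as high-then-low); the function names are just for illustration.

```python
def manchester_encode(bits):
    """Encode each data bit as two line symbols.

    Assumed convention: 1 -> (0, 1), 0 -> (1, 0).
    """
    out = []
    for b in bits:
        out.extend((0, 1) if b else (1, 0))
    return out


def longest_run(symbols):
    """Length of the longest stretch of identical consecutive symbols."""
    best = run = 0
    prev = None
    for s in symbols:
        run = run + 1 if s == prev else 1
        best = max(best, run)
        prev = s
    return best


data = [1] * 16                      # worst case for an unencoded line
encoded = manchester_encode(data)

print("raw longest run:    ", longest_run(data))
print("encoded longest run:", longest_run(encoded))
print("symbols per data bit:", len(encoded) // len(data))
```

Whatever data you feed in, the encoded stream never holds the same level for more than two symbol times, but you pay for it with twice as many symbols on the wire.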
In 10GBase-R the 64b/66b line coding addresses this problem. The two overhead bits (or preambles, formally called the ‘sync header’) can only have values 0-1 or 1-0. The 0-1 preamble means that the following 64-bit block is pure data, and a 1-0 preamble means that the block contains some control information. We’ll explore the signaling later, but note that the 0-1 and 1-0 preamble patterns ‘guarantee’ a level transition in every 66-bit block on the line. So we can say that 10GBase-R has a maximum ‘run-length’ of 66 bits: 64 identical payload bits plus one matching preamble bit on either side. The line-coding overhead is 3.125%, which is more palatable than the Manchester encoding overhead.
Furthermore, the 64-bit block is ‘scrambled’ to encourage further level transitions. This isn’t done solely for clock recovery, and it only makes it highly likely, not certain, that you’ll have level transitions in your data stream, so it’s good to have the forced transitions from the 2-bit preambles.
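A quick sketch makes the guaranteed transition concrete. It frames two deliberately nasty, unscrambled payloads and measures the longest run of identical bits on the line. The helper names are mine, and I’ve assumed the two header bits go on the wire ahead of the payload in the order written above.

```python
from itertools import groupby

DATA_HEADER = (0, 1)      # following 64 bits are pure data
CONTROL_HEADER = (1, 0)   # block contains some control information


def frame_block(payload, is_control=False):
    """Prefix a 64-bit payload with its 2-bit header, giving 66 line bits."""
    if len(payload) != 64:
        raise ValueError("payload must be exactly 64 bits")
    header = CONTROL_HEADER if is_control else DATA_HEADER
    return list(header) + list(payload)


def longest_run(bits):
    """Length of the longest stretch of identical consecutive bits."""
    return max((len(list(group)) for _, group in groupby(bits)), default=0)


# An all-ones data block followed by a control block: the ones run from the
# second header bit, through the payload, into the next header's first bit.
stream = frame_block([1] * 64) + frame_block([0] * 64, is_control=True)

print("line bits per block:", len(frame_block([1] * 64)))
print("longest run on the line:", longest_run(stream))
```

Counting the header bits themselves, the longest possible run in this sketch comes out at 66 line bits, after which the header forces a transition.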
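The scrambling step can be sketched too. 10GBase-R uses a self-synchronous scrambler with the polynomial 1 + x^39 + x^58; the bit-at-a-time Python below is only an illustration of that idea (real hardware works on whole blocks in parallel), and the seed values and helper names are made up for the example.

```python
import random

STATE_BITS = 58
MASK = (1 << STATE_BITS) - 1


def scramble(bits, state=0):
    """Each output bit is the input XOR the outputs from 39 and 58 bits ago."""
    out = []
    for b in bits:
        s = b ^ ((state >> 38) & 1) ^ ((state >> 57) & 1)
        out.append(s)
        state = ((state << 1) | s) & MASK   # remember the last 58 line bits
    return out


def descramble(bits, state=0):
    """Inverse operation; its state is rebuilt from the received line bits."""
    out = []
    for s in bits:
        out.append(s ^ ((state >> 38) & 1) ^ ((state >> 57) & 1))
        state = ((state << 1) | s) & MASK
    return out


rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(256)]
seed = MASK                                  # arbitrary non-zero seed

tx = scramble(data, state=seed)
rx_same_seed = descramble(tx, state=seed)
rx_wrong_seed = descramble(tx, state=0)      # receiver starts out of step

print("round trip ok:", rx_same_seed == data)
print("self-synchronised after 58 bits:", rx_wrong_seed[58:] == data[58:])
```

Notice that the descrambler recovers even when it starts with the wrong state, because its state is filled from the received bits themselves. Scrambling makes long runs unlikely rather than impossible, which is why the 2-bit preamble is left unscrambled.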

Where is clock recovery used?

When the received serial 10GBase-R bit stream is passed up to the MAC layer, it has to be converted for the 32-bit parallel XGMII interface. The Serial-in / Parallel-out (SIPO) converter needs to perform CDR here. The recovered data bits are clocked into a 32-bit register at line rate. The recovered clock is divided by 32 and used to clock the 32-bit words onto a bus at a reduced data rate.
Sometimes you want to perform CDR even when you plan to re-transmit the received serial data as another serial data stream. This is a form of re-conditioning and is known as re-timing. Signals transmitted in a channel will be attenuated (reduced in size) and distorted (changed in shape) as they travel, and as discussed in a prior post, the circuit board is a hostile environment. By recovering the clock and data and then regenerating the pulses, you can extend the reach of a signal.
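The serial-to-parallel step can be sketched in a few lines. This is a simplification that only shows the shift register and the divided clock: it ignores 66-bit block alignment and decoding, and it assumes the first bit received ends up as the most significant bit of the word, which may not match real hardware. The names are mine.

```python
WORD_WIDTH = 32
LINE_RATE_HZ = 10_312_500_000   # recovered clock, one tick per line bit


def sipo(bits, width=WORD_WIDTH):
    """Shift serial bits into a register and emit a word every `width` ticks."""
    mask = (1 << width) - 1
    words = []
    register = 0
    for count, bit in enumerate(bits, start=1):
        register = ((register << 1) | bit) & mask   # clocked at line rate
        if count % width == 0:                      # divided clock ticks here
            words.append(register)
    return words


word_clock_hz = LINE_RATE_HZ / WORD_WIDTH

words = sipo([1] * 32 + [0] * 31 + [1])
print([hex(w) for w in words])
print(word_clock_hz / 1e6, "MHz word clock")
```

Dividing the recovered clock by 32 gives a word clock of a few hundred MHz, which is far easier to route around a chip than a 10 GHz serial signal.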
Lastly, you should note that the CDR mechanism usually has a hidden companion called an equalizer or EQ.  The EQ is another part of signal conditioning, boosting the high-frequency components of a received signal, cleaning it up so that the CDR circuit can have a fighting chance of spotting the level transitions. We’ll look at EQ circuits soon.
If you have any questions, corrections or general feedback, please let me know in the comments.