# CMOS Receiver Design for Optical Communications over the Data-Rate of 20 Gb/s

Joseph Chong

Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of

> Doctor of Philosophy in Electrical Engineering

Dong S Ha (Chair), Seongim S Choi, Luke F Lester, Anbo Wang, Yang Yi

May 4th, 2018

Blacksburg, Virginia

Keywords: Analog Integrated Circuit, CMOS, Optical Communication, Transimpedance Amplifier, Clock and Data Recovery

Copyright © 2018, Joseph Chong

#### (ACADEMIC ABSTRACT)

This dissertation investigates circuits that extend the operating data-rate of an optical receiver. A new input-stage topology for a transimpedance amplifier (TIA) is presented that achieves a 50% higher data-rate, and a new architecture for clock recovery is proposed for a 50% higher clock rate. The TIA is based on a $g_m$-boosted common-gate amplifier. The input resistance is reduced by modifying a transistor at the input stage to be diode-connected, which lowers the R-C time constant at the input and yields a higher input pole frequency. It also allows removal of the input inductor, which reduces design complexity. The proposed circuit was designed and fabricated in 32 nm CMOS SOI technology. Compared to TIAs that mostly operate at 50 GHz bandwidth or lower, the presented TIA stage achieves a bandwidth of 74 GHz and a gain of 37 dB$\Omega$ while dissipating 16.5 mW from a 1.5 V supply. For the clock recovery circuit, a phase-locked loop is designed consisting of a frequency doubling mechanism, a mixer-based phase detector, and a 40 GHz voltage-controlled oscillator. The proposed frequency doubling mechanism is an all-analog architecture instead of the conventional digital XOR gate approach. This approach realizes a clock rate of 40 GHz, which is at least 50% higher than other circuits with a mixer-based phase detector. Implemented in 0.13-$\mu$m CMOS technology, the clock recovery circuit presents a peak-to-peak clock jitter of 2.38 ps while consuming 112 mW from a 1.8 V supply.

#### (GENERAL AUDIENCE ABSTRACT)

This dissertation presents two electronic circuits for future high-speed fiber optics applications. A receiver in an optical communication system includes several circuit blocks serving various functions: (1) a photodiode for detecting the input signal; (2) a transimpedance amplifier (TIA) to amplify the input signal; (3) a clock and data recovery block to re-condition the input signal; and (4) digital signal processing. High-speed integrated circuits are commonly fabricated in SiGe or other high-electron-mobility semiconductor technologies, but silicon-based receiver circuits using complementary metal oxide semiconductor (CMOS) technology have gained attention in the open literature because CMOS can integrate signal processing on the same chip. This dissertation shows a TIA circuit and a clock recovery circuit designed and implemented in CMOS technology. The TIA circuit is based on a "$g_m$-boosted common-gate amplifier" topology, and a slight modification at the input of the topology is proposed. Implemented in 32 nm SOI CMOS technology, the TIA achieves a measured bandwidth sufficient for 100 Gb/s operation. The bandwidth is increased by at least 48% compared with state-of-the-art CMOS TIAs. The clock recovery circuit is a phase-locked loop with a mixer as the phase detector. An architectural change that replaces the conventional frequency doubling mechanism is proposed. The circuit is implemented in 0.13 $\mu$m CMOS technology, and it achieves a 40 GHz clock rate with a 40 Gb/s data input, which is about a 40% increase in clock rate compared to state-of-the-art clock recovery circuits of similar architecture.

# Dedication

To my wife Erin, thank you for all the suffering you went through and for all the support you have given.

Thanks to Jesus who is the real joy, peace and hope to me while working on this dissertation.

# Acknowledgments

Thanks to Prof Dong S Ha, for being a guide to academic success.

Thanks to all MICS group friends, for the friendship and for all the insightful discussions. To Dongseok, Farooq, Hyunchul, Jebreel, Jihoon, Laya, Lisa, Michael, Reza, Ross, Shinwoong, Yahya, it is really nice to know you all, and really nice to have worked together.

This work was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. B0101-15-0024, Terabit optical-circuit-packet converged switching system technology development for the next-generation optical transport network).

# Contents

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 1.1 | Research Motivation and Goals |
| 1.2 | Design Methodology |
| 1.3 | Organization of Dissertation |
| 2 | Preliminaries |
| 2.1 | MOSFETs |
| 2.2 | Basics of Amplifiers |
| 2.2.1 | Common-source amplifier |
| 2.2.2 | Common-gate amplifier |
| 2.2.3 | Cascode amplifier |
| 2.2.4 | Differential amplifier |
| 2.3 | Block Diagram of Fiber Optic Receivers |
| 3 | Transimpedance Amplifier Design |
| 3.1 | Overview of CMOS TIA topologies |
| 3.2 | $G_m$-boosted common-gate amplifier |
| 3.2.1 | Effect of Input Inductance to TIA Bandwidth |
| 3.2.2 | Literature Review – architectural modification to conventional $g_m$-boosted amplifier |
| 3.2.3 | Noise Analysis of conventional $g_m$-boosted amplifier |
| 3.3 | Calculation of gain and noise from measurement |
| 3.4 | Proposed TIA with Diode-Connected Input Stage |
| 3.4.1 | Diode-Connected Input Stage |
| 3.4.2 | Dummy TIA and the Second and Third Stage Buffers |
| 3.4.3 | Noise Analysis |
| 3.4.4 | Limitation of inductor values |
| 3.4.5 | Simulation and Measurement Results |
| 4 | Clock Recovery Circuit Design |
| 4.1 | CDR with Mixer-Based Phase Detector |
| 4.1.1 | Operation Principle of a PLL-Based Clock Recovery Circuit |
| 4.1.2 | Operation Principle of Mixer-Based Phase Detector |
| 4.1.3 | Frequency Doubling Mechanism for MBPD |
| 4.1.4 | Literature Review – MBPD and FDM Implemented in Literatures |
| 4.1.5 | Issue with Current FDM Architecture |
| 4.2 | Proposed Clock Recovery Circuit with Resonator-Based FDM |
| 4.2.1 | Pre-amplifier |
| 4.2.2 | Resonator-based FDM |
| 4.2.3 | Mixer-based Phase Detector |
| 4.2.4 | Voltage-Controlled Oscillator |
| 4.3 | Measurement of the Proposed Circuit |
| 5 | Conclusion |
|   | References |

# List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Cross-section of simplified structure for NMOS and PMOS. |
| 2.2 | An NMOS in (a) Deep N-well technology, and (b) SOI technology. |
| 2.3 | Three terminal symbols representing NMOS and PMOS. |
| 2.4 | (a) Applying DC voltage to NMOS, and (b) DC response conceptually. |
| 2.5 | Parasitic $C_{GS}$ and $C_{GD}$. |
| 2.6 | Small-signal model of NMOS. |
| 2.7 | (a) A common-source amplifier, and (b) its small-signal model. |
| 2.8 | (a) A common-gate amplifier, and (b) its small-signal model. |
| 2.9 | (a) A cascode amplifier, and (b) a differential pair. |
| 2.10 | Block diagram of a typical optical receiver. |
| 3.1 | Commonly adopted CMOS TIA topologies: (a) a common-source amplifier with a feedback resistor, (b) an inverter-based amplifier with a feedback resistor, and (c) a $g_m$-boosted common-gate amplifier. |
| 3.2 | (a) Schematic of a conventional GBCG amplifier, and (b) its small-signal equivalent circuit. |
| 3.3 | Small-signal model with a Norton's equivalent source $i_S$ and $Z_S$. |
| 3.4 | Normalized gain function for a CG amplifier, a GBCG amplifier, a GBCG amplifier with optimum $L_{IN}$, and a GBCG amplifier with a larger $L_{IN}$ causing in-band peaking. |
| 3.5 | Architectural modifications to the GBCG amplifiers by Kim et al. [1]. |
| 3.6 | Architectural modifications to the GBCG amplifiers by Atef and Zimmermann [2]. |
| 3.7 | Architectural modifications to the GBCG amplifiers by Bashiri and Plett [3]. |
| 3.8 | Small-signal model for noise analysis. |
| 3.9 | (a) The proposed modified GBCG TIA, and (b) its small-signal model. |
| 3.10 | Normalized gain function of a conventional GBCG amplifier, a GBCG with diode-connected $M_B$ and $L_X$ of 100 pH, and one with $L_X$ of 150 pH. |
| 3.11 | Schematic of the buffers of the proposed circuit with DC blocking capacitors. |
| 3.12 | Low frequency equivalent circuit for noise analysis. |
| 3.13 | (a) Example structure of implemented inductors, and (b) the imaginary part of $Z_L$ for three cases: an EM-simulated result, a calculated $Z_L$ with $L_X$ only, and a calculated $Z_L$ based on Eq. 3.33 with $L_X$ and $C_X$. |
| 3.14 | Frequency response for a CS amplifier: (a) with an R load; (b) with a series R and ideal L load; (c) with a series R and L with SRF. |
| 3.15 | Die photo of the proposed circuit. |
| 3.16 | TIA measurement setup. |
| 3.17 | The raw $Z_T$ measurement data for the frequencies of (a) 1 – 40 GHz, (b) 50 – 70 GHz, and (c) 75 – 100 GHz. |
| 3.18 | Simulation and measurement of $Z_T$. |
| 3.19 | Eye diagram simulation with measured data. |
| 3.20 | Simulation and measurement of input referred noise. |
| 4.1 | Block diagram of a clock recovery circuit with a series RC as loop filter. |
| 4.2 | Time domain response of the PLL where phase step and loop bandwidth are normalized to one. |
| 4.3 | Phase domain model for linear analysis. |
| 4.4 | Frequency response of T(s) showing bandwidth at 1 rad/s and phase margin greater than 60 degrees. |
| 4.5 | Frequency response of JTF showing a low-pass function. |
| 4.6 | Phase detector response and its small signal linearized gain. |
| 4.7 | Conceptual waveform showing relationship between data, clock and FDM output. |
| 4.8 | Block diagram and schematic of the FDM and the mixer presented by Lee and Wu. |
| 4.9 | Block diagram and schematic of the FDM and the mixer presented by Sun et al. |
| 4.10 | AC Gain of a CML buffer compared to a resonator-based buffer in 0.13-$\mu$m CMOS technology. |
| 4.11 | Block diagram of the proposed clock recovery circuit. |
| 4.12 | (a) Schematic of pre-amplifier, and (b) its simulated voltage gain. |
| 4.13 | Schematic of the proposed resonator based FDM. |
| 4.14 | Simulated FDM conversion gain for $f_0$ output and $2f_0$ output. |
| 4.15 | Simulated time-domain FDM output (bottom) compared to an ideal NRZ signal input (top). |
| 4.16 | Simulated time-domain FDM output (bottom) compared to an ideal NRZ signal input (top). |
| 4.17 | Simulated mixer DC voltage output versus phase difference. |
| 4.18 | Simulated PD output current versus phase difference. |
| 4.19 | (a) Time domain $I_{PD}$ output for three different time delays, and (b) time average $I_{PD}$ output versus time delay. |
| 4.20 | Schematic of the proposed VCO. |
| 4.21 | Simulated phase noise of the VCO. |
| 4.22 | Microphoto of the chip. |
| 4.23 | VCO tuning range. |
| 4.24 | Measurement of the output clock signal: (a) spectrum, and (b) phase noise. |
| 4.25 | Phase noise measurement showing 16 MHz of bandwidth. |

# List of Tables

| Table | Title |
|-------|-------|
| 3.1 | Comparisons of Recent TIAs |
| 4.1 | Comparison of Mixer-Based Full-Rate Clock Recovery Circuits |

# Chapter 1

# Introduction

### 1.1 Research Motivation and Goals

A rapid increase in network traffic necessitates increasing the data-rate of backbone optical networks from 40 Gb/s to 100 Gb/s and above. Ethernet products with a 100 Gb/s data-rate, implemented with four parallel 25 Gb/s non-return-to-zero (NRZ) data channels, will soon be mainstream, and 400 Gb/s Ethernet is being developed with four parallel 50 Gb/s four-level pulse-amplitude-modulation (PAM-4) channels [4]. Although receiver front-ends implemented in SiGe BiCMOS technology have the advantage of better speed and noise performance [5,6], CMOS is gaining attention in the literature and is preferred because it can integrate signal processing circuits such as equalization and error correction.

The goal of this research is to study current state-of-the-art CMOS optical receiver circuits, identify the bottleneck that limits the data-rate, and propose a modification to the circuit topology or system architecture to achieve a higher data-rate.

### 1.2 Design Methodology

Integrated circuits (ICs) are implemented to facilitate operation at a higher data-rate by tightly controlling parasitic capacitance and inductance. Circuit models for transistors, capacitors, and resistors, as well as information on interconnects, are obtained from the corresponding foundries. Modifications to an existing circuit topology or system architecture are identified through circuit analysis, and circuit simulation is performed to ensure that the circuit meets the design goals. The resulting circuit prototype is then sent to the corresponding foundry for fabrication. After the fabricated prototype is received, measurements are performed in the laboratory of the Multifunctional Integrated Circuits and Systems (MICS) group.

### 1.3 Organization of Dissertation

This dissertation is organized as follows. Basic CMOS devices and circuits are introduced in Chapter 2. Chapter 3 presents the basic operation principles of a TIA and the work proposed to support 100 Gb/s applications. Chapter 4 shows the operation principles of a clock and data recovery (CDR) circuit, as well as the work proposed to increase the operating clock frequency. Finally, Chapter 5 concludes the dissertation.

# Chapter 2

# Preliminaries

### 2.1 MOSFETs

Metal-oxide-semiconductor field-effect transistors (MOSFETs, also termed 'MOS') are the basic elements of the circuits described in this work [7]. Fig. 2.1 shows a simplified cross section of an *n*-type MOS (NMOS) and a *p*-type MOS (PMOS) fabricated on a lightly doped p-substrate. An NMOS consists of two heavily doped n regions (n+ regions) for the source (S) and drain (D), a thin layer of silicon dioxide (the slash-lined region), and conductive polysilicon for the gate (G). A PMOS is a similar device except that it is fabricated in an n-well and its source and drain are p regions. The voltage of the bulk p-substrate (B) of an NMOS is applied through a heavily doped p region (p+), and an n+ region is used for the bulk of a PMOS. A modified NMOS structure that includes a deep n-well is used in this work, as shown in Fig. 2.2a. With a deep n-well, the bulk of the NMOS is isolated from the substrate and can be tied to any voltage, and the NMOS is also isolated from interference through substrate coupling. Another type of NMOS used in this work is built in silicon-on-insulator (SOI) technology, shown in Fig. 2.2b, where a layer of silicon dioxide isolates the n+ regions from the substrate. With the insulator in place, the parasitic capacitance of the p-n junction formed by the substrate and the n+ region is eliminated, resulting in a higher operation frequency.

Assuming that the bulk is always tied to the source and does not affect circuit operation, a MOSFET can be considered a three-terminal device and represented with the symbols shown in Fig. 2.3.

Fig. 2.1: Cross-section of simplified structure for NMOS and PMOS.

Fig. 2.2: An NMOS in (a) Deep N-well technology, and (b) SOI technology.

Fig. 2.3: Three terminal symbols representing NMOS and PMOS.

First consider applying DC voltages to a MOSFET. When a voltage $V_{GS}$ is applied to the gate of an NMOS while the source and bulk are grounded, a channel is formed connecting the source and drain, and the current $I_{DS}$ flows from drain to source through the channel (Fig. 2.4a). Ideally, when the drain-to-source voltage $V_{DS}$ is small, the current is linearly proportional to $V_{GS}$ (Fig. 2.4b), given that $V_{GS}$ is greater than the threshold voltage $V_{TH}$. When $V_{DS}$ is larger than $V_{GS}-V_{TH}$, the NMOS goes into "saturation", and $I_{DS}$ is proportional to the square of $(V_{GS}-V_{TH})$ (Fig. 2.4b),

$$I_{DS} = \frac{1}{2} \mu_n C_{ox} \frac{W}{L} (V_{GS} - V_{TH})^2 (1 + \lambda V_{DS}), \qquad (2.1)$$

where  $\mu_n$  is the mobility of charge carriers,  $C_{ox}$  is the gate oxide capacitance per unit area, W and L are the width and length of gate polysilicon, and  $\lambda$  is a parameter to model channel-length modulation.
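As a rough numerical illustration of Eq. 2.1, the following Python sketch evaluates the square-law current at two gate voltages. The function name `drain_current` and every parameter value (threshold voltage, $\mu_n C_{ox}$, $W/L$, $\lambda$) are assumptions chosen for illustration only; they are not parameters of the processes used in this work.

```python
def drain_current(v_gs, v_ds, v_th=0.4, k_n=200e-6, w_over_l=10.0, lam=0.1):
    """Saturation drain current from the square-law model of Eq. 2.1.

    k_n stands for mu_n * C_ox in A/V^2. All default values are
    illustrative assumptions, not foundry data.
    """
    v_ov = v_gs - v_th  # overdrive voltage V_GS - V_TH
    if v_ov <= 0:
        raise ValueError("device is off: V_GS must exceed V_TH")
    if v_ds < v_ov:
        raise ValueError("not in saturation: need V_DS >= V_GS - V_TH")
    return 0.5 * k_n * w_over_l * v_ov**2 * (1.0 + lam * v_ds)


# Doubling the overdrive voltage (0.2 V -> 0.4 V) at a fixed V_DS
i_1 = drain_current(v_gs=0.6, v_ds=1.0)
i_2 = drain_current(v_gs=0.8, v_ds=1.0)
print(f"I_DS at 0.6 V: {i_1 * 1e6:.1f} uA")
print(f"I_DS at 0.8 V: {i_2 * 1e6:.1f} uA")
print(f"ratio: {i_2 / i_1:.2f}")
```

With these assumed values the currents are about 44 µA and 176 µA, so doubling the overdrive voltage quadruples the current, as the square-law dependence predicts.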

Fig. 2.4: (a) Applying DC voltage to NMOS, and (b) DC response conceptually.

Fig. 2.5: Parasitic $C_{GS}$ and $C_{GD}$.

After appropriate bias voltages $V_{GS}$ and $V_{DS}$ are set so that an NMOS operates in the saturation region, a small signal can be applied on top of $V_{GS}$. The small signal causes a perturbation in the output $I_{DS}$, and the ratio between the two is the transconductance of the NMOS,

$$\begin{aligned}
g_m = \frac{\partial I_{DS}}{\partial V_{GS}} &= \mu_n C_{ox} \frac{W}{L} (V_{GS} - V_{TH}) &\qquad (2.2) \\
&= \sqrt{2\mu_n C_{ox} \frac{W}{L} I_{DS}}. &\qquad (2.3)
\end{aligned}$$

If a small-signal $V_{DS}$ is applied while $V_{GS}$ stays constant, $I_{DS}$ changes in proportion to $V_{DS}$, which is the effect of an output resistance $r_o$, given by

$$r_o = \frac{1}{\partial I_{DS} / \partial V_{DS}} \approx \frac{1}{\lambda I_{DS}}. \qquad (2.4)$$
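The small-signal parameters can be checked numerically against the large-signal model. The sketch below differentiates Eq. 2.1 by central differences and compares the result with the closed forms of Eqs. 2.2 to 2.4. The device values and the helper names (`i_ds`, `central_diff`) are assumptions made for this illustration.

```python
import math

# Illustrative device parameters (assumptions, not foundry data)
K_N = 200e-6      # mu_n * C_ox in A/V^2
W_OVER_L = 10.0   # aspect ratio W/L
V_TH = 0.4        # threshold voltage in V
LAMBDA = 0.1      # channel-length modulation parameter in 1/V


def i_ds(v_gs, v_ds):
    """Square-law saturation current of Eq. 2.1."""
    return 0.5 * K_N * W_OVER_L * (v_gs - V_TH) ** 2 * (1.0 + LAMBDA * v_ds)


def central_diff(f, x, h=1e-6):
    """Numerical derivative of f at x using a central difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


v_gs, v_ds = 0.7, 1.0          # assumed bias point in saturation
i_bias = i_ds(v_gs, v_ds)

# Transconductance: numerical derivative versus Eqs. 2.2 and 2.3
gm_numeric = central_diff(lambda v: i_ds(v, v_ds), v_gs)
gm_eq22 = K_N * W_OVER_L * (v_gs - V_TH)
gm_eq23 = math.sqrt(2.0 * K_N * W_OVER_L * i_bias)

# Output resistance: numerical derivative versus Eq. 2.4
ro_numeric = 1.0 / central_diff(lambda v: i_ds(v_gs, v), v_ds)
ro_eq24 = 1.0 / (LAMBDA * i_bias)

print(f"g_m: numeric {gm_numeric:.3e}, Eq. 2.2 {gm_eq22:.3e}, Eq. 2.3 {gm_eq23:.3e} S")
print(f"r_o: numeric {ro_numeric:.3e}, Eq. 2.4 {ro_eq24:.3e} ohm")
```

The closed forms differ from the numerical derivatives only by factors of $(1 + \lambda V_{DS})$ or its square root, about 10% at this assumed bias point. This suggests that Eqs. 2.2 and 2.3 hold when channel-length modulation is neglected, and it is why Eq. 2.4 is written as an approximation.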

Due to the overlap of the conductive gate with the conductive channel, a parasitic gate-source capacitance $C_{GS}$ and a gate-drain capacitance $C_{GD}$ are present and affect high-frequency signals (Fig. 2.5). There are other parasitic capacitances as well, such as the source-to-drain capacitance and the capacitance of the reverse-biased p-n junction from bulk to source or drain. However, those capacitances are much smaller than $C_{GS}$ and $C_{GD}$, and they can be ignored in a first-order analysis.

From here onward, unless otherwise specified, a MOSFET is operating in the saturation region and only the small-signal relation between $V_{GS}$ and $I_{DS}$ is considered. The equivalent circuit shown in Fig. 2.6 can be used for detailed circuit analysis.

Fig. 2.6: Small-signal model of NMOS.

### 2.2 Basics of Amplifiers

Using MOSFETs and resistors, one can create a circuit that amplifies an input AC signal. This section first introduces the basic voltage amplifier topologies: the common-source amplifier, the common-gate amplifier, the cascode amplifier, and the differential amplifier. When the input of the amplifier is a current source and the output is taken as a voltage, the amplifier is called a 'transimpedance amplifier'. This section also briefly describes two types of transimpedance amplifier: a common-gate based amplifier and a shunt-feedback amplifier.

#### 2.2.1 Common-source amplifier

A common-source (CS) amplifier can be obtained by connecting an NMOS with a resistor as in the schematic shown in Fig. 2.7a, where $C_{out}$ is the parasitic capacitance of the MOSFET of the next stage. The small-signal equivalent circuit is shown in Fig. 2.7b, where the DC voltages are treated as equivalent to ground. Assuming $r_o$ of the transistor is much larger than the drain resistor $R_D$, the gain of the amplifier can be obtained as

$$A_{v} = \frac{v_{out}}{v_{in}} = -g_{m}R_{D} \cdot \frac{1}{1 + sR_{D}C_{out}}.$$
(2.5)

The parasitic capacitance seen at the input node is the combination of $C_{GS}$ with the Miller-effect equivalent of $C_{GD}$, which is

$$C_{in} = C_{GS} + C_{GD}(1 + |A_v|).$$
(2.6)

#### 2.2.2 Common-gate amplifier

Fig. 2.7: (a) A common-source amplifier, and (b) its small-signal model.

Fig. 2.8: (a) A common-gate amplifier, and (b) its small-signal model.

On the other hand, a common-gate (CG) amplifier is implemented with a current source $I_B$ and a drain resistor $R_D$ as shown in Fig. 2.8a. The small-signal equivalent circuit is shown in Fig. 2.8b, where the DC voltages are equivalent to ground and the DC current source is equivalent to an open circuit. Assuming $r_o$ of the transistor is much larger than the drain resistor $R_D$, the gain of the amplifier can be obtained as

$$A_{v} = \frac{v_{out}}{v_{in}} = g_{m}R_{D} \cdot \frac{1}{1 + sR_{D}C_{out}}.$$
(2.7)

The input resistance of the CG amplifier is

$$R_{in} = \frac{v_{in}}{i_{in}} = \frac{1}{g_m}.$$
(2.8)

The parasitic capacitance seen at the input node is the  $C_{GS}$  of the NMOS.

#### 2.2.3 Cascode amplifier

In order to reduce the input capacitance caused by the Miller effect in a CS amplifier, a cascode architecture may be employed. As shown in Fig. 2.9a, the cascode amplifier consists of two stacked NMOS transistors, $M_1$ and $M_2$, and a drain resistor $R_D$. The input is applied to $M_1$ while $M_2$ acts as a CG amplifier. Assuming that both NMOS transistors have identical $g_m$, the voltage at node X is approximately equal to $-v_{in}$, and therefore the input capacitance is

$$C_{in} = C_{GS} + 2C_{GD},$$
 (2.9)



Fig. 2.9: (a) A cascode amplifier, and (b) a differential pair.

which is smaller than that of a CS amplifier with a gain magnitude higher than 1. The cascode architecture also increases the output impedance, resulting in

$$r'_{o} = r_{o1} + r_{o2} + g_{m2}r_{o1}r_{o2}.$$
(2.10)
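The benefit of the cascode can be illustrated numerically. The following sketch evaluates the input capacitance of a CS stage (Eq. 2.6, with the Miller factor written in terms of the gain magnitude), the cascode input capacitance (Eq. 2.9), and the cascode output resistance (Eq. 2.10). All component values are hypothetical and the function names are introduced here only for illustration.

```python
def cs_input_capacitance(c_gs, c_gd, gain_magnitude):
    """Eq. 2.6, with the Miller multiplication written in terms of |A_v|."""
    return c_gs + c_gd * (1 + gain_magnitude)


def cascode_input_capacitance(c_gs, c_gd):
    """Eq. 2.9: the gain from the input to node X is about -1."""
    return c_gs + 2 * c_gd


def cascode_output_resistance(r_o1, r_o2, g_m2):
    """Eq. 2.10."""
    return r_o1 + r_o2 + g_m2 * r_o1 * r_o2


# Hypothetical values: 20 fF, 5 fF, |A_v| = 4, r_o = 2 kOhm, g_m2 = 20 mS.
c_cs = cs_input_capacitance(20e-15, 5e-15, 4.0)
c_cascode = cascode_input_capacitance(20e-15, 5e-15)
r_out = cascode_output_resistance(2e3, 2e3, 20e-3)

print(f"CS input capacitance:      {c_cs * 1e15:.1f} fF")
print(f"Cascode input capacitance: {c_cascode * 1e15:.1f} fF")
print(f"Cascode output resistance: {r_out / 1e3:.1f} kOhm")
```

For any gain magnitude above 1 the cascode presents the smaller input capacitance, which is the motivation given in the text.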

#### 2.2.4 Differential amplifier

Single-ended signalling is susceptible to noise, such as supply noise and ground noise, and therefore a differential signalling scheme is desirable. The schematic of a differential amplifier is shown in Fig. 2.9b, where the NMOS pair $M_1$ and $M_2$ operates in a push-pull manner. From Eq. 2.1, the large-signal current difference of $M_1$ and $M_2$ is

$$I_{D1} - I_{D2} = \frac{1}{2} \mu_n C_{ox} \frac{W}{L} (v_{in+} - v_{in-}) \sqrt{\frac{4I_B}{\mu_n C_{ox} \frac{W}{L}} - (v_{in+} - v_{in-})^2}.$$
(2.11)

The small-signal transconductance when both  $v_{in+}$  and  $v_{in-}$  nodes have the same DC voltage is then

$$G_m = \frac{\partial I_D}{\partial v_{in}} = \sqrt{\mu_n C_{ox} \frac{W}{L} I_B},\tag{2.12}$$

which is identical to Eq. 2.3 when each transistor carries $I_{DS} = I_B/2$.
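The step from Eq. 2.11 to Eq. 2.12 can be checked numerically by differentiating the large-signal current difference around zero differential input. This is a minimal sketch; `k` stands for $\mu_n C_{ox} W/L$, and both `k` and `i_b` are hypothetical values.

```python
import math


def diff_pair_current(dv, i_b, k):
    """Eq. 2.11: I_D1 - I_D2 for a differential input dv, with k = mu_n*C_ox*W/L.

    Only meaningful while both transistors conduct, i.e. for small |dv|.
    """
    return 0.5 * k * dv * math.sqrt(4 * i_b / k - dv ** 2)


k = 10e-3     # hypothetical mu_n*C_ox*W/L in A/V^2
i_b = 1e-3    # hypothetical tail current in A
step = 1e-6   # small differential voltage for the central difference

gm_numeric = (diff_pair_current(step, i_b, k)
              - diff_pair_current(-step, i_b, k)) / (2 * step)
gm_formula = math.sqrt(k * i_b)  # Eq. 2.12

print(f"numerical slope: {gm_numeric * 1e3:.4f} mS")
print(f"Eq. 2.12:        {gm_formula * 1e3:.4f} mS")
```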

### 2.3 Block Diagram of Fiber Optic Receivers

A typical receiver for optical communication is composed of a photodiode, a transimpedance amplifier (TIA), a limiting amplifier (LA), and a clock-and-data recovery (CDR) circuit, as shown in Fig. 2.10 [8]. The received optical data is converted to an electrical signal by the photodiode and then amplified by the TIA.

Fig. 2.10: Block diagram of a typical optical receiver.

The TIA is a low-noise amplifier that converts the input current into an output voltage. The LA follows the TIA to maximize the voltage swing and thereby increase the error margin for the CDR. The CDR circuit then recovers a clean clock from the input data and re-samples the data with that clock to produce a clean data output.

## Chapter 3

# **Transimpedance Amplifier Design**

### 3.1 Overview of CMOS TIA topologies

CMOS TIA topologies commonly adopted for higher speed receivers include a "common-source amplifier with a feedback resistor", an "inverter-based amplifier with a feedback resistor", and a $g_m$-boosted common-gate amplifier (GBCG). The three topologies are shown in Fig. 3.1, where the feedback resistor and the drain resistor are denoted $R_f$ and $R_D$, and the input current $i_{IN}$ is amplified to produce the output voltage $v_{OUT}$. Kim and Buckwalter investigated a TIA in 0.13 $\mu$m CMOS technology based on the "common-source amplifier with a feedback resistor" [9]. The work analyzed the inductors used for bandwidth enhancement, both at the input and between stages, and concluded that the inductor values need to be obtained empirically to optimize the group delay variation. With a total of three gain stages, the overall circuit achieves 29 GHz bandwidth and 50 dB$\Omega$ gain. Ding et al. presented a TIA with a similar topology that incorporated a tunable peaking stage [10]. Implemented in 65 nm CMOS technology, the TIA yields 40 GHz bandwidth and 55 dB$\Omega$ gain. For a differential input, the TIA can also be implemented with a differential pair with a feedback resistor [11]. Chou et al. employed a differential-pair input stage, two post-amplifier stages with nested feedback, and a resistor feeding back the output of the third gain stage to the input [11]. The nested feedback introduced a zero to the transfer function, and the TIA achieved 35 GHz of bandwidth with 54 dB$\Omega$ gain in a 65 nm CMOS technology.

Kim and Buckwalter also investigated a TIA adopting the "inverter-based amplifier with a feedback resistor" topology in 45 nm SOI CMOS technology [12]. With the higher $f_T$ of the highly scaled technology, the parasitic capacitance is lower and the push-pull amplifier can achieve the desired 40 Gb/s operation. The TIA shows 55 dB$\Omega$ gain over 30 GHz. Park and Oh presented a TIA with the same topology in a 65 nm CMOS technology. Combining an inverter-based input stage with a four-stage limiting amplifier, the TIA measured a total gain of 79 dB$\Omega$ and a bandwidth of 29.6 GHz [13].

The input impedance and the gain of both the "common-source amplifier with a feedback resistor" and the "inverter-based amplifier with a feedback resistor" topologies depend on $R_f$, which leads to a direct trade-off between gain and bandwidth. A lower $R_f$ gives a lower RC time constant at the input node and hence a higher bandwidth, at the cost of a lower gain. In contrast, a GBCG amplifier has a low input resistance set by the transconductance ($g_m$) of the transistor, which allows the input resistance to be reduced without directly trading it against the gain.

Fig. 3.1: Commonly adopted CMOS TIA topologies: (a) a common-source amplifier with a feedback resistor, (b) an inverter-based amplifier with a feedback resistor, and (c) a $g_m$-boosted common-gate amplifier.

A GBCG amplifier is a common-gate amplifier with a feedforward auxiliary amplifier that boosts the equivalent $g_m$, and thus it has a lower input impedance than a plain common-gate amplifier. Bashiri and Plett implemented a GBCG amplifier in 65 nm CMOS technology [3]. The work modified the biasing current path to the auxiliary amplifier to improve the input impedance, and it shows 46.7 dB$\Omega$ gain with 26 GHz bandwidth. Chen et al. presented a receiver with a GBCG amplifier input stage in 65 nm CMOS [14]. Incorporating the differential TIA, post-amplifiers and a CDR, the receiver demonstrated data rates of 38 Gb/s to 43 Gb/s.

Other less common TIA topologies include a "common-source amplifier with input matching", a "common-gate amplifier with feedback resistor", and a "hybrid of common-source and common-gate". Jin and Hsu used a common-source amplifier with an input inductance matching network in 0.18 $\mu$m CMOS technology [15]. With three inductors forming a $\pi$-type network at each node, up to three times bandwidth improvement is shown, and the TIA achieved 51 dB$\Omega$ gain and 30.5 GHz bandwidth. Liao and Liu employed a common-gate amplifier with a feedback resistor in 90 nm CMOS [16]. A network of two inter-stage inductors is analyzed, and more than three times bandwidth enhancement can be achieved with the fourth-order network. With a variable gain amplifier included, the circuit is capable of 40 Gb/s operation with an overall gain of 2 k$\Omega$. Kim et al. presented a common-gate amplifier paired with a common-source amplifier, designed in 65 nm CMOS [17]. At the input stage, the PD current is split between the inter-connected common-gate and common-source amplifiers with feedback, and a differential pair then combines both outputs into a differential signal. The TIA achieves 52 dB$\Omega$ gain and 50 GHz bandwidth.

This work presents a GBCG-based CMOS TIA intended for high-speed, short-distance communications between network servers, with a target data rate of 100 Gbps. The key idea of the proposed TIA is a diode-connected bias stage, which lowers the equivalent input resistance compared to a conventional GBCG. Since the pole at the input node is the dominant pole, the lower input resistance is crucial for obtaining a higher bandwidth. The TIA is fabricated in 32 nm CMOS SOI technology, and the measurement results indicate that it achieves a gain of 37 dB$\Omega$ and a 3-dB bandwidth of 74 GHz, enabling a data rate of 100 Gbps. The proposed TIA increases the bandwidth by approximately 50% or more compared with existing TIAs, including [9] and [14], but its gain is lower than that of those TIAs. The lower gain is not a major issue for the target applications, such as short-distance communications between network servers.

Fig. 3.2: (a) Schematic of a conventional GBCG amplifier, and (b) its small-signal equivalent circuit.

### **3.2** $G_m$ -boosted common-gate amplifier

A  $g_m$ -boosted common-gate (GBCG) amplifier shown in Fig. 3.2a consists of a CG amplifier  $M_1$  and  $R_D$  biased with a current source  $M_B$ , and a feedforward auxiliary amplifier (XA)  $M_X$  and  $R_X$ . The PD is modeled as an input current  $i_{IN}$  and a capacitance  $C_{PD}$ .

A small-signal model of the GBCG amplifier is shown in Fig. 3.2b for the analysis of the transimpedance gain ($Z_T$) and the frequency response. The model ignores the output resistance $r_o$ of the transistors, as they are much larger than the resistors $R_X$ and $R_D$, and it also ignores all capacitances other than $C_{GS}$ and $C_{PD}$, assuming that they play a minor role in the frequency response. Applying Kirchhoff's current law (KCL) at the drain node of $M_X$ gives

$$v_{GS1} = -v_{IN} \cdot \frac{1 + A_X}{1 + s/\omega_X},$$
(3.1)

where

$$A_X = g_{mX} R_X, \text{and} \tag{3.2}$$

$$\omega_X = 1/R_X C_{GS1} \tag{3.3}$$

are the gain and the pole frequency of the XA. The voltage gain of the GBCG amplifier can then be obtained by applying KCL at the output node, resulting in

$$\frac{v_{OUT}}{v_{IN}} = G_m R_D \cdot \frac{1}{1 + s/\omega_X},\tag{3.4}$$

where

$$G_m = g_{m1}(1 + A_X) \tag{3.5}$$

means that the equivalent $g_m$ is boosted by a factor of $(1 + A_X)$ with this configuration. Combining Eq. 3.1 and Eq. 3.4 and applying KCL at the input node, the resulting input impedance is

$$Z_{IN} = \frac{v_{IN}}{i_{IN}} = \frac{1}{G_m} \cdot \frac{1 + s/\omega_X}{1 + (1 + \eta)\frac{s}{\omega_i} + \frac{s^2}{\omega_X \omega_i}},$$
(3.6)

where the input pole  $\omega_i$  is

$$\omega_i = G_m / (C_{PD} + C_{GS2}) = G_m / C_{IN}, \tag{3.7}$$

and $\eta$ is the ratio of the Miller capacitance of $M_1$ to the input capacitance,

$$\eta = \frac{C_{GS1}}{C_{IN}} (1 + A_X) \tag{3.8}$$

The  $Z_T$  can be obtained with Eq. 3.4 and Eq. 3.6 to be

$$Z_T = \frac{v_{OUT}}{i_{IN}} = R_D \cdot \frac{1}{1 + (1 + \eta)\frac{s}{\omega_i} + \frac{s^2}{\omega_X \omega_i}}.$$
(3.9)

This shows that the inclusion of an XA results in a two-pole transfer function, and both poles need to be considered when designing for high-frequency applications. For completeness, one can also include the pole at the output node, which results in

$$Z_T = R_D \cdot \frac{1}{1 + \frac{s}{\omega_{OUT}}} \cdot \frac{1}{1 + (1 + \eta)\frac{s}{\omega_i} + \frac{s^2}{\omega_X \omega_i}}.$$
(3.10)

where  $\omega_{OUT} = 1/R_D C_L$ . In most cases,  $C_{IN}$  is much larger than  $C_L$ , and therefore more efforts should be made to increase input pole frequency.
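The two-pole response of Eq. 3.9 can be evaluated directly from the circuit parameters. In the sketch below, $g_m$ = 26.5 mS, $R_D$ = 100 $\Omega$, and $C_{PD}$ = 50 fF follow the values quoted in Sec. 3.4.5, while `g_mx`, `r_x`, `c_gs1`, and the 15 fF added to $C_{PD}$ to form `c_in` are assumptions made only for illustration; the output pole of Eq. 3.10 is left out.

```python
import math


def gbcg_zt(f, r_d, g_m1, g_mx, r_x, c_gs1, c_in):
    """Transimpedance of a conventional GBCG stage at frequency f (Eq. 3.9)."""
    s = 1j * 2 * math.pi * f
    a_x = g_mx * r_x                # Eq. 3.2
    w_x = 1 / (r_x * c_gs1)         # Eq. 3.3
    g_m = g_m1 * (1 + a_x)          # Eq. 3.5
    w_i = g_m / c_in                # Eq. 3.7
    eta = c_gs1 / c_in * (1 + a_x)  # Eq. 3.8
    return r_d / (1 + (1 + eta) * s / w_i + s ** 2 / (w_x * w_i))


# g_m1, r_d and the 50 fF photodiode part of c_in follow Sec. 3.4.5;
# the remaining values are hypothetical.
params = dict(r_d=100.0, g_m1=26.5e-3, g_mx=26.5e-3, r_x=100.0,
              c_gs1=15e-15, c_in=65e-15)

for f_ghz in (1, 20, 50, 100, 200):
    magnitude = abs(gbcg_zt(f_ghz * 1e9, **params))
    print(f"{f_ghz:>4} GHz: {20 * math.log10(magnitude):5.1f} dBohm")
```

Sweeping `c_in` or `g_mx` in this function shows how both $\omega_i$ and $\omega_X$ shape the roll-off, which is the point made by the two-pole expression.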

#### 3.2.1 Effect of Input Inductance to TIA Bandwidth

High-speed CMOS TIAs often employ an off-chip PD fabricated in III-V technologies for a lower $C_{PD}$ and a higher photodiode bandwidth [18, 19]. The bond-wire results in a parasitic inductance between the TIA and the PD [20]. The resulting $Z_T$ with an input inductor added is derived as follows. Using a Norton equivalent model for the PD and the bond-wire, an equivalent small-signal circuit can be obtained, as shown in Fig. 3.3, where

$$i_S = \frac{i_{IN}}{s^2 C_{PD} L_{IN} + 1}, Z_S = \frac{1}{s C_{PD}} + s L_{IN}.$$
(3.11)



Fig. 3.3: Small-signal model with a Norton's equivalent source  $i_S$  and  $Z_S$ .



Fig. 3.4: Normalized gain function for a CG amplifier, a GBCG amplifier, a GBCG amplifier with optimum  $L_{IN}$ , and a GBCG amplifier with a larger  $L_{IN}$  causing in-band peaking.

The resulting  $Z_T$  is obtained as

$$Z_T = R_D \cdot \frac{1}{1 + \frac{s^2}{\omega_{LC}^2}} \cdot \frac{1}{1 + (1 + \eta)\frac{s}{\omega_i'} + \frac{s^2}{\omega_X \omega_i'}},$$
(3.12)

where

$$\omega_{LC} = 1/\sqrt{L_{IN}C_{PD}},\tag{3.13}$$

$$\omega_i' = G_m / C_{IN}', \text{and} \tag{3.14}$$

$$C_{IN}' = C_{GS2} + \frac{C_{PD}}{1 + s^2/\omega_{LC}^2}.$$
(3.15)

It can be observed that the inductance provides a beneficial bandwidth enhancement for $\omega_i$, and therefore some works include an additional on-chip inductor for bandwidth enhancement [3]. However, it also yields an undesirable peaking at $\omega_{LC}$, so the inductance should be kept small enough that the peaking does not occur within the signal bandwidth.
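The location of the peaking follows directly from Eq. 3.13. As a rough illustration, the sketch below computes the resonance frequency for two input inductances, assuming a photodiode capacitance of 50 fF (the value emulated on chip in Sec. 3.4.5); the inductance values are used here only as examples.

```python
import math


def lc_resonance_hz(l_in, c_pd):
    """Eq. 3.13 expressed in Hz: f_LC = 1 / (2*pi*sqrt(L_IN * C_PD))."""
    return 1 / (2 * math.pi * math.sqrt(l_in * c_pd))


c_pd = 50e-15  # assumed photodiode capacitance in F

f_small = lc_resonance_hz(20e-12, c_pd)  # about 159 GHz
f_large = lc_resonance_hz(60e-12, c_pd)  # about 92 GHz

print(f"L_IN = 20 pH: f_LC = {f_small / 1e9:.0f} GHz")
print(f"L_IN = 60 pH: f_LC = {f_large / 1e9:.0f} GHz")
```

Tripling the inductance lowers the resonance by a factor of $\sqrt{3}$, which moves the peaking toward the signal band.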

An example plot of $Z_T$ is shown in Fig. 3.4 with values obtained from 32 nm SOI CMOS. The gain is normalized to $R_D$, and the frequency is normalized so that a CG amplifier has its 3-dB frequency at 1 rad/s, as shown in the figure. A GBCG amplifier extends the bandwidth by two times, and with an input inductor of 20 pH, it is possible to improve the bandwidth by another 15%. However, increasing the input inductor to 60 pH results in a peaking within the bandwidth. Focusing on only the poles at the input node and the XA drain node, without the input inductor, Eq. 3.12 can be re-written as

$$Z_T = R_D \cdot \frac{1}{1 + \frac{s}{\omega_0 Q_0} + \frac{s^2}{\omega_0^2}},$$
(3.16)

where

$$\omega_0 = \sqrt{\omega_X \omega_i}, \text{and} \tag{3.17}$$

$$Q_0 = \frac{1}{1+\eta} \sqrt{\frac{\omega_i}{\omega_X}}.$$
(3.18)

From these expressions, the parameter that dominantly limits the bandwidth can be determined. By requiring $Q_0 \leq 1/\sqrt{2}$ and finding the frequency at which $|Z_T| = R_D/\sqrt{2}$, the bandwidth limit can be derived as

$$BW \le \frac{1}{2\pi} \sqrt{\omega_X \omega_i} \approx \frac{1}{2\pi} \sqrt{\frac{g_{m1} g_{mX}}{C_{IN} C_{GS1}}},\tag{3.19}$$

which is a function of the capacitance of the photodiode and the  $f_T$  of the transistors. This shows that a wider bandwidth can be achieved by optimizing both the input pole  $\omega_i$  and the XA's  $\omega_X$ . The frequency of the poles can be increased by a larger bias current of transistors  $M_1$  or  $M_X$ , but it will cause a larger voltage drop on the drain resistor and result in a lower voltage headroom.
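The bound in Eq. 3.19 can be illustrated with the second-order form of Eq. 3.16. For a maximally flat response with $Q_0 = 1/\sqrt{2}$, the magnitude at $\omega_0$ is $1/\sqrt{2}$ of the low-frequency value, so the 3-dB bandwidth equals $\omega_0/2\pi$. The pole frequencies and $\eta$ below are hypothetical and were picked only so that $Q_0$ lands on $1/\sqrt{2}$.

```python
import math


def second_order_params(w_x, w_i, eta):
    """Return (w_0, Q_0) from Eq. 3.17 and Eq. 3.18."""
    w_0 = math.sqrt(w_x * w_i)
    q_0 = math.sqrt(w_i / w_x) / (1 + eta)
    return w_0, q_0


def zt_normalized(w, w_0, q_0):
    """Eq. 3.16 divided by R_D, evaluated at s = j*w."""
    s = 1j * w
    return 1 / (1 + s / (w_0 * q_0) + (s / w_0) ** 2)


# Hypothetical values: w_i = 2*w_x and eta = 1 give Q_0 = 1/sqrt(2).
w_x = 2 * math.pi * 100e9
w_i = 2 * w_x
w_0, q_0 = second_order_params(w_x, w_i, eta=1.0)

print(f"Q_0 = {q_0:.4f}")
print(f"f_0 = {w_0 / (2 * math.pi) / 1e9:.1f} GHz")
print(f"|Z_T(j*w_0)| / R_D = {abs(zt_normalized(w_0, w_0, q_0)):.4f}")
```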

#### 3.2.2 Literature Review – architectural modification to conventional $g_m$ -boosted amplifier

A conventional GBCG amplifier may achieve the desired bandwidth with a lowered input impedance and an input inductor [14]. However, several works have also proposed modifications to the circuit topology to further lower the equivalent input resistance [1–3]. The work in [1], shown in Fig. 3.5, includes an additional $g_m$-boosting path. With the additional amplifiers consisting of $M_Y$ and $M_Z$, the input current due to $v_{IN}$ is increased by $g_{mX}g_{mY}g_{mZ}R_XR_Yv_{IN}$, resulting in a lower input resistance at DC. With a 650 fF capacitance at the input, the TIA designed in 180 nm CMOS technology achieves a 65% bandwidth increase, resulting in 4.98 GHz bandwidth with 56.7 dB$\Omega$ gain. However, the work only considered pushing $\omega_i$ to a higher frequency, and did not consider the effect of a lowered $\omega_X$ due to the additional capacitance at the drain of $M_X$.

A work by Atef and Zimmermann [2], shown in Fig. 3.6, uses an inverter consisting of $M_{XN}$ and $M_{XP}$ as the XA, and replaces the resistor at the output node with a PMOS active load $M_2$. The inverter provides a larger $A_X$ with lower power consumption than a common-source amplifier, thereby providing a much lower input resistance and a higher $\omega_i$. The PMOS active load allows the bias current to be increased without sacrificing voltage headroom, potentially resulting in a larger $g_{m1}$. The work is implemented in 40 nm CMOS technology, and with a 450 fF capacitance at the input, it achieved 8 GHz bandwidth with 46 dB$\Omega$ gain. However, due to a lowered $\omega_X$, this architecture may not be suitable for higher data-rate applications.



Fig. 3.5: Architectural modifications to the GBCG amplifiers by Kim et al. [1].



Fig. 3.6: Architectural modifications to the GBCG amplifiers by Atef and Zimmermann [2].

The work by Bashiri and Plett, shown in Fig. 3.7, included a series $L_f - R_f$ path from the drain of $M_1$ to its gate [3]. The bias current of $M_1$ flows through $R_X - R_f - L_f$, which allows the current to be increased while keeping the voltage across $R_D$ constant, yielding a higher $g_{m1}$ without sacrificing voltage headroom. The inductor $L_f$ introduces a zero in the gain function, which helps to improve the bandwidth. The TIA is designed in a 65 nm CMOS technology, and realized 40 Gb/s operation with 26.1 GHz bandwidth and 46.7 dB$\Omega$ gain, where the input capacitance is 200 fF. However, since $R_X$ is responsible for providing the bias current of both $M_1$ and $M_X$ in this architecture, there is a trade-off between $A_X$ and $g_{m1}$, which limits the reduction of the input resistance. $R_f$ also appears in parallel with $R_D$ and reduces the transimpedance gain.

In order to yield a higher input pole frequency for 100 Gb/s application, this work proposes an architecture modification that increases  $\omega_i$  frequency while keeping  $\omega_X$  optimized. The circuit is presented in Sec. 3.4.

#### 3.2.3 Noise Analysis of conventional $g_m$ -boosted amplifier

The noise performance of a TIA is crucial in determining the sensitivity of a receiver to the input signal. A small-signal model of a conventional GBCG amplifier is shown in Fig. 3.8 for noise analysis, where the thermal noise of resistors  $R_D$  and  $R_X$ , and the channel noise of transistors  $M_1$ ,  $M_X$  and  $M_B$  are included.



Fig. 3.7: Architectural modifications to the GBCG amplifiers by Bashiri and Plett [3].



Fig. 3.8: Small-signal model for noise analysis.

This analysis serves to identify the major noise contributors and to compare the noise of different architectures. Each noise source is considered independent of the others, and therefore the total noise at the output is the superposition of the contributions of all sources, which is

$$\begin{aligned}\overline{v_{n,OUT}^2} ={}& \overline{i_{n,MB}^2} \cdot Z_T^2 + \overline{i_{n,RD}^2} \cdot R_D^2 + \left[\frac{R_D}{1 + g_{m1}r_{oB}(1 + g_{mX}R_X)}\right]^2 \overline{i_{n,M1}^2} \\ &+ \left[\frac{g_{m1}R_DR_X}{1 + g_{m1}r_{oB}(1 + g_{mX}R_X)}\right]^2 \left(\overline{i_{n,RX}^2} + \overline{i_{n,MX}^2}\right),\end{aligned}$$
(3.20)

where

$$Z_T = R_D \cdot \frac{g_{m1} r_{oB} (1 + g_{mX} R_X)}{1 + g_{m1} r_{oB} (1 + g_{mX} R_X)}.$$
(3.21)

In order to compare the circuit-generated noise with the input current from the photodiode, an input-referred noise can be obtained by dividing the total noise at the output by $Z_T$. The equivalent noise current density at the input is

$$\overline{i_{n,IN}^2} \approx \overline{i_{n,MB}^2} + \overline{i_{n,RD}^2} + (\overline{i_{n,RX}^2} + \overline{i_{n,MX}^2}) \cdot \left(\frac{R_X}{r_{oB}} \cdot \frac{1}{1 + A_X}\right)^2.$$
(3.22)

The major contribution to the noise can now be determined as the current source  $M_B$  and the resistor  $R_D$ , and a lower noise can be obtained with higher  $r_{oB}$  and  $A_X$ .
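To see how the terms of Eq. 3.22 compare in magnitude, the sketch below evaluates them for one set of values. It assumes the usual thermal-noise models, $4kT/R$ for a resistor and $4kT\gamma_n g_m$ for the MOSFET channel, which are not spelled out in this section; the excess-noise factor and all component values are hypothetical.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def resistor_noise(r, temp=300.0):
    """Thermal noise current density of a resistor, 4kT/R, in A^2/Hz (assumed model)."""
    return 4 * K_B * temp / r


def channel_noise(g_m, excess_factor=1.0, temp=300.0):
    """MOSFET channel noise current density, 4kT*gamma_n*g_m, in A^2/Hz (assumed model)."""
    return 4 * K_B * temp * excess_factor * g_m


def gbcg_input_noise_terms(g_mb, g_mx, r_d, r_x, r_ob):
    """Return the (dominant, auxiliary) parts of Eq. 3.22 in A^2/Hz."""
    a_x = g_mx * r_x
    dominant = channel_noise(g_mb) + resistor_noise(r_d)
    auxiliary = ((resistor_noise(r_x) + channel_noise(g_mx))
                 * (r_x / r_ob / (1 + a_x)) ** 2)
    return dominant, auxiliary


# Hypothetical values for illustration only.
dominant, auxiliary = gbcg_input_noise_terms(
    g_mb=10e-3, g_mx=26.5e-3, r_d=100.0, r_x=100.0, r_ob=1e3)

total = dominant + auxiliary
print(f"input-referred noise: {total ** 0.5 * 1e12:.1f} pA/sqrt(Hz)")
print(f"auxiliary-amplifier share: {auxiliary / total:.2%}")
```

With these assumed values the auxiliary-amplifier terms contribute well under one percent, in line with the observation that $M_B$ and $R_D$ dominate.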

### 3.3 Calculation of gain and noise from measurement

A network analyzer is often employed to measure the frequency-domain performance of a TIA. The $Z_T$ is then obtained by converting the measured S-matrix into a Z-matrix and assuming that $S_{12}$ and $S_{22}$ are negligible, which gives [16,21]

$$Z_T = Z_0 \cdot \frac{S_{21}}{1 - S_{11}},\tag{3.23}$$

where $Z_0$ is 50 $\Omega$, the impedance of the measurement system. The measured noise figure is converted to input-referred noise with the equation [22]

$$\overline{i_n^2} = (F-1)N_i/Z_{11}, \tag{3.24}$$

where  $N_i$  is the noise from the source present at the input node,

$$N_i = \left(\frac{Z_{11}}{Z_{11} + 50}\right)^2 \cdot 4kT.$$
(3.25)
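The two conversions above are straightforward to script. The following is a minimal sketch of Eq. 3.23 to Eq. 3.25; the function names, the use of a real-valued $Z_{11}$, and the 290 K reference temperature are assumptions made for illustration, and the sample S-parameters are not measured data.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def zt_from_s_params(s11, s21, z0=50.0):
    """Eq. 3.23: transimpedance from (possibly complex) S11 and S21."""
    return z0 * s21 / (1 - s11)


def input_noise_from_nf(noise_factor, z11, temp=290.0, z0=50.0):
    """Eq. 3.24 with N_i from Eq. 3.25; returns A^2/Hz for a real z11 in ohms."""
    n_i = (z11 / (z11 + z0)) ** 2 * 4 * K_B * temp
    return (noise_factor - 1) * n_i / z11


# Hypothetical sample point, not taken from the measurements.
z_t = zt_from_s_params(s11=0.0, s21=0.4)
print(f"Z_T = {20 * math.log10(abs(z_t)):.1f} dBohm")

noise_factor = 10 ** (15.0 / 10)  # a hypothetical 15 dB noise figure as a linear factor
i_n_sq = input_noise_from_nf(noise_factor, z11=50.0)
print(f"i_n = {math.sqrt(i_n_sq) * 1e12:.1f} pA/sqrt(Hz)")
```

A noise figure given in dB has to be converted to a linear noise factor $F$ before it is used in Eq. 3.24, as done for `noise_factor` above.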

### 3.4 Proposed TIA with Diode-Connected Input Stage

#### 3.4.1 Diode-Connected Input Stage

To improve the frequency response of a GBCG amplifier, an architectural modification to the TIA is proposed, as shown in Fig. 3.9a. As in a GBCG amplifier, $M_1$ and $R_D$ form a CG amplifier, and the auxiliary amplifier consists of $M_X$, $M_Y$, and $R_X$. A cascode architecture is implemented to reduce the Miller effect of $C_{GDX}$ at the input. The transistor $M_B$ is diode-connected to further lower the input impedance and enhance $\omega_i$. A small-signal model is presented in Fig. 3.9b, where the diode-connected $M_B$ is represented as an equivalent resistor $1/g_{mB}$, and the resulting $Z_T$ is

$$Z_T = \frac{R_D}{\gamma} \cdot \frac{1}{1 + \left(1 + \eta + \frac{g_{mB}}{C_{IN}} \cdot \frac{1}{\omega_X}\right) \frac{s}{\omega_i^{\prime\prime}} + \frac{s^2}{\omega_X \omega_i^{\prime\prime}}}$$
(3.26)

where

$$\omega_i'' = \frac{G_m}{C_{IN}''} \cdot \gamma, \tag{3.27}$$

$$C_{IN}^{\prime\prime} = C_{PD} + C_{GSX}, \text{and} \tag{3.28}$$

$$\gamma = \frac{G_m + g_{mB}}{G_m}. \tag{3.29}$$

By diode-connecting $M_B$, the input pole frequency is modified by an increase of the total input capacitance and by a multiplication factor of $\gamma$. A gain penalty by a factor of $\gamma$ is also incurred, but the authors have proposed a technique in [23] to recover the gain by feeding the output of the auxiliary amplifier to the subsequent stage.
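The trade-off introduced by $\gamma$ can be made concrete with a short calculation. The sketch below evaluates Eq. 3.29, the low-frequency gain $R_D/\gamma$ from Eq. 3.26, and the input pole of Eq. 3.27. Only $g_{m1}$ = 26.5 mS and $R_D$ = 100 $\Omega$ follow values quoted in Sec. 3.4.5; `a_x`, `g_mb`, and `c_in` are hypothetical, so the printed numbers are illustrative rather than a reproduction of the design.

```python
import math


def gamma_factor(g_m1, a_x, g_mb):
    """Eq. 3.29 with G_m = g_m1 * (1 + A_X) from Eq. 3.5."""
    g_m = g_m1 * (1 + a_x)
    return (g_m + g_mb) / g_m


def dc_gain_dbohm(r_d, gamma):
    """Low-frequency transimpedance R_D / gamma (Eq. 3.26) in dBohm."""
    return 20 * math.log10(r_d / gamma)


def input_pole_hz(g_m1, a_x, g_mb, c_in):
    """Eq. 3.27: omega_i'' = G_m * gamma / C_IN'' = (G_m + g_mB) / C_IN'', in Hz."""
    g_m = g_m1 * (1 + a_x)
    return (g_m + g_mb) / c_in / (2 * math.pi)


g_m1, r_d = 26.5e-3, 100.0              # values quoted in Sec. 3.4.5
a_x, g_mb, c_in = 1.5, 26.5e-3, 65e-15  # hypothetical

gamma = gamma_factor(g_m1, a_x, g_mb)
print(f"gamma = {gamma:.2f}")
print(f"DC gain = {dc_gain_dbohm(r_d, gamma):.1f} dBohm")
print(f"input pole = {input_pole_hz(g_m1, a_x, g_mb, c_in) / 1e9:.0f} GHz")
```

The same factor $\gamma$ that raises the input pole lowers the gain, so a larger $g_{mB}$ trades gain for bandwidth.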



Fig. 3.9: (a) The proposed modified GBCG TIA, and (b) its small-signal model.



Fig. 3.10: Normalized gain function of a conventional GBCG amplifier, a GBCG with diode-connected  $M_B$  and  $L_X$  of 100pH, and one with  $L_X$  of 150pH.

A shunt-peaking inductor $L_X$ is also included to boost $\omega_X$, which introduces an additional zero and pole into Eq. 3.1, resulting in

$$v_{GS1} = -v_{IN} \cdot (1 + A_X) \frac{1 + sg_{mX}L_X/(1 + A_X)}{1 + sC_{GS1}R_X + s^2C_{GS1}L_X}.$$
(3.30)

With the diode-connected $M_B$ and the peaking inductor $L_X$, $\omega_X$ can be increased by about 40% compared to a conventional GBCG amplifier (Fig. 3.10). As with the input inductor, $L_X$ needs to be carefully optimized to avoid gain peaking within the signal bandwidth.

Fig. 3.11: Schematic of the buffers of the proposed circuit with DC blocking capacitors.

For a complete analysis, the gain function can be multiplied by the transfer function due to the inclusion of a series inductor $L_D$, which is a third-order function given as [9]

$$1/\left[1+\left(\frac{s}{\omega_{out}}\right)+m_2 n_2 \left(\frac{s}{\omega_{out}}\right)^2+m_2 n_2 (1-n_2) \left(\frac{s}{\omega_{out}}\right)^3\right],\tag{3.31}$$

where  $n_2 = C_2/(C_1 + C_2)$ ,  $L_D = m_2 R_D^2(C_1 + C_2)$ , and  $\omega_{out} = 1/R_D(C_1 + C_2)$  are the parameters that determine the frequency response of the output node.
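The effect of the series inductor can be explored by evaluating Eq. 3.31 on a normalized frequency axis. In this sketch the frequency is normalized to $\omega_{out}$, and the values of `m2` and `n2` are hypothetical; setting `m2` to zero removes the inductor and recovers the single output pole.

```python
import math


def series_peaking_response(w_norm, m2, n2):
    """Eq. 3.31 evaluated at s = j*w_norm, with w_norm = w / w_out."""
    s = 1j * w_norm
    return 1 / (1 + s + m2 * n2 * s ** 2 + m2 * n2 * (1 - n2) * s ** 3)


# Hypothetical inductor scaling factors, for illustration only.
for m2 in (0.0, 0.3, 0.6):
    magnitude = abs(series_peaking_response(1.0, m2, n2=0.5))
    print(f"m2 = {m2:.1f}: gain at w_out = {20 * math.log10(magnitude):+.2f} dB")
```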

#### 3.4.2 Dummy TIA and the Second and Third Stage Buffers

The entire amplifier includes a dummy TIA and two stages of buffers. The dummy TIA is a GBCG amplifier identical to the main TIA, and it acts as a pseudo-differential counterpart that compensates the supply and ground current of the main TIA, resulting in supply and ground noise cancellation [14]. A further improvement on noise can also be achieved if a differential photodiode such as one reported in [24] is integrated.

The outputs of the TIA and the dummy TIA are applied to two stages of buffers. The buffer stages are implemented as differential pairs with shunt-peaking inductors (Fig. 3.11). In actual receivers, these buffers drive a capacitive load (the clock and data recovery circuit) and can achieve a higher voltage gain with larger resistors. However, the proposed circuit drives a 50 $\Omega$ measurement system, and because of the lower load resistance, voltage gain can only be achieved with a large transistor and a large output current. In order to maintain the bandwidth of the buffers, a voltage loss is deliberately allowed.

For measurement with a network analyzer, the circuit also includes input and output DC-blocking capacitors of 8 pF ($C_{B1}$-$C_{B3}$). One of the differential output nodes is terminated with a 50 $\Omega$ resistor $R_{TERM}$ for single-ended S-parameter measurement.

#### 3.4.3 Noise Analysis

A low-frequency equivalent schematic of the TIA stage is shown in Fig. 3.12, obtained by replacing the inductors with short circuits. The thermal noise of the resistors and transistors is included to perform the noise analysis, whose purpose is to identify the major noise contributors and to compare the noise of different architectures.

Fig. 3.12: Low frequency equivalent circuit for noise analysis.

The input-referred noise is obtained as

$$\overline{i_{n,in}^2} \approx \overline{i_{n,MB}^2} + \overline{i_{n,RD}^2} \cdot \gamma^2 + \overline{i_{n,M1}^2} \cdot \left(\frac{g_{mB}}{g_{m1}(1+A_X)}\right)^2 + (\overline{i_{n,RX}^2} + \overline{i_{n,MX}^2}) \cdot \left(\frac{g_{mB}R_X}{1+A_X}\right)^2. \tag{3.32}$$

Compared to a conventional GBCG TIA, the noise of the proposed design is higher in two respects. First, the noise from $M_1$ cannot be ignored because of the lower source resistance $1/g_{mB}$. Second, the noise from $R_D$ is larger by a factor of $\gamma$ because of the reduced $Z_T$. To obtain a similar signal-to-noise ratio with higher noise, a higher input power level is required, which implies that the proposed design may be limited to short-distance communications such as those within a data center.

#### 3.4.4 Limitation of inductor values

Inductors are implemented as integrated metal loops, as shown in Fig. 3.13a. Iterative EM simulations are performed to find the inductor size for optimal performance. The distance between a metal loop and the ground plane is maximized for a lower parasitic capacitance. However, parasitic capacitance also occurs between the metal loop and the semiconductor substrate. To simplify the analysis, consider an inductor model consisting of only a parallel inductor and capacitor, for which the equivalent impedance is

$$Z_L = sL_X \cdot \frac{1}{1 + s^2 L_X C_X}.$$
(3.33)

Fig. 3.13b plots the imaginary part of $Z_L$ for three cases: an EM-simulated result of $L_X$ implemented for the XA, a calculated $Z_L$ with the inductor only, and a calculated $Z_L$ based on Eq. 3.33. The calculated result of Eq. 3.33 matches the EM-simulated result closely, and a peaking can be observed at the self-resonance frequency (SRF) $\omega_{SRF} = 1/\sqrt{L_X C_X}$ due to the presence of the parasitic capacitance. A larger inductance value comes with a larger parasitic capacitance, resulting in a lower resonance frequency.
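The behaviour of the simplified inductor model can be reproduced with a few lines of code. The sketch below evaluates Eq. 3.33 below and above the self-resonance, where the resonance of the parallel L-C model is taken as $1/\sqrt{L_X C_X}$. The inductance of 100 pH matches one of the $L_X$ values in Fig. 3.10, while the parasitic capacitance is a hypothetical value rather than an extracted one.

```python
import math


def inductor_impedance(f, l_x, c_x):
    """Eq. 3.33: inductor L_X in parallel with parasitic capacitance C_X."""
    s = 1j * 2 * math.pi * f
    return s * l_x / (1 + s ** 2 * l_x * c_x)


def self_resonance_hz(l_x, c_x):
    """Self-resonance frequency of the parallel L-C model in Hz."""
    return 1 / (2 * math.pi * math.sqrt(l_x * c_x))


l_x = 100e-12  # 100 pH
c_x = 70e-15   # hypothetical parasitic capacitance

f_srf = self_resonance_hz(l_x, c_x)
print(f"SRF = {f_srf / 1e9:.1f} GHz")

for ratio in (0.25, 0.5, 0.9, 2.0):
    f = ratio * f_srf
    ideal = 2 * math.pi * f * l_x
    actual = inductor_impedance(f, l_x, c_x).imag
    print(f"f = {ratio:.2f} x SRF: Im(Z_L) = {actual:8.1f} ohm (ideal {ideal:6.1f} ohm)")
```

Below the SRF the effective reactance exceeds that of the ideal inductor and grows sharply near resonance; above the SRF the reactance turns capacitive.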

The SRF becomes a limitation on inductors for high-frequency applications. The impact on circuit design can be observed with a simple CS amplifier with a series $R_X - L_X$ load, such as in the case of the XA. The calculated frequency response plotted in Fig. 3.14 shows a CS amplifier with an $R_X$ load having a 3-dB bandwidth of approximately 80 GHz. With an ideal inductor $L_X$, its bandwidth may be extended to 140 GHz. After taking the SRF into account, however, a peaking occurs at 60 GHz and results in an undesirable frequency response. Therefore, for wideband amplifiers operating at 70 GHz and above, bandwidth-enhancement inductors should be implemented with smaller values such that the SRF is not in the vicinity of the desired bandwidth.

Fig. 3.13: (a) Example structure of implemented inductors, and (b) the imaginary part of $Z_L$ for three cases: an EM-simulated result, a calculated $Z_L$ with $L_X$ only, and a calculated $Z_L$ based on Eq. 3.33 with $L_X$ and $C_X$.

Fig. 3.14: Frequency response for a CS amplifier: (a) with an R load; (b) with a series R and ideal L load; (c) with a series R and L with SRF.

#### 3.4.5 Simulation and Measurement Results

The proposed circuit is designed in IBM 32 nm CMOS SOI technology. SOI technology benefits high-frequency operation by having lower parasitic drain and source junction capacitances. The transistors are biased at approximately 0.3 mA/$\mu$m, resulting in a $g_m$ of 26.5 mS and an $f_T$ of 380 GHz. The photodiode capacitance $C_{PD}$ is a 50 fF capacitance emulated on chip, and $R_D$ and $R_A$ are both 100 $\Omega$. The circuit consumes 45 mA from a 1.5 V supply, of which the TIA and the dummy TIA draw 11 mA each, and the second-stage and third-stage buffers draw 7 mA and 16 mA, respectively. A microphotograph of the chip is shown in Fig. 3.15, and the core layout area is 300 $\mu$m x 350 $\mu$m.

Fig. 3.15: Die photo of the proposed circuit.

Fig. 3.16: TIA measurement setup.
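The current budget quoted above can be cross-checked against the power figures used later in the comparison. The sketch below simply sums the per-block supply currents and multiplies by the 1.5 V supply; the dictionary keys are labels introduced here for illustration.

```python
SUPPLY_V = 1.5  # supply voltage in volts

# Supply currents in mA, as quoted in the text.
currents_ma = {
    "tia": 11,
    "dummy_tia": 11,
    "buffer_stage_2": 7,
    "buffer_stage_3": 16,
}

total_ma = sum(currents_ma.values())       # 45 mA
total_mw = total_ma * SUPPLY_V             # 67.5 mW
tia_mw = currents_ma["tia"] * SUPPLY_V     # 16.5 mW

print(f"total current: {total_ma} mA")
print(f"total power:   {total_mw} mW")
print(f"TIA stage:     {tia_mw} mW")
```

The totals agree with the 67.5 mW overall dissipation and the 16.5 mW attributed to the TIA stage.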

On-chip probing was performed in three different configurations, as shown in Fig. 3.16. Measurements from 1 GHz to 40 GHz were performed directly with a network analyzer, while the 50 GHz to 70 GHz and 75 GHz to 100 GHz measurements were performed with an additional V-band frequency converter and an additional W-band frequency extender, respectively. A single-ended circuit configuration eases the measurement process by eliminating the need for differential to single-ended conversion in each frequency band.

The measured $S_{11}$ and $S_{21}$ results are converted to $Z_T$ with Eq. 3.23. The raw measured $Z_T$ data for frequencies of 1–40 GHz, 50–70 GHz, and 75–100 GHz are shown in Fig. 3.17a, Fig. 3.17b and Fig. 3.17c, respectively. In Fig. 3.17a, it can be seen that the DC-blocking capacitors have a cut-off frequency around 1 GHz. The variation from chip to chip can be as large as 3 dB. The V-band (50–70 GHz) measurement in Fig. 3.17b shows undesirable dips due to the multiple waveguide–coaxial-cable conversions. The connector used for the V-band measurement is limited to around 70 GHz, so the results above 70 GHz in Fig. 3.17b are unusable. The W-band (75–100 GHz) measurement used a different set of instruments that gives better results, and it shows a $\pm$ 3 dB difference from chip to chip between 70–80 GHz, as in Fig. 3.17c.

Fig. 3.17: The raw $Z_T$ measurement data for the frequencies of (a) 1 – 40 GHz, (b) 50 – 70 GHz, and (c) 75 – 100 GHz.

After smoothing and combining the measured results, the overall measured gain is 26 dB$\Omega$ with a bandwidth of 74 GHz, which shows good correlation with the simulation results (Fig. 3.18). The good correlation suggests that the simulated performance of the TIA stage, a gain of 37 dB$\Omega$ with a bandwidth of 74 GHz, is reliable.

Using the measured S-parameter response, an eye diagram is simulated with a 100 Gb/s, 20 mA, PRBS $2^{31} - 1$ coded current input (Fig. 3.19). A square-wave input current with a rise time of 1 ps is used to emulate a source with a wide range of frequency components. The eye diagram shows an eye height of 167.5 mV with a jitter of 4.4 ps. The clear eye opening indicates that the circuit is able to operate at the desired data rate.

The noise figure is measured only up to 40 GHz due to a lack of measurement equipment, and the input-referred noise is calculated from it using Eq. 3.24 (Fig. 3.20). The input-referred noise current density is measured as 155 pA/$\sqrt{Hz}$ at 20 GHz, and it can be extrapolated to obtain an estimate of 181 pA/$\sqrt{Hz}$ at 70 GHz. The integrated noise across the bandwidth is approximately 38.4 $\mu$A,rms and the average across the bandwidth is



Fig. 3.18: Simulation and measurement of  $Z_T$ .



Fig. 3.19: Eye diagram simulation with measured data.

149 pA/$\sqrt{Hz}$. The integrated noise translates to a noise floor of -11.1 dBm with a 0.5 A/W photodiode. This relatively high noise floor is due to both the higher noise density and the integration over a wider bandwidth.
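The conversion from integrated noise current to an optical noise floor can be checked with a short calculation. The sketch below assumes that the noise floor is simply the rms input-referred noise current divided by the photodiode responsivity, expressed in dBm; the function name is illustrative and not part of the original work.

```python
import math

def optical_noise_floor_dbm(i_noise_rms_a, responsivity_a_per_w):
    """Optical power (in dBm) that produces a photocurrent equal to the rms noise current."""
    p_watt = i_noise_rms_a / responsivity_a_per_w   # equivalent optical power in W
    return 10.0 * math.log10(p_watt / 1e-3)          # W -> dBm

# Values quoted in the text: 38.4 uA,rms integrated noise, 0.5 A/W photodiode
floor_dbm = optical_noise_floor_dbm(38.4e-6, 0.5)
print(f"{floor_dbm:.1f} dBm")   # prints "-11.1 dBm"
```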

Table 3.1 summarizes the performance and characteristics of recent state-of-the-art TIAs with measurement results. It is difficult to make a fair comparison of the performance due to differences in processing technology, gain, bandwidth, and power dissipation. Among the TIAs shown in Table 3.1, the proposed TIA has the largest bandwidth of 74 GHz, the TIA of [17] has the next largest bandwidth of 50 GHz, followed by the one in [11] with 42 GHz. Among these three TIAs, the proposed work dissipates a total power of 67.5 mW, while the TIA of [17] dissipates 49 mW and the TIA of [11] dissipates 168 mW. The buffer stage of [3] degrades the bandwidth by 25% and maintains its gain, while the buffer stage of this work maintains the bandwidth and degrades the gain. The total gain of the proposed circuit is 26 dB$\Omega$, while that of the other two TIAs is about 50 dB$\Omega$. The gain of the proposed TIA stage is 37 dB$\Omega$ while dissipating 16.5 mW under a 1.5 V supply voltage.

Recent works on TIAs with data-rates higher than 40 Gb/s are mostly presented in SiGe technology [25–30]. Works in CMOS technology mostly focus on a 25 Gb/s NRZ data-rate [31,32], or achieve data-rates over 50 Gb/s by employing pulse-amplitude modulation (PAM-4) signals [33] or equalization [34].



Fig. 3.20: Simulation and measurement of input referred noise.

| Reference | TCAS<br>2010 [9] | ISCAS<br>2010 [3] | JSSC<br>2012 [12] | ISCAS<br>2012 [11] | MWSCAS<br>2014 [10] | A-SSCC<br>2014 [17] | Photonics<br>2015 [13] | This<br>Work |
|---|---|---|---|---|---|---|---|---|
| Technology | 0.13 $\mu$m<br>CMOS | 65 nm<br>CMOS | 45 nm<br>SOI<br>CMOS | 65 nm<br>CMOS | 65 nm<br>CMOS | 65 nm<br>CMOS | 65 nm<br>CMOS | 32 nm<br>SOI<br>CMOS |
| Total Gain<br>(dB$\Omega$) | 50 (18<sup>*</sup>) | 47 (45<sup>*</sup>) | 55 | 55 | 55 | 52 | 50 | 26 (37<sup>*</sup>) |
| Bandwidth<br>(GHz) | 29 | 26 | 30 | 42 | 40 | 50 | 29.6 | 74 (74<sup>*</sup>) |
| Total Noise<br>($\mu$A,rms) | 8.8 | 4.35 | 3.54 | 3.14 | 2.5 | 5.01 | 9.2 | 38 |
| Power<br>(mW) | 45.7 | 39.9 | 9 | 168 | 107 | 49.2 | 3.8 | 16.5 |
| Core size<br>(mm$^2$) | 0.4 | 0.05 | 0.29 | 0.25 | 0.54 | 0.48 | n/a | 0.09 (0.03<sup>*</sup>) |

Table 3.1: Comparisons of Recent TIAs

\* The gain / bandwidth / core size in parentheses is the value of the TIA excluding the buffer stage.

In summary, the proposed TIA offers higher bandwidth and comparable power dissipation at the cost of low gain and a high noise figure. It is therefore suitable for high-speed, short-distance communications between servers.

## Chapter 4

# **Clock Recovery Circuit Design**

### 4.1 CDR with Mixer-Based Phase Detector

#### 4.1.1 Operation Principle of a PLL-Based Clock Recovery Circuit

The block diagram of a typical PLL used in a clock recovery circuit for a fiber-optic receiver is shown in Fig. 4.1. It consists of a phase detector (PD), a loop filter, and a voltage-controlled oscillator (VCO) [8,35]. The PD takes the input signal (data) and compares its phase with that of the output signal (clock), then outputs a current $I_{PD}$ as a function of the phase difference, $\Delta \phi = \phi_{IN}(t) - \phi_{OUT}(t)$. A control voltage $V_{CTRL}$ is generated by $I_{PD}$ charging and discharging the loop filter (shown as $R_1$ and $C_1$ in Fig. 4.1), and $V_{CTRL}$ adjusts the VCO output until its phase is aligned with the input.

In steady-state operation, the frequency and phase of the input and output are identical $(\phi_{IN} = \phi_{OUT} = \phi_0)$. In this case, $I_{PD}$ is zero on average, and $V_{CTRL}$ is a nominal DC voltage. Assuming that the VCO oscillates at $\omega_0$, its output signal is

$$v_{CLK} = A_0 \cos(\omega_0 t + \phi_0). \tag{4.1}$$

Now assume that a step phase change occurs at the input at t = 0. The non-zero $\Delta \phi$ detected by the PD results in

$$I_{PD} = K_{PD} \Delta \phi, \tag{4.2}$$

where  $K_{PD}$  is the conversion gain of the PD. The  $I_{PD}$  causes  $V_{CTRL}$  to deviate from the nominal voltage with a delta of

$$\Delta V_{CTRL}(t) = R_1 I_{PD} + \frac{1}{C_1} \int_0^t I_{PD} dt \tag{4.3}$$

and this results in a frequency shift,

$$\omega(t) = \omega_0 + K_{VCO} \Delta V_{CTRL}(t), \tag{4.4}$$



Fig. 4.1: Block diagram of a clock recovery circuit with a series RC as loop filter.



Fig. 4.2: Time domain response of the PLL where phase step and loop bandwidth are normalized to one.

where  $K_{VCO}$  is the slope of the VCO tuning function. The change in VCO output frequency can be observed as a phase-shift by an integration,

$$\phi_{OUT} = \phi_0 + \int_0^t (\omega(t) - \omega_0) dt = \phi_0 + K_{VCO} \int_0^t (\Delta V_{CTRL}(t)) dt. \tag{4.5}$$

With this, the output phase gradually shifts until it aligns with the input phase, and the steady state of zero $\Delta \phi$ is reached. The ideal time-domain response of the PLL is plotted in Fig. 4.2, where the bandwidth is normalized to 1 rad/s and the values of $K_{PD}$ and $K_{VCO}$ are normalized to 1 A/rad and 1 Hz/V, respectively. With the step input $\phi_{IN} = u(t)$, $I_{PD}$ is generated, which produces $\Delta V_{CTRL}$ and increases the output frequency. $\phi_{OUT}$ gradually increases to follow $\phi_{IN}$ with a rise-time of approximately 2.3 s, and then the loop stabilizes and both $I_{PD}$ and $\Delta V_{CTRL}$ return to zero.
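The time-domain behavior described by Eqs. 4.2 to 4.5 can be reproduced with a simple numerical integration. The following is a minimal sketch, not the simulation used for Fig. 4.2: it assumes normalized values $K_{PD} = 1$, $K_{VCO} = 1$ (treated here as rad/s per volt), $R_1 = 1$ and $C_1 = 2.5$, which place the loop-filter zero at 0.4 rad/s. These component values and the function name are illustrative assumptions, so the rise time obtained here need not match the plotted one.

```python
def pll_phase_step(k_pd=1.0, k_vco=1.0, r1=1.0, c1=2.5, t_end=30.0, dt=1e-3):
    """Forward-Euler integration of Eqs. 4.2-4.5 for a unit input phase step at t = 0."""
    phi_in = 1.0      # unit step in input phase (rad)
    phi_out = 0.0     # output phase measured relative to phi_0 (rad)
    charge = 0.0      # running integral of I_PD, i.e. the charge stored on C1
    history = []
    for n in range(int(round(t_end / dt))):
        i_pd = k_pd * (phi_in - phi_out)         # Eq. 4.2
        dv_ctrl = r1 * i_pd + charge / c1        # Eq. 4.3
        history.append((n * dt, phi_out, i_pd, dv_ctrl))
        phi_out += k_vco * dv_ctrl * dt          # Eqs. 4.4 and 4.5
        charge += i_pd * dt
    return history

trace = pll_phase_step()
t_final, phi_final, i_final, dv_final = trace[-1]
print(f"phi_out = {phi_final:.4f} rad, I_PD = {i_final:.1e} A, dV_CTRL = {dv_final:.1e} V")
```

With these assumed values the output phase settles to the input step while both $I_{PD}$ and $\Delta V_{CTRL}$ decay back to zero, which is the qualitative behavior described above.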

Although the time-domain approach provides an intuitive explanation of the operation, a Laplace-domain analysis is useful for phase noise and loop stability analysis. The small-signal model shown in Fig. 4.3 is employed to determine the frequency response, where $K_{PD}$, F(s) and $K_{VCO}/s$ denote the PD response, the loop-filter response, and the VCO response with Laplace-domain integration, respectively. The phase transfer function



Fig. 4.3: Phase domain model for linear analysis.



Fig. 4.4: Frequency response of T(s) showing bandwidth at 1 rad/s and phase margin greater than 60 degrees.

of the PLL is [35]

$$\phi_{OUT} = \frac{T(s)}{1 + T(s)}\phi_{IN}, \tag{4.6}$$

where

$$T(s) = K_{PD} \cdot \frac{1 + sR_1C_1}{sC_1} \cdot \frac{K_{VCO}}{s} \tag{4.7}$$

is the loop gain. At low frequencies, the loop generates large gain  $(|T(s)| \gg 1)$  and Eq. 4.6 demonstrates  $\phi_{OUT}$  locking to  $\phi_{IN}$ . The loop bandwidth is defined as the frequency where |T(s)| = 1, which is derived from Eq. 4.7 and obtained as

$$\omega_{BW} \approx K_{PD} K_{VCO} R_1 \tag{4.8}$$

by assuming  $\omega_{BW}R_1C_1 \gg 1$ . A response of T(s) is shown in Fig. 4.4, where the bandwidth is normalized to one and the transmission zero is located at  $0.4\omega_{BW}$  to benefit phase margin.
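The approximation in Eq. 4.8 and the resulting phase margin can be checked numerically. The sketch below evaluates Eq. 4.7 directly, assuming the same normalization as in the text ($K_{PD} K_{VCO} R_1 = 1$ rad/s) and a zero at $0.4\omega_{BW}$; the component values and function names are illustrative assumptions.

```python
import cmath
import math

def loop_gain(w, k_pd, k_vco, r1, c1):
    """Loop gain T(jw) of Eq. 4.7 for a series R1-C1 loop filter."""
    s = 1j * w
    return k_pd * (1 + s * r1 * c1) / (s * c1) * k_vco / s

def unity_gain_frequency(k_pd, k_vco, r1, c1, lo=1e-6, hi=1e6):
    """Bisect in log-frequency for |T(jw)| = 1; |T| decreases monotonically with w."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(loop_gain(mid, k_pd, k_vco, r1, c1)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

k_pd, k_vco, r1 = 1.0, 1.0, 1.0      # normalized so that K_PD*K_VCO*R1 = 1 rad/s
c1 = 1.0 / (0.4 * r1)                # places the zero 1/(R1*C1) at 0.4 rad/s

w_c = unity_gain_frequency(k_pd, k_vco, r1, c1)
phase_margin = 180.0 + math.degrees(cmath.phase(loop_gain(w_c, k_pd, k_vco, r1, c1)))
print(f"approximate bandwidth (Eq. 4.8): {k_pd * k_vco * r1:.3f} rad/s")
print(f"exact unity-gain frequency:      {w_c:.3f} rad/s")
print(f"phase margin:                    {phase_margin:.1f} deg")
```

Under these assumptions the exact crossover lies a few percent above the estimate of Eq. 4.8, and the phase margin exceeds 60 degrees.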

The phase noise at PLL output is a combination of the phase noise of input signal, the voltage noise at  $V_{CTRL}$ , and the VCO phase noise [35]. The noise of input signal and the VCO phase noise seen at the output are

$$\frac{\overline{\phi_{n,OUT}}}{\overline{\phi_{n,IN}}} = \frac{T(s)}{1+T(s)},\tag{4.9}$$



Fig. 4.5: Frequency response of JTF showing a low-pass function.

and

$$\frac{\overline{\phi_{n,OUT}}}{\overline{\phi_{n,VCO}}} = \frac{1}{1+T(s)},\tag{4.10}$$

respectively.

Jitter transfer is an important parameter for a clock recovery circuit, which is defined as the time-domain jitter transferred from data input to clock output. Since jitter can be related to phase noise as shown in the equation

$$\cos(\omega(t+\overline{t_n})) = \cos(\omega t + \overline{\phi_n}), \tag{4.11}$$

the jitter transfer function (JTF) is identical to the transfer function of phase noise from the input to output. The JTF is a low-pass function (Fig. 4.5), and it is required to be within limits specified by an International Telecommunication Union (ITU) standard [36].
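The low-pass shape of the JTF in Eq. 4.9 and the complementary high-pass shaping of the VCO phase noise in Eq. 4.10 can be illustrated numerically. This sketch reuses the normalized loop of Eq. 4.7 with assumed values ($K_{PD} = K_{VCO} = R_1 = 1$, $C_1 = 2.5$); the function names are illustrative.

```python
def loop_gain(w, k_pd=1.0, k_vco=1.0, r1=1.0, c1=2.5):
    """Loop gain T(jw) of Eq. 4.7 with assumed normalized component values."""
    s = 1j * w
    return k_pd * (1 + s * r1 * c1) / (s * c1) * k_vco / s

def jitter_transfer(w):
    """Input-to-output phase (jitter) transfer, Eq. 4.9."""
    t = loop_gain(w)
    return t / (1 + t)

def vco_noise_transfer(w):
    """VCO phase noise to output transfer, Eq. 4.10."""
    return 1 / (1 + loop_gain(w))

for w in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"w = {w:7.2f} rad/s  |JTF| = {abs(jitter_transfer(w)):.4f}  "
          f"|VCO noise TF| = {abs(vco_noise_transfer(w)):.4f}")
```

Well below the loop bandwidth the input jitter passes to the output almost unchanged while the VCO noise is suppressed; well above it the situation is reversed.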

#### 4.1.2 Operation Principle of Mixer-Based Phase Detector

One method of phase detection is to implement a mixer-based phase detector (MBPD), which consists of an analog mixer and a voltage-to-current converter (V/I-converter). The analog mixer generates a voltage output that depends on the phase difference of two inputs of equal frequency, and the V/I-converter then produces an output current from the mixer voltage.

Consider two input voltages applied to an ideal mixer, $v_1 = A_1 \sin(\omega_1 t + \phi_1)$ and $v_2 = A_2 \cos(\omega_2 t + \phi_2)$. The mixer multiplies the two inputs to produce $v_{mix} = G_{mix}v_1v_2$, where $G_{mix}$ is the mixer's conversion gain. The output voltage contains components at the sum of $\omega_1$ and $\omega_2$ and at their difference. Since the PLL



Fig. 4.6: Phase detector response and its small signal linearized gain.

is a low-pass function, the higher frequency (the sum) is filtered out, and the lower frequency output is

$$v_{mix} = G_{mix} \frac{A_1 A_2}{2} \sin(\Delta \omega \cdot t + \Delta \phi). \tag{4.12}$$

When the two input frequencies are identical,  $v_{mix}$  is a voltage as a function of  $\Delta \phi$ . For a very small  $\Delta \phi$ , Eq. 4.12 can be simplified with a first-order Taylor's series as

$$v_{mix} = G_{mix} \frac{A_1 A_2}{2} \Delta \phi. \tag{4.13}$$

A MOSFET can then be used to convert the output voltage into drain current. In the case of a differential voltage, a differential pair is utilized and the output current is [8]

$$I_{PD} = G_m v_{mix},\tag{4.14}$$

where $G_m = \sqrt{\mu_n C_{ox}(W/L)I_{SS}}$ is the differential transconductance. The phase detector gain follows from combining Eqs. 4.13 and 4.14, which gives

$$K_{PD} = \frac{I_{PD}}{\Delta\phi} = G_m G_{mix} \frac{A_1 A_2}{2}. \tag{4.15}$$

The $I_{PD}$ output versus $\Delta \phi$ plotted in Fig. 4.6 is a sinusoidal function that crosses the origin. When there is no phase difference, the PLL is at its steady-state operating point (OP) with no $I_{PD}$ output. $K_{PD}$ is the linear approximation at a very small $\Delta \phi$, which is shown as the tangent of the curve at the OP in the figure.
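The sinusoidal phase-detector characteristic and its small-angle linearization can be verified by averaging the mixer product over one period. The sketch below assumes equal input frequencies, $G_{mix} = 1$, and a second input taken in quadrature (a cosine, as for the clock expression used with Eq. 4.19), so that the low-frequency output is proportional to $\sin(\Delta\phi)$. The function name and sample count are illustrative.

```python
import math

def mixer_dc_output(a1, a2, dphi, g_mix=1.0, n=1000):
    """Average of G_mix*v1*v2 over one period when both inputs have the same frequency."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        v1 = a1 * math.sin(theta + dphi)   # input leading the reference by dphi
        v2 = a2 * math.cos(theta)          # quadrature reference (assumption)
        total += g_mix * v1 * v2
    return total / n                       # the sum-frequency term averages to zero

for dphi in (-0.5, 0.0, 0.05, 0.5):
    exact = mixer_dc_output(1.0, 1.0, dphi)
    linear = 0.5 * dphi                    # Eq. 4.13 with G_mix*A1*A2/2 = 0.5
    print(f"dphi = {dphi:+.2f} rad  averaged = {exact:+.5f}  linearized = {linear:+.5f}")
```

The averaged output follows $(A_1 A_2 / 2)\sin(\Delta\phi)$, and the linear approximation is accurate only for small $\Delta\phi$.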

#### 4.1.3 Frequency Doubling Mechanism for MBPD

An MBPD requires both input signals to have identical frequency. Consider non-return-to-zero (NRZ) input data with a full-rate clock signal, where the bit period $T_b$ is equal to one clock cycle and the data is sampled on every rising edge of the clock, as shown in Fig. 4.7. It can be observed that the frequency of the input data is half that of the clock, and therefore a frequency doubling mechanism (FDM) is required before the input is fed to the mixer.

The conventional method to double the frequency of the input data is to perform an XOR with the same



Fig. 4.7: Conceptual waveform showing relationship between data, clock and FDM output.



Fig. 4.8: Block diagram and schematic of the FDM and the mixer presented by Lee and Wu.

signal but delayed by half a bit-period. As shown in Fig. 4.7, the resulting FDM output after the XOR operation has the same frequency as the clock, so the phase difference between the two input signals of the mixer can then be extracted. This process is also known as "edge-detection" since it can be viewed as generating a pulse whenever a rising or falling edge of the input signal is detected [8].
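The delay-and-XOR edge detection can be illustrated with an idealized discrete-time model. This is a minimal sketch assuming ideal logic levels and an exact half-bit delay; the function name and the two-samples-per-bit representation are illustrative choices.

```python
def xor_edge_detect(bits, samples_per_bit=2):
    """XOR an NRZ waveform with a copy of itself delayed by half a bit period."""
    if samples_per_bit < 2 or samples_per_bit % 2:
        raise ValueError("samples_per_bit must be an even number >= 2")
    wave = [b for b in bits for _ in range(samples_per_bit)]   # NRZ waveform
    delay = samples_per_bit // 2                               # half a bit period
    delayed = [wave[0]] * delay + wave[:-delay]                # delayed copy
    return [a ^ b for a, b in zip(wave, delayed)]

data = [0, 1, 1, 0, 1, 0, 0]
pulses = xor_edge_detect(data)
print(pulses)          # one half-bit-wide pulse per data transition

alternating = xor_edge_detect([0, 1, 0, 1, 0, 1])
print(alternating)     # pulse train with a period of one bit, i.e. the clock rate
```

For alternating data the output toggles once per bit period, which is the clock frequency; runs of identical bits produce no pulses.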

#### 4.1.4 Literature Review – MBPD and FDM Implementations in the Literature

The FDM and mixer circuit presented by Lee and Wu is shown in Fig. 4.8, which includes delay cells, an XOR gate, and a mixer [37]. The delay is split into four cells with one-eighth bit-period delay each, to obtain a total delay of half a bit-period. Each delay cell is realized with the gate delay of an inductive-peaked hysteresis buffer. The XOR gate adopts a current-mode logic (CML) circuit architecture, and the mixer is a conventional double-balanced Gilbert cell. The circuit implemented in 90 nm CMOS technology achieves 20 GHz clock recovery with a 20 Gb/s data input. The recovered clock jitter is 4.22 ps,pp, and the recovered data jitter is 7.56 ps,pp, where pp denotes peak-to-peak.

A similar architecture with four delay cells presented by Sun et al. is shown in Fig. 4.9 [38]. The delay cell implemented is a pseudo-differential pair with equalization embedded via tunable source-degeneration. The equalization increases the operation data rate and reduces jitter. Another delay cell between the CLK input and the mixer intends to adjust data sampling points. A modified Gilbert-cell mixer is adopted to eliminate the DC offset due to mismatches, which leads to more accurate phase locking. The circuit realized in 65 nm CMOS technology achieves 28 GHz clock recovery with 28 Gb/s data input. The recovered clock jitter is 955 fs,rms, and the recovered data jitter is 2.59 ps,rms.



Fig. 4.9: Block diagram and schematic of the FDM and the mixer presented by Sun et al.



Fig. 4.10: AC Gain of a CML buffer compared to a resonator-based buffer in 0.13- $\mu$ m CMOS technology

#### 4.1.5 Issue with Current FDM Architecture

Both works described above implement the FDM with an XOR gate and delay cells. XOR gates are able to operate over a wide range of frequencies. However, the FDM is limited to a narrow frequency range in which the cumulative delay of the delay cells is at or near 90°.

In order to operate above 20 GHz, inductive peaking implemented in a CML circuit is considered in [37,38]. However, for a maximally flat response, inductors can only extend the bandwidth to approximately two times the original, which limits the highest operating frequency [39]. On the other hand, circuits based on LC resonators can be designed specifically to operate at a higher frequency; they thus offer a promising solution and are adopted for the proposed circuit. The frequency responses of a CML buffer and a resonator-based buffer designed in 0.13-$\mu$m CMOS technology are compared in Fig. 4.10, which shows that the gain of the CML buffer degrades by 3 dB at about 15 GHz, while the resonator-based buffer is able to operate at 40 GHz. Therefore, an FDM based on LC resonators is implemented to operate at a 40 GHz clock frequency.



Fig. 4.11: Block diagram of the proposed clock recovery circuit.

### 4.2 Proposed Clock Recovery Circuit with Resonator-Based FDM

The block diagram of the proposed clock recovery circuit is shown in Fig. 4.11, where the circuit takes in 40 Gb/s data and outputs a 40 GHz phase-locked clock. A resonator-based FDM, composed of a tuned amplifier and a frequency doubler, is proposed instead of the conventional XOR gate approach. The output of the FDM is fed to a PD composed of a mixer and a V/I converter, and the output of the PD is subsequently used to control an oscillator.

The CDR is designed in a 0.13-$\mu$m CMOS technology with $f_T$ around 70 GHz. Post-layout circuit simulations are performed with Cadence Virtuoso. To account for high-frequency electromagnetic (EM) coupling effects, the custom-designed inductors and inter-stage transmission lines are simulated with full-wave EM simulators and included in the circuit simulations as S-parameter models.

#### 4.2.1 Pre-amplifier

A pre-amplifier is implemented to interface the FDM with the measurement signal input. The amplifier adopts a $g_m$-boosted common-gate architecture to accommodate the wide bandwidth of NRZ data [14, 40]. As shown in Fig. 4.12a, $M_{A1}-M_{A2}$ operate as a common-gate amplifier, and the cascode amplifier $M_{A3}-M_{A4}$ is included to boost the equivalent transconductance. By taking both the output from the common-gate amplifier and the output from the cascode amplifier, a single-ended to differential conversion is achieved [23]. An additional differential amplifier $M_{A5}-M_{A6}$ serves as a buffer to drive the subsequent stage. Peaking inductors are included for bandwidth improvement, and post-layout simulation shows a voltage gain of 23.7 dB with a wide bandwidth of 26.9 GHz (Fig. 4.12b).

#### 4.2.2 Resonator-based FDM

The second block is an FDM composed of a tuned amplifier and a frequency doubler, as shown in Fig. 4.13. The tuned amplifier $(M_{B1}-M_{B2})$ enhances the fundamental frequency while suppressing all other spectral content, and the frequency doubler $(M_{B3}-M_{B4})$ doubles its frequency.

The tuned amplifier is a differential pair with the resonator load composed of the inductors  $L_{B1}-L_{B2}$ , the capacitors  $C_{B1}-C_{B2}$ , and the parasitic capacitances  $C_{B3}-C_{B4}$ . The load seen at the drain of  $M_{B1}$  /  $M_{B2}$  is

$$Z_{tuned}(\omega) = \frac{j\omega L_{B1,2}}{1 - \omega^2 L_{B1,2}(C_{B1,2} + C_{B3})}. \tag{4.16}$$



Fig. 4.12: (a) Schematic of pre-amplifier, and (b) its simulated voltage gain.



Fig. 4.13: Schematic of the proposed resonator based FDM.

The inductance and capacitance values are tuned such that resonance occurs at the fundamental data frequency $\omega_{IN}$, resulting in a large $Z_{tuned}$. The large load enables the differential pair to achieve the desired gain with a smaller transconductance $g_m$, which reduces the required transistor size and the power consumption.
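The resonance condition implied by Eq. 4.16 is $\omega^2 L (C_{B1,2} + C_{B3}) = 1$. The sketch below sizes the total tank capacitance for a 20 GHz fundamental and evaluates the lossless load impedance near resonance. The 200 pH inductance is a hypothetical value chosen only for illustration, not the value used in the design, and the function names are illustrative.

```python
import math

def resonance_frequency_hz(l_h, c_total_f):
    """Frequency at which the denominator of Eq. 4.16 vanishes."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_total_f))

def capacitance_for_resonance(f0_hz, l_h):
    """Total capacitance (explicit plus parasitic) needed to resonate with l_h at f0_hz."""
    return 1.0 / ((2.0 * math.pi * f0_hz) ** 2 * l_h)

def tuned_load_impedance(f_hz, l_h, c_total_f):
    """Lossless tank impedance of Eq. 4.16; unbounded exactly at resonance, so avoid f = f0."""
    w = 2.0 * math.pi * f_hz
    return 1j * w * l_h / (1.0 - w ** 2 * l_h * c_total_f)

f0 = 20e9                   # fundamental of 40 Gb/s NRZ data
l_tank = 200e-12            # hypothetical inductance (assumption)
c_tank = capacitance_for_resonance(f0, l_tank)

print(f"total capacitance: {c_tank * 1e15:.1f} fF")
print(f"|Z| at 0.50*f0: {abs(tuned_load_impedance(0.50 * f0, l_tank, c_tank)):.1f} ohm")
print(f"|Z| at 0.99*f0: {abs(tuned_load_impedance(0.99 * f0, l_tank, c_tank)):.1f} ohm")
```

In a real tank the finite quality factor of the inductor bounds the peak impedance, which the lossless expression does not capture.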

Following the tuned amplifier, a frequency doubler is implemented. The frequency doubler adopts a push-push architecture, where its output current is a combination of both the transistors  $M_{B3}$  and  $M_{B4}$ . Considering the drain current of a transistor expressed by its Taylor's series as

$$i_D = c_0 + c_1 v_{GS} + c_2 v_{GS}^2 + \dots, \tag{4.17}$$

and the differential inputs $v_{GS,B3} = +A_1 \cos(\omega_{IN}t)$ and $v_{GS,B4} = -A_1 \cos(\omega_{IN}t)$, the components at frequency $\omega_{IN}$ cancel each other when the currents $i_{D,B3}$ and $i_{D,B4}$ combine at the output, while the components at frequency $2\omega_{IN}$ add constructively. The resulting output current is [41,42]

$$i_{D,B3} + i_{D,B4} \approx c_2 A_1^2 \cos(2\omega_{IN} t). \tag{4.18}$$

The coefficient $c_2$ can be obtained from the short-channel drain current expression of a transistor, and a higher value of $c_2$ can be obtained with a larger channel width or a lower overdrive voltage of $M_{B3}$ and $M_{B4}$ [41]. The current source $I_{B2}$ is included in this design to obtain the optimum overdrive voltage of $M_{B3}$ and $M_{B4}$.
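The cancellation of the fundamental and the constructive addition at $2\omega_{IN}$ can be confirmed numerically from the power-series model of Eq. 4.17. In the sketch below the coefficients $c_0$ to $c_3$ and the amplitude are hypothetical values chosen only for illustration; a third-order term is included to show that odd-order terms also cancel.

```python
import math

def drain_current(v_gs, c0, c1, c2, c3):
    """Power-series drain current model (Eq. 4.17), truncated after the cubic term."""
    return c0 + c1 * v_gs + c2 * v_gs ** 2 + c3 * v_gs ** 3

def harmonic_amplitude(samples, k):
    """Amplitude of the cos(k*theta) component over one period of uniformly spaced samples."""
    n = len(samples)
    return 2.0 / n * sum(x * math.cos(2.0 * math.pi * k * i / n)
                         for i, x in enumerate(samples))

c0, c1, c2, c3 = 1e-3, 10e-3, 20e-3, 5e-3   # hypothetical coefficients
a1 = 0.2                                     # hypothetical input amplitude (V)
n = 256

total = []
for i in range(n):
    v = a1 * math.cos(2.0 * math.pi * i / n)
    # push-push output: sum of the two drain currents driven in anti-phase
    total.append(drain_current(+v, c0, c1, c2, c3) + drain_current(-v, c0, c1, c2, c3))

fundamental = harmonic_amplitude(total, 1)
second = harmonic_amplitude(total, 2)
print(f"fundamental: {fundamental:.2e} A")
print(f"second harmonic: {second:.2e} A (c2*A1^2 = {c2 * a1 ** 2:.2e} A)")
```

The second-harmonic amplitude equals $c_2 A_1^2$, in agreement with Eq. 4.18, while the fundamental vanishes to within numerical precision.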



Fig. 4.14: Simulated FDM conversion gain for  $f_0$  output and  $2f_0$  output.



Fig. 4.15: Simulated time-domain FDM output (bottom) compared to an ideal NRZ signal input (top).

Similar to the tuned amplifier, the drain inductor of the doubler and the parasitic capacitance of the following mixer, $L_{B3}$ and $C_{B6}$, resonate at $2\omega_{IN}$ to produce a large load impedance, thus providing gain at the desired frequency while filtering out other frequencies. Frequency-domain simulation of the FDM gives a 2 dB conversion gain for the 40 GHz ($2f_0$) output of the frequency doubler (Fig. 4.14). The 20 GHz ($f_0$) signal at the output of the frequency doubler is suppressed by 25 dB due to the filtering effect of the resonator.

With a pseudo-random binary sequence (PRBS) data input of length $2^{31} - 1$ applied to the FDM, the time-domain FDM output is shown in Fig. 4.15. When the input data switches more often from one to zero, or from zero to one, the output amplitude becomes larger. When there are consecutive ones or zeroes, the output amplitude becomes smaller. This indicates that some data encoding is needed to ensure adequate data switching [43]. Note that the cycle-to-cycle amplitude variation does not affect the phase-locking operation since it is filtered out by the phase detector and loop filter.

#### 4.2.3 Mixer-based Phase Detector

The third circuit block is an MBPD, shown in Fig. 4.16, which consists of a mixer and a V/I-converter. The mixer has a single-balanced architecture $(M_{C1}-M_{C3})$, where $M_{C1}$ takes in a single-ended input from the FDM, and $M_{C2}-M_{C3}$ are driven by a balanced clock signal. The output voltage of the mixer can be obtained with a Taylor expansion of the differential-pair small-signal transconductance and retaining only the first term [7]. Expressing the FDM output as $v_{FDM} = A_1 \sin(\omega_D t + \phi_{IN})$, and considering $v_{CLK} = A_2 \cos(\omega_{CK} t + \phi_{OUT})$,



Fig. 4.16: Schematic of the proposed MBPD.



Fig. 4.17: Simulated mixer DC voltage output versus phase difference.

the resulting differential DC voltage output when  $\omega_D = \omega_{CK}$  is

$$v_{mix} = \frac{R_{1,2}A_1A_2}{4I_{C1}}g_{m,C1}g_{m,C2}\sin(\Delta\phi), \qquad (4.19)$$

where $I_{C1}$ is the DC current generated by $M_{C1}$, and $g_{m,C1}$ and $g_{m,C2}$ are the small-signal transconductances of $M_{C1}$ and $M_{C2}$ / $M_{C3}$, respectively. This implies that the conversion gain can be increased by increasing the drain resistors or the transconductances.

A simulation is performed, and the DC values of $v_{mix}$ obtained by varying $\Delta \phi$ are shown in Fig. 4.17. A calculated curve is also shown for comparison, which indicates that Eq. 4.19 gives a fairly accurate prediction. The values used for the calculation are $R_1 = 200\,\Omega$, $I_{C1} = 1.6$ mA, $g_{m,C1} = 12$ mS, $A_1 = 0.1$ V, $g_{m,C2} = 8.8$ mS and $A_2 = 0.3$ V.
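The calculated curve can be reproduced by evaluating Eq. 4.19 with the values listed above. The sketch below is a direct evaluation of that expression; the function name and default-argument layout are illustrative.

```python
import math

def mixer_dc_voltage(dphi_rad, r_load=200.0, i_c1=1.6e-3,
                     gm_c1=12e-3, gm_c2=8.8e-3, a1=0.1, a2=0.3):
    """Differential DC mixer output of Eq. 4.19 for a phase difference dphi_rad."""
    return r_load * a1 * a2 / (4.0 * i_c1) * gm_c1 * gm_c2 * math.sin(dphi_rad)

peak = mixer_dc_voltage(math.pi / 2)
print(f"peak v_mix = {peak * 1e3:.1f} mV")    # prints "peak v_mix = 99.0 mV"

for deg in (-90, -45, 0, 45, 90):
    print(f"{deg:+4d} deg -> {mixer_dc_voltage(math.radians(deg)) * 1e3:+6.1f} mV")
```

With these values Eq. 4.19 predicts a sinusoidal characteristic with a peak of about 99 mV at a phase difference of 90 degrees.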

The differential pair $M_{C4}-M_{C5}$ converts $v_{mix}$ into an output current, and the current is transferred to the output through three current mirrors ($M_{C6}-M_{C7}$, $M_{C8}-M_{C9}$, and $M_{C10}-M_{C11}$). When $v_{mix}$ is positive, the current from $M_{C11}$ charges the loop filter and increases $V_{CTRL}$. When $v_{mix}$ is negative, the loop filter discharges through $M_{C9}$. Ideally, a PD should output zero current at zero phase difference, independent of the $V_{CTRL}$ value. However, due to channel-length modulation of the MOSFETs, there is a mismatch between the drain currents of $M_{C9}$ and $M_{C11}$, and equal quiescent currents occur only at one particular bias condition. Moreover, since the V/I-converter operates on the DC output of the mixer, the conventional method of adjusting the DC bias voltage with an error amplifier may not be feasible [44]. To mitigate the issue in this work, the VCO is designed to oscillate at the desired frequency at the $V_{CTRL}$ voltage where the drain currents of $M_{C9}$ and $M_{C11}$ are



Fig. 4.18: Simulated PD output current versus phase difference.



Fig. 4.19: (a) Time domain  $I_{PD}$  output for three different time delays, and (b) time average  $I_{PD}$  output versus time delay.

in equilibrium. This does not pose an issue for an optical receiver since the system operates at one fixed data-rate.

The resulting output from the PD is shown in Fig. 4.18. $K_{PD}$ is approximately 6.5 $\mu$A/deg at zero $I_{PD}$. The zero-$I_{PD}$ point deviates slightly from zero phase difference due to the presence of high-frequency components.

Consider again a PRBS $2^{31} - 1$ data input applied to the FDM, with a delay added to emulate different values of $\Delta \phi$. The output current shows a DC offset with a high-frequency ripple, as in Fig. 4.19a. When there is no delay, the clock and data are in phase and $I_{PD}$ has a ripple of 62 $\mu$A with zero DC. When the clock leads the data by 90°, resulting in $\Delta \phi = -\pi/2$, $I_{PD}$ has the minimum average current of $-240\,\mu$A. At $\Delta \phi = \pi/2$, $I_{PD}$ has the maximum average current of $+240\,\mu$A. The average current versus delay resembles a sinusoidal waveform as shown in Fig. 4.19b, which agrees with the analysis performed. The peak value is smaller compared to Fig. 4.18 since the average FDM output voltage is lower with a PRBS input.



Fig. 4.20: Schematic of the proposed VCO.



Fig. 4.21: Simulated phase noise of the VCO.

#### 4.2.4 Voltage-Controlled Oscillator

The last circuit block implemented is a VCO composed of a conventional NMOS cross-coupled oscillator with a buffer, as shown in Fig. 4.20. The cross-coupled transistors $M_{D1}-M_{D2}$ form a negative-$g_m$ cell that compensates the loss in the LC tank, which consists of a center-tapped spiral inductor and two NMOS varactors $C_{CK1}-C_{CK2}$. The current source $I_{D1}$ is employed at the drain node to bias the cross-coupled transistors, and a PMOS current source is employed due to its lower flicker noise [45]. An inductor-loaded buffer is included to isolate the VCO from the rest of the circuits. An additional resistor-loaded buffer is included for measurement purposes.

The VCO can be tuned from 38.3 GHz to 41.5 GHz with  $V_{CTRL}$  ranging from 0 V to 1.5 V. At 40 GHz output,  $K_{VCO}$  is obtained as 1.868 GHz/V, and a moderate phase noise performance is achieved with -94 dBc/Hz at 1 MHz offset [46,47]. The flicker noise corner frequency is approximately 1 MHz (Fig. 4.21).

### 4.3 Measurement of the Proposed Circuit

The microphoto of the fabricated chip is shown in Fig. 4.22, and the size of the layout is  $1.35 \text{ mm} \times 0.7 \text{ mm}$ . On-chip probing measurement is performed, with a continuous-wave signal of approximately 20 GHz applied to the input to emulate a 40 Gb/s signal, and the output fed to a spectrum analyzer.

A design flaw resulted in the loop being constantly closed, and a loop resonance occurred that resulted



Fig. 4.22: Microphoto of the chip.



Fig. 4.23: VCO tuning range.

in unstable free-running output frequency. An additional capacitor was added to the loop filter to limit the frequency wandering to within a 20 MHz range, and the output frequency of the VCO can be estimated by taking the middle frequency of the output spectrum. The measured output frequency is plotted in Fig. 4.23 in comparison with the simulation, and it shows a tuning range of 40.8 GHz to 42.0 GHz for a $V_{CTRL}$ voltage of 0.6 V to 1.2 V. The desired operating point for the clock recovery circuit is set at 41.62 GHz where $V_{CTRL} = 0.85$ V. Unfortunately, it is not possible to obtain the phase noise of the free-running VCO.

When an input signal of 20.81 GHz is applied, the output produces a clean single tone at 41.62 GHz that is phase-locked to the input signal. Due to a loss introduced by the VCO output buffer, the measured output power of the clock signal is around -30 dBm (Fig. 4.24a). The low output power results in a higher noise floor for the phase noise measurement, approximately -100 dBc/Hz as shown in Fig. 4.24b. The two visible tones in Fig. 4.24b are the power-line frequency at 120 Hz and the loop resonance at 1.6 MHz. Because the higher noise floor masks the bandwidth of the PLL, the scenario shown in Fig. 4.25 is deliberately created by lowering the input power level, which increases the phase noise by 10 dB at a frequency offset of 100 kHz. With this input level, the loop bandwidth, which is designed to meet the ITU requirement, can be observed in the measurement. Integrating the phase noise from 1 kHz to 100 MHz offset, peak-to-peak jitter is obtained as 755 fs.
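The integration of phase noise into jitter can be sketched as follows. This example uses the common convention in which the rms phase error is $\sqrt{2\int L(f)\,df}$ and the rms jitter is that phase divided by $2\pi f_0$. It assumes a hypothetical flat profile at the -100 dBc/Hz measurement floor rather than the measured curve, so it does not reproduce the reported value, and converting an rms figure to a peak-to-peak figure requires a further assumption that is not covered here.

```python
import math

def rms_jitter_from_phase_noise(offsets_hz, l_dbc_per_hz, f_carrier_hz):
    """Trapezoidal integration of single-sideband phase noise L(f) into rms jitter (seconds)."""
    area = 0.0
    for i in range(len(offsets_hz) - 1):
        l0 = 10.0 ** (l_dbc_per_hz[i] / 10.0)        # dBc/Hz -> linear (1/Hz)
        l1 = 10.0 ** (l_dbc_per_hz[i + 1] / 10.0)
        area += 0.5 * (l0 + l1) * (offsets_hz[i + 1] - offsets_hz[i])
    phase_rms_rad = math.sqrt(2.0 * area)            # factor 2 accounts for both sidebands
    return phase_rms_rad / (2.0 * math.pi * f_carrier_hz)

# Hypothetical flat profile at the measurement floor, 1 kHz to 100 MHz offset
offsets = [1e3, 1e4, 1e5, 1e6, 1e7, 1e8]
profile = [-100.0] * len(offsets)
jitter_rms = rms_jitter_from_phase_noise(offsets, profile, 41.62e9)
print(f"rms jitter = {jitter_rms * 1e15:.0f} fs")
```

With measured data, `offsets` and `profile` would be replaced by the points read from the phase noise plot.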

Table 4.1 shows a comparison of relevant full-rate CDR circuits. The proposed circuit overcomes the speed limitation of logic circuits owing to its resonator-based FDM, and achieves twice the data rate compared to other



Fig. 4.24: Measurement of the output clock signal: (a) spectrum, and (b) phase noise.



Fig. 4.25: Phase noise measurement showing 16 MHz of bandwidth.

|                   | JSSC 2009 [37]   | ISCAS 2012 [43]  | TCAS-I 2014 [38] | This Work                     |
|-------------------|------------------|------------------|------------------|-------------------------------|
| Data Rate (Gb/s)  | 20               | 25               | 26 - 28          | 40                            |
| Edge Det. Type    | Delay with XOR   | IQ gen. with XOR | Delay with XOR   | Tuned amp. with freq. doubler |
| Clock Jitter (ps) | 4.22 pp          | n/a              | 2.59 rms         | 2.38 pp                       |
| VDD (V)           | 1.5              | 1.2              | 1.0              | 1.8                           |
| Power (mW)        | 131 <sup>1</sup> | 107 <sup>2</sup> | 104 <sup>2</sup> | 112 <sup>1</sup>              |
| Tech.             | 90-nm CMOS       | 90-nm CMOS       | 65-nm CMOS       | 0.13-μm CMOS                  |

Table 4.1: Comparison of Mixer-Based Full-Rate Clock Recovery Circuits

 $^{1}$  PD and VCO

 $^{2}$  PD and VCO with data-retiming DFF

works that implement an MBPD [37,38]. When compared with the BBPD-based circuit in [48], the proposed circuit achieves twice the clock frequency and significantly reduces the power dissipation.

# Chapter 5

# Conclusion

A new TIA circuit topology is proposed for optical receivers above a 20 Gb/s data-rate. The TIA is implemented in 32 nm SOI CMOS technology, modifying the GBCG architecture with a diode-connected transistor at the input stage. Compared to a conventional GBCG amplifier, the input stage further lowers the input resistance and results in a 40% increase in bandwidth. Through circuit analysis, it is found that maintaining a wide auxiliary amplifier bandwidth contributes to the overall performance. The TIA stage without the buffer achieves a gain of 37 dB$\Omega$ with a bandwidth of 74 GHz, enabling 100 Gb/s operation.

A clock recovery circuit architecture is presented to increase the clock rate. A frequency doubling mechanism implemented with a tuned amplifier and a frequency doubler enables 40 Gb/s full-rate phase detection with a 40 GHz clock in 0.13-$\mu$m CMOS technology. This shows the potential for operation at a much higher clock rate in a more advanced technology. The measured clock output exhibits a peak-to-peak jitter of 2.38 ps in the locked condition.

# References

- [1] Y.-H. Kim, E.-S. Jung, and S.-S. Lee, "Bandwidth enhancement technique for CMOS RGC transimpedance amplifier," *Electronics Letters*, vol. 50, no. 12, pp. 882–884, jun 2014.
- [2] M. Atef and H. Zimmermann, "Low-power 10 Gb/s inductorless inverter based common-drain active feedback transimpedance amplifier in 40 nm CMOS," *Analog Integrated Circuits and Signal Processing*, vol. 76, no. 3, pp. 367–376, sep 2013.
- [3] S. Bashiri and C. Plett, "A 40 Gb/s transimpedance amplifier in 65 nm CMOS," in *IEEE International Symposium on Circuits and Systems (ISCAS)*. IEEE, 2010, pp. 757–760.
- [4] M. Nowell, "Beyond 100 Gigabit Ethernet, Technical Challenges," in *IEEE 802 March 2013 Plenary*, Orlando, FL, USA, 2013.
- [5] K. Vasilakopoulos, S. P. Voinigescu, P. Schvan, P. Chevalier, and A. Cathelin, "A 92GHz bandwidth SiGe BiCMOS HBT TIA with less than 6dB noise figure," in 2015 IEEE Bipolar/BiCMOS Circuits and Technology Meeting - BCTM. IEEE, oct 2015, pp. 168–171.
- [6] Z. Xuan, R. Ding, Y. Liu, T. Baehr-Jones, M. Hochberg, and F. Aflatouni, "A low-power 40 Gb/s optical receiver in silicon," in 2015 IEEE Radio Frequency Integrated Circuits Symposium (RFIC). IEEE, may 2015, pp. 315–318.
- [7] B. Razavi, *Design of Analog CMOS Integrated Circuits*. McGraw Hill Education, 2000.
- [8] B. Razavi, *Design of Integrated Circuits for Optical Communications*, 2nd ed. Wiley, 2012.
- [9] J. Kim and J. F. Buckwalter, "Bandwidth Enhancement With Low Group-Delay Variation for a 40-Gb/s Transimpedance Amplifier," *IEEE Transactions on Circuits and Systems I Regular Papers*, vol. 57, no. 8, pp. 1964–1972, 2010.

- [10] R. Ding, Z. Xuan, T. Baehr-Jones, and M. Hochberg, "A 40-GHz bandwidth transimpedance amplifier with adjustable gain-peaking in 65-nm CMOS," in 2014 IEEE 57th International Midwest Symposium on Circuits and Systems (MWSCAS). IEEE, aug 2014, pp. 965–968.
- [11] S.-T. Chou, S.-H. Huang, Z.-H. Hong, and W.-Z. Chen, "A 40 Gbps Optical Receiver Analog Front-End in 65 nm CMOS," in *IEEE International Symposium on Circuits and Systems (ISCAS)*, 2012, pp. 1736–1739.
- [12] J. Kim and J. F. Buckwalter, "A 40-Gb/s Optical Transceiver Front-End in 45 nm SOI CMOS," *IEEE Journal of Solid State Circuits*, vol. 47, no. 3, pp. 615–626, 2012.
- [13] K. Park and W.-S. Oh, "A 40-Gb/s 310-fJ/b Inverter-Based CMOS Optical Receiver Front-End," *IEEE Photonics Technology Letters*, vol. 27, no. 18, pp. 1931–1933, sep 2015.
- [14] Y. Chen, Z. Wang, X. Fan, H. Wang, and W. Li, "A 38 Gb/s to 43 Gb/s Monolithic Optical Receiver in 65 nm CMOS Technology," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 12, pp. 3173–3181, dec 2013.
- [15] J.-D. Jin and S. S. H. Hsu, "A 40-Gb/s Transimpedance Amplifier in 0.18-μm CMOS Technology," *IEEE Journal of Solid State Circuits*, vol. 43, no. 6, pp. 1449–1457, 2008.
- [16] C.-F. Liao and S.-I. Liu, "40 Gb/s Transimpedance-AGC Amplifier and CDR Circuit for Broadband Data Receivers in 90 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 3, pp. 642–655, 2008.
- [17] S. G. Kim, S. H. Jung, Y. S. Eo, S. H. Kim, X. Ying, H. Choi, C. Hong, K. Lee, and S. M. Park, "A 50-Gb/s differential transimpedance amplifier in 65nm CMOS technology," in 2014 IEEE Asian Solid-State Circuits Conference (A-SSCC). IEEE, nov 2014, pp. 357–360.
- [18] S. Mohan, M. Hershenson, S. Boyd, and T. Lee, "Bandwidth extension in CMOS with optimized on-chip inductors," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 3, pp. 346–355, mar 2000.
- [19] H.-G. Bach, A. Beling, G. Mekonnen, R. Kunkel, D. Schmidt, W. Ebert, A. Seeger, M. Stollberg, and W. Schlaak, "InP-Based Waveguide-Integrated Photodetector With 100-GHz Bandwidth," *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 10, no. 4, pp. 668–672, jul 2004.
- [20] S. M. Park and H.-j. Yoo, "1.25-Gb/s Regulated Cascode CMOS Transimpedance Amplifier for Gigabit Ethernet Applications," *IEEE Journal of Solid State Circuits*, vol. 39, no. 1, pp. 112–121, 2004.

- [21] D. Frickey, "Conversions between S, Z, Y, H, ABCD, and T parameters which are valid for complex source and load impedances," *IEEE Transactions on Microwave Theory and Techniques*, vol. 42, no. 2, pp. 205–211, 1994.
- [22] B. Razavi, *RF Microelectronics*, 2nd ed. Pearson, 2012.
- [23] J. Chong and D. Ha, "A 100 Gb/s transimpedance amplifier with diode-connecting input-resistance-reduction in 32 nm CMOS technology," in *Midwest Symposium on Circuits and Systems*, vol. 2015-Septe, 2015.
- [24] S.-H. Huang and W.-Z. Chen, "A 20-Gb/s optical receiver with integrated photo detector in 40-nm CMOS," in 2013 IEEE Asian Solid-State Circuits Conference (A-SSCC). IEEE, nov 2013, pp. 225–228.
- [25] Z. Xuan, R. Ding, Y. Liu, T. Baehr-Jones, M. Hochberg, and F. Aflatouni, "A Low-Power Hybrid-Integrated 40-Gb/s Optical Receiver in Silicon," *IEEE Transactions on Microwave Theory and Techniques*, vol. 66, no. 1, pp. 589–595, jan 2018.
- [26] I. Garcia Lopez, P. Rito, A. Awny, M. Ko, D. Kissinger, and A. C. Ulusoy, "A DC-75-GHz Bandwidth and 54 dBΩ Gain TIA With 10.9 pA/√Hz in 130-nm SiGe BiCMOS," *IEEE Microwave and Wireless Components Letters*, vol. 28, no. 1, pp. 61–63, jan 2018.
- [27] T. Takemoto, Y. Matsuoka, H. Yamashita, Y. Lee, H. Arimoto, M. Kokubo, and T. Ido, "A 50-Gb/s High-Sensitivity (-9.2 dBm) Low-Power (7.9 pJ/bit) Optical Receiver Based on 0.18-μm SiGe BiCMOS Technology," *IEEE Journal of Solid-State Circuits*, vol. 53, no. 5, pp. 1518–1538, may 2018.
- [28] A. Karimi-Bidhendi, H. Mohammadnezhad, M. M. Green, and P. Heydari, "A Silicon-Based Low-Power Broadband Transimpedance Amplifier," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 65, no. 2, pp. 498–509, feb 2018.
- [29] S. Bhagavatheeswaran, T. Cummings, E. Tangen, M. Heins, R. Chan, and C. Steinbeiser, "A 56 Gb/s PAM-4 linear transimpedance amplifier in 0.13-μm SiGe BiCMOS technology for optical receivers," in 2017 IEEE Compound Semiconductor Integrated Circuit Symposium (CSICS). IEEE, oct 2017, pp. 1–4.
- [30] A. Awny, R. Nagulapalli, D. Micusik, J. Hoffmann, G. Fischer, D. Kissinger, and A. C. Ulusoy, "A dual 64Gbaud 10kΩ 5% THD linear differential transimpedance amplifier with automatic gain control in 0.13µm BiCMOS technology for optical fiber coherent receivers," in 2016 IEEE International Solid-State Circuits Conference (ISSCC). IEEE, jan 2016, pp. 406–407.

- [31] L. Szilagyi, R. Henker, and F. Ellinger, "20 Gbit/s ultra-compact optical receiver front-end with variable gain transimpedance amplifier in 80 nm CMOS," in 2016 IEEE MTT-S Latin America Microwave Conference (LAMC). IEEE, dec 2016, pp. 1–3.
- [32] S. Saeedi, S. Menezo, G. Pares, and A. Emami, "A 25 Gb/s 3D-Integrated CMOS/Silicon-Photonic Receiver for Low-Power High-Sensitivity Optical Communication," *Journal of Lightwave Technology*, vol. 34, no. 12, pp. 2924–2933, jun 2016.
- [33] Y. Xie, D. Li, Y. Liu, M. Liu, Y. Zhang, X. Wang, and L. Geng, "Low-Noise High-Linearity 56Gb/s PAM-4 Optical Receiver in 45nm SOI CMOS," in 2018 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, may 2018, pp. 1–4.
- [34] I. Ozkaya, A. Cevrero, P. A. Francese, C. Menolfi, T. Morf, M. Brandli, D. M. Kuchta, L. Kull, C. W. Baks, J. E. Proesel, M. Kossel, D. Luu, B. G. Lee, F. E. Doany, M. Meghelli, Y. Leblebici, and T. Toifl, "A 64-Gb/s 1.4-pJ/b NRZ Optical Receiver Data-Path in 14-nm CMOS FinFET," *IEEE Journal of Solid-State Circuits*, vol. 52, no. 12, pp. 3458–3473, dec 2017.
- [35] J. Craninckx and M. Steyaert, *Wireless CMOS Frequency Synthesizer Design*. Springer Science+Business Media, B.V., 1998.
- [36] International Telecommunication Union, "Recommendation ITU-T G.825 (2000) Amendment 1," 2008. [Online]. Available: https://www.itu.int/rec/T-REC-G.825-200805-I!Amd1/en
- [37] J. Lee and K.-C. Wu, "A 20-Gb/s Full-Rate Linear Clock and Data Recovery Circuit With Automatic Frequency Acquisition," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 12, pp. 3590–3602, dec 2009.
- [38] L. Sun, Q. Pan, K.-C. Wang, and C. P. Yue, "A 26–28-Gb/s Full-Rate Clock and Data Recovery Circuit With Embedded Equalizer in 65-nm CMOS," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, no. 7, pp. 2139–2149, jul 2014.
- [39] B. Analui and A. Hajimiri, "Bandwidth enhancement for transimpedance amplifiers," *IEEE Journal of Solid State Circuits*, vol. 39, no. 8, pp. 1263–1270, aug 2004.
- [40] Z. Lu, K. S. Yeo, J. Ma, M. A. Do, W. M. Lim, and X. Chen, "Broad-Band Design Techniques for Transimpedance Amplifiers," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 54, no. 3, pp. 590–600, 2007.
- [41] H.-H. Hsieh, Y.-C. Hsu, and L.-H. Lu, "A 15/30-GHz Dual-Band Multiphase Voltage-Controlled Oscillator in 0.18-µm CMOS," *IEEE Transactions on Microwave Theory and Techniques*, vol. 55, no. 3, pp. 474–483, mar 2007.

- [42] C.-Y. Yang, C.-H. Chang, J.-M. Lin, and H.-Y. Chang, "A 20/40-GHz Dual-Band Voltage-Controlled Frequency Source in 0.13-µm CMOS," *IEEE Transactions on Microwave Theory and Techniques*, vol. 59, no. 8, pp. 2008–2016, aug 2011.
- [43] A. Zargaran-Yazd and S. Mirabbasi, "A 25 Gb/s full-rate CDR circuit based on quadrature phase generation in data path," in 2012 IEEE International Symposium on Circuits and Systems. IEEE, may 2012, pp. 317–320.
- [44] M. Terrovitis, M. Mack, K. Singh, and M. Zargari, "A 3.2 to 4 GHz, 0.25 μm CMOS frequency synthesizer for IEEE 802.11a/b/g WLAN," in 2004 IEEE International Solid-State Circuits Conference (IEEE Cat. No.04CH37519). IEEE, 2004, pp. 98–515.
- [45] A. Hajimiri and T. Lee, "Design issues in CMOS differential LC oscillators," *IEEE Journal of Solid-State Circuits*, vol. 34, no. 5, pp. 717–724, may 1999.
- [46] S. W. Chai, J. Yang, B.-H. Ku, and S. Hong, "Millimeter wave CMOS VCO with a high impedance LC tank," in 2010 IEEE Radio Frequency Integrated Circuits Symposium. IEEE, 2010, pp. 545–548.
- [47] T.-Y. Lu, C.-Y. Yu, W.-Z. Chen, and C.-Y. Wu, "Wide Tunning Range 60 GHz VCO and 40 GHz DCO Using Single Variable Inductor," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 2, pp. 257–267, feb 2013.
- [48] J.-K. Kim, J. Kim, G. Kim, and D.-K. Jeong, "A Fully Integrated 0.13-µm CMOS 40-Gb/s Serial Link Transceiver," *IEEE Journal of Solid-State Circuits*, vol. 44, no. 5, pp. 1510–1521, may 2009.