# UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL

INSTITUTO DE INFORMÁTICA

PROGRAMA DE PÓS-GRADUAÇÃO EM MICROELETRÔNICA

## LEANDRO ÁVILA DE ÁVILA

## **Cross-Layer Energy Model of IR-UWB for Short-Range Communication Systems**

Thesis presented in partial fulfillment of the requirements for the degree of Doctor in Microelectronics

Advisor: Prof. Dr. Sergio Bampi

## CIP - CATALOGING-IN-PUBLICATION

Ávila de Ávila, Leandro

Cross-Layer Energy Model of IR-UWB for Short-Range Communication Systems / Leandro Ávila de Ávila. – Porto Alegre: PPGC da UFRGS, 2019.

150 f.: il.

Thesis (Ph.D.) – Universidade Federal do Rio Grande do Sul. Programa de Pós-Graduação em Microeletrônica, Porto Alegre, BR-RS, 2019. Advisor: Sergio Bampi.

I. Bampi, Sergio. II. Title.

## UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL

Rector: Prof. Rui Vicente Oppermann

Vice-Rector: Prof. Jane Fraga Tutikian

Dean of Graduate Studies: Prof. Celso Giannetti Loureiro Chaves

Director of the Instituto de Informática: Prof. Carla Maria Dal Sasso Freitas

Coordinator of PGMicro: Prof. Fernanda Gusmão de Lima Kastensmidt

Chief Librarian of the Instituto de Informática: Beatriz Regina Bastos Haro

## **ACKNOWLEDGMENTS**

I would like to thank my advisor, Prof. Dr. Sergio Bampi, for all his support in recent years. He has always been a friendly hand and shoulder; beyond technical knowledge, he gave me much clarification and a path to follow up to this final version of the thesis. I will be eternally grateful to him.

It was very important for me to meet excellent professionals with whom I had the pleasure of sharing our academic environment. Thanks to Prof. Dr. Juergen Rochol (my unofficial co-advisor on many occasions), Prof. Dr. Edison Pignaton, and Prof. Dr. Rafael Kunst. Their advice added precious value to the development of my thesis whenever I needed it. In the Army, my special thanks go to Lt. Col. Luciene da Silva Demenicis (even while she was at IME), Col. Mario Jorge Costa Câmara, and Col. Jorgito Matiuzzi Stocchero, also for their invaluable support when I most needed it.

It is also essential at this moment to recognize all members of my research group, who since the beginning fought at my side for novel academic contributions, always focusing on the best results. The road was turbulent, but we obtained meaningful results. There is another special group of people I would like to thank for their invaluable support and contribution: my colleagues from Labs 110 and 215. I would rather not list their names, for fear of committing an injustice.

My special thanks also go to UFRGS and the Brazilian Army, institutions to which I have been connected in recent years and for which I intend to work for a long time ahead. One of my purposes is to develop a common research project agenda between them. I believe that, working together, both can advance science and technology in Brazil.

I dedicate this academic work to my beloved wife (Cíntia G. Simão de Ávila), to my son (Pedro) and daughter (Ana), who always gave me their cooperation and support during all the steps of this difficult journey. Thanks to my parents, Claudete and Hildo (*in memoriam*), I became the man I am. Thanks also to my brother Alex.

I always thank God for my family, my job and my career. I therefore feel blessed and obliged to dedicate myself more and more to growth at all levels and in all areas of my life.

#### **ABSTRACT**

Short-range data communications and microelectronic circuit design are distinct and important fields of knowledge, with few points of convergence between them in academic research. In the literature of each field there are few inter-related works, and cross-referencing between them is sparse. All equipment for communication systems uses integrated circuits (ICs), which indicates a key connection between these engineering fields.
The Open Systems Interconnection (OSI) reference model describes communication in levels of abstraction, known as OSI layers, each with its own components. The power profiles of the first and second layers (PHY and MAC) are among the main concerns of this work. The PHY layer is associated with the hardware (microelectronic devices included), and the MAC layer concentrates the low-level strategies that control medium access. The energy of the whole system can be minimized by direct control of the PHY and MAC layers and, in this sense, the interplay between them constitutes a cross-layer model. At the PHY level, integrated circuit design contributes gains when the investigation of architectures, circuits and devices produces better hardware performance; specific tools are used to reduce power, which represents a partial optimization of the model. At the MAC level, other aspects affect energy consumption and need to be assessed, for example the duty cycle and the algorithms that deal with failures in the communication process. In this work, data communication is considered in the context of the IEEE 802.15.6 standard for wireless body area networks (WBANs). This work focuses mainly on the logic design of the forward error correction (FEC) coding and decoding hardware. The CODECs are embedded in the transceivers, which can in turn be part of the nodes of a wireless sensor network. This research addresses the energy consumption model involving these modules. At the PHY level, the electrical UWB waveforms generated by the TX are also analyzed for better energy efficiency. This thesis covers three design aspects of a communication system that adheres to the 802.15.6 standard; they sit at different layers, and the energy required is modeled in a cross-layer view.
Namely, these aspects are: the way in which Ultra-Wideband (UWB) communication takes place with impulse modulation over the 3.1 GHz to 10.6 GHz band; the relationship among the analytic equations of the energy model, which also takes the link budget into account; and the low-power hardware comparison, focused on the FEC decoding hardware designed in this thesis. The contributions of the research include: i) the pulse-shaping and modulation analysis, with PPM modulation of 3 or more bits, using PSWF pulses for higher spectral and energy efficiency; ii) the BCH FEC decoder, designed in CMOS VLSI, with demonstrated advantages over the QC-LDPC decoder, as the latter consumes 3.76 times more power for similar coding and data rates; iii) the improvement of the cross-layer energy efficiency model for IR-UWB, and its simulation in an operating scenario with multiple nodes that adhere to the IEEE 802.15.6 standard.

**Keywords:** WBAN, Ultra-Wideband (UWB), IR-UWB, Cross-Layer Energy Model, Energy Efficiency, CMOS Digital Design.

## Modelo Energético *Cross-Layer* de IR-UWB para Sistemas de Comunicação de Curto Alcance

## **RESUMO**

Short-range data communication systems at the system level and microelectronics are two distinct and important fields of knowledge, and they are usually treated separately in academic research. Interrelationships and cross-references between them are sparse in the specialized academic literature of each field. All communication equipment uses integrated circuits, which indicates an intrinsic connection between these fields of engineering. In data communication, the OSI (Open Systems Interconnection) reference model allows abstraction in levels, whose components are called layers. The energy consumption profiles of the first and second layers (PHY and MAC) are main themes of this work.
The physical layer (level 1, or PHY) is directly associated with the hardware (including the integrated circuits), where chip design is effectively applied, and the MAC level concentrates the low-level strategies (in hardware/software) that control access to the physical transmission/reception medium. The energy consumed by the whole system can be minimized by direct control of the PHY and MAC layers and, in this sense, the relationship between them constitutes the cross-layer model. At the PHY level, where integrated circuit design contributes gains when the investigation of architectures, circuits and devices yields better hardware performance, specific tools are used to reduce power, which then represents a partial optimization of the model. Aspects of the medium access control layer (level 2, MAC), such as the duty cycle and the algorithms that handle failures in the communication process, are also addressed in building an energy-efficient cross-layer model. In this work, data communication is treated in the context of wireless body area networks (WBANs) that adhere to the IEEE 802.15.6 standard. The logic design of CMOS digital circuits for forward error correction (FEC encoders and decoders, or codecs) is one focus of this work. The codecs are part of the transceiver hardware, and the transceivers are part of the nodes of a wireless sensor network. At the PHY level, the electrical UWB waveforms generated by the TX are also analyzed for better energy efficiency. This thesis covers three design aspects of the communication system, which adheres to the 802.15.6 standard; they sit at different levels, and energy consumption is modeled in a cross-layer approach.
These aspects are related to the power demanded by the system, namely: the format in which Ultra-Wideband (UWB) communication takes place, using radio-frequency signals in the 3.1 GHz to 10.6 GHz range with impulse modulation (IR-UWB); the correlation among the analytic equations of an energy model (considering the behavior of the signal power in the propagation medium, i.e., attenuations and gains); and, finally, the comparison of the power consumed by the low-power hardware of the error-correction decoder designed in this thesis with that of alternative decoders. The contributions of the work include: i) the analysis of modulation and pulse shaping, with PPM modulation of 3 or more bits, using PSWF pulses for energy and spectral efficiency; ii) the BCH FEC decoder, designed in CMOS VLSI, with advantages over the QC-LDPC decoder, as the latter consumes 3.76 times more power for similar code rates; iii) the improvement of a cross-layer energy efficiency model for IR-UWB, and its simulation in an operating scenario with multiple nodes that adhere to the IEEE 802.15.6 standard.

**Keywords:** WBAN, Ultra-Wideband, IR-UWB, FEC, BCH, Energy Efficiency, Cross-Layer Model, CMOS Digital Circuits.

## **LIST OF FIGURES**

| Figure | Title |
|------|------|
| 1.1 | Transceiver Block Diagram |
| 1.2 | Example of WBAN's architecture – network for human signal monitoring |
| 1.3 | Network for a Smart City with IoT sensors around |
| 2.1 | FCC Mask and its limits (indoor in red color, outdoor in green color) |
| 2.2 | BER vs. SNR of wideband systems DSSS and UWB for a single user |
| 2.3 | Data Rate versus Energy/Battery Lifetime |
| 2.4 | Energy efficiency of various wireless (a) transmitter and (b) receiver |
| 2.5 | Hub and Node WBAN Protocol |
| 2.6 | Design Flow for the Logic Synthesis |
| 2.7 | Physical Synthesis Flow of the Design |
| 3.1 | Example of spread spectrum on the IR-UWB |
| 3.2 | Gaussian pulse with variations |
| 3.3 | Plots of square wave and its spectrum |
| 3.4 | PSWF variations in time and their respective PSD squares in frequency |
| 3.5 | A PSWF pulse over time |
| 3.6 | Some methods for pulse generation |
| 3.7 | Circuit schematic for 5th derivative of Gaussian pulse |
| 3.8 | Schematic diagram for one PSWF pulse generators |
| 3.9 | Comparison of the spectral efficiency of the pulses |
| 3.10 | Energy spent to transmit the 255 bytes |
| 4.1 | Cross-Layer Figure Model with General Energy Manager |
| 4.2 | FEC's Encoder and Decoder in the Transceiver (IEEE, 2012) |
| 4.3 | Encoder for BCH (63,51,2) |
| 4.4 | Example of decoder block diagram for BCH |
| 4.5 | Berlekamp-Massey Algorithm with inversion |
| 4.6 | Polynomial divisor |
| 4.7 | Chien's search circuit |
| 4.8 | Parity check matrix constructed using the CPA approach (HAN, 2007) |
| 4.9 | Comparison of the Power in a Coder/Decoder in 45 and 65 nm (100MBps) |
| 4.10 | Comparison of the Power in a Coder/Decoder in 45 and 65 nm (487.5Kbps) |
| 4.11 | Comparison of FECs (at 250 and 487.5 Kbps) |
| 5.1 | General Hardware Architecture of a sensor node |
| 5.2 | Block Diagram Example of a Transmitter |
| 5.3 | Channel Decoding Block |
| 5.4 | Bit error rate comparison between uncoded and encoded systems |
| 5.5 | Error Probability for Uncoded and Code Rate variations |
| 5.6 | Bit Error Probability versus Eb/N0 at the receiver |
| 5.7 | Data Structure over MAC level and PHY layer |
| 5.8 | Comparison between three forms of radio operation and duty-cycling |
| 5.9 | Type II HARQ - block diagram |
| 5.10 | FEC mechanisms versus SNR for WBAN specification |
| 5.11 | Simulation results with a SNR variable |
| 5.12 | BCH Code Rates for AWGN Channel (BER vs. $E_B/N_0$) |
| 5.13 | Hub to a plane composed of three nodes, example |
| 5.14 | Relative Power TX vs. Sink-Sources distances |
| 5.15 | Fitted transmission power area according distances |
| 5.16 | Variation of the link energy in the WBAN, depending on the distance and nodes |
| 5.17 | Energy in the network for AWGN Channel up to 300 Nodes |
| 1 | MAC Frame Format for WBAN |
| 2 | Layout of access phases in a beacon period (superframe) for beacon mode |
| 3 | MB-OFDM Bands distribution |
| 4 | Initial Setup; Floorplanning; and Powerplanning |
| 5 | Routing; Verification Stage; and Final IC |
| 6 | Modulation Types for a Binary Example |
| 7 | Modulation using codification PPM with M-ary, M from 1 to 4 |

## **LIST OF TABLES**

| Table | Title |
|------|------|
| 2.1 | Time Hopping Spread Spectrum and IR-UWB Characteristics |
| 2.2 | Characteristics of UWB systems |
| 2.3 | UWB operating frequency bands |
| 2.4 | User Priority in WBAN |
| 2.5 | Power Consumption in a BLE receiver front-end components |
| 2.6 | Power Consumption of a BLE Transceiver |
| 2.7 | Key Parameters for an IR-UWB WBAN system |
| 2.8 | TX Power consumption overview |
| 2.9 | IR-UWB WBAN specifications |
| 3.1 | The first five derivatives of Gaussian pulse |
| 3.2 | Pulses and communication parameters |
| 4.1 | BCH code rates |
| 4.2 | WBAN Stream Parameters |
| 4.3 | QC-LDPC Codes |
| 4.4 | Area and Power Comparison for Low Data Rates |
| 5.1 | Type II HARQ, Flowchart Description |
| 5.2 | Hardware Power Consumption per Code Rate |
| 5.3 | Simulation Parameters to model the Power in the Receiver |
| 5.4 | Performance summary of UWB chipsets |
| 5.5 | Roadmap of the Wireless Integrated Circuits and Systems Group |
| 5.6 | Performance summary of IR-UWB (Tx/Rx) |
| 5.7 | Performance of the IR-UWB (7<sup>th</sup> Derivative) |
| 5.8 | Performance summary of FM-UWB (Tx/Rx) |
| 5.9 | Performance and Comparison of References |
| 5.10 | A Summary of the Energy from the Previous Tables |
| 5.11 | Characteristics of off-the-shelf ICs |
| 5.12 | Interesting Power and Energy values |
| 5.13 | Simulation parameters |
| 5.14 | Simulation parameters for the model |
| 1 | A.1 - Estimated number of bits in the packets |
| 2 | Positioning to convert in pulses over time |

## **NOMENCLATURE**

| Symbol | Description |
|---------------------|----------------------------------------------------------------|
| $(P \cdot T)_{setup,RX}$ | Power multiplying the setup time in the receiver |
| $(P \cdot T)_{setup,TX}$ | Power multiplying the setup time in the transmitter |
| $\alpha_i$ | The node switching activity factor |
| $\beta_i$ | Input signal time and short circuit factor |
| $\chi_n$ | The eigenvalue of the PSWF function |
| $\delta$ | Complementary percentage value of $\phi$ |
| $\epsilon_{add}$ | Addition operation energy consumption |
| $\epsilon_{enc/dec}$ | Energy consumption of encoding/decoding |
| $\epsilon_{inv}$ | Inverse operation energy consumption |
| $\epsilon_{mult}$ | Multiplication operation energy consumption |
| $\eta$ | Power efficiency |
| $\lambda$ | Free space wavelength |
| $\lambda_n$ | Power effectiveness of the pulse |
| $\Omega$ | Bandwidth |
| $\omega_c$ | Column weight |
| $\omega_r$ | Row weight |
| $\phi$ | The percentage of encoder or decoder in the whole circuit |
| $\psi_n(t)$ | PSWF function |
| $\sigma(x)$ | Error-location polynomial |
| $\sigma$ | Standard deviation of the normal function |
| $\sigma^2$ | Power spectrum energy |
| $\sigma^2$ | Variance parameter of the pulse |
| $\tau_0$ | Delay since the reference |
| $\xi$ | Modulation parameter rate |
| $\xi$ | Ratio of the Rx/Tx mean time spent |
| $a$ | Amplitude of the signal |
| $B$ | Bits in preamble or in data |
| $b$ | Bits per symbol |
| $BW$ | The bandwidth |
| $C$ | LDPC code |
| $c$ | A constant |
| $c$ | A valid codeword |
| $C(t)$ | Channel capacity |
| $c(x)$ | Transmitted message |
| $C_{L_i}$ | Node load capacitance |
| $d$ | Distance |
| $E$ | Energy |
| $e(x)$ | Probabilistic error |
| $e_0$; $e_1$ | Electric fields |
| $E_T^{TX}$ | Minimum threshold for communication |
| $e_{amp}$ | Energy consumption of the power amplifier |
| $E_{PL}$ | Losses due to the communication link |
| $E_{RX}^{Com}$ | Receiver energy spent on the communication |
| $E_{RX}^{hw}$ | Energy due to remaining hardware except for encoder or decoder |
| $E_{RX}^{i,Proc}$ | Receiver energy spent in digital processing |
| $E_{TX}^{j,Com}$ | Transmitter energy spent on the communication |
| $E_{TX}^{Proc}$ | Transmitter energy spent in digital processing |
| $\mathcal{E}_T^{Com}$ | Total energy from the communication |
| $E_T^{Proc}$ | Total energy of the processing circuit |
| $\mathrm{erfc}(x)$ | Complementary error function |
| $f$ | Frequency |
| $F_{op}$ | Operating frequency |
| $g(t)$ | Generator polynomial |
| $G_d$ | Power gain factor |
| $g_r$ | Receiver gain |
| $g_t$ | Transmitter gain |
| $h_r$ | Receiver height |
| $h_t$ | Transmitter height |
| $k$ | Original message length |
| $L$ | Packet length |
| $LLR(c_i)$ | Log-likelihood ratio |
| $M$ | Cardinality of the constellation |
| $M$ | Modulation parameter |
| $m$ | Mean of the normal function |
| $m$ | Number of bits in the message |
| $n \times m$ | Codeword length vs. a set of parity equations |
| $n$ | Encoded message length |
| $N_f$ | Receiver noise figure |
| $N_{CW}$ | Number of codewords |
| $N_{pad}$ | Pad bits number |
| $N_{PSDU}$ | Number of bits in PSDU packet |
| $N_{tx,i}$ | Number of transmissions required for success in communication |
| $P$ | Power |
| $P_b$ | Bit error probability |
| $P_c$ | Circuit power consumption |
| $P_{rx}$ | Receiver power consumption |
| $P_{syn}$ | Frequency synthesizer power consumption |
| $P_{tx,circ}$ | Circuit power consumption of the transmitter |
| $P_{tx,RF}^{i}$ | RF power consumption of the transmitter |
| $Pr$ | Probability |
| | Ratio between codeword length and additional parity bits |
| $r(x)$ | Received message |
| $S_T^{RX}$ | Sensitivity level of the receiver |
| $SNR$ | Signal-to-noise ratio in dB |
| $T$ | Pulse duration for PSWF |
| $t$ | Time |
| $T_p$ | Pulse duration |
| $T_s$ | Sampling time between pulses |
| $T_{ACKW}$ | Wait time for an acknowledgment message |
| $T_{on}$ | Transceiver on time |
| $T^i_{packet}$ | Transmitted frame duration |
| $T_{tr}$ | Frequency synthesizer settling time |
| $W$ | Bandwidth |
| $x_0$ | Initial point of the normal function |
| $c$ | Speed of light |
| $k$ | Number of bits in a packet |
| $t$ | Correction capacity of FEC decoder |
| $\mathbf{H}$ | Edge between variable and check nodes |
| $y$ | Received codeword at the decoder input |

## LIST OF ABBREVIATIONS AND ACRONYMS

- ADC Analog-to-Digital Converter
- AWGN Additive White Gaussian Noise
- ASIC Application-Specific Integrated Circuit
- APP *A Posteriori* Probability
- BCH Bose, Chaudhuri, and Hocquenghem codes
- BEP Bit Error Probability
- BER Bit Error Rate
- BP Belief Propagation
- BPSK Binary Phase Shift Keying
- CAP Contention Access Phase
- CMOS Complementary Metal-Oxide Semiconductor
- CNs Check Nodes
- CPA Circulant Permutation Array
- CTS Clock Tree Synthesis
- CSMA Carrier Sense Multiple Access
- DBPSK Differential Binary Phase-Shift Keying
- DAC Digital-to-Analog Converter
- DQPSK Differential Quadrature Phase-Shift Keying
- DSE Design Space Exploration
- EAP Exclusive Access Phase
- Eeff Energy Efficiency
- Eeff-M Energy Efficiency Model
- ECC Error Control Coding
- ECG Electrocardiogram
- EEG Electroencephalogram
- EIRP Effective Isotropically Radiated Power
- FCC Federal Communications Commission
- FCS Frame Check Sequence
- FE Front-End
- FEC Forward Error Correction
- FFT Fast Fourier Transform
- FIR Finite Impulse Response Filter
- FM-UWB Frequency Modulated Ultra Wideband
- FSK Frequency Shift Keying
- GEM General Energy Manager
- GDSII Graphic Design System II
- HARQ Hybrid Automatic Repeat reQuest
- HBC Human Body Communications
- IC(s) Integrated Circuit(s)
- IEEE Institute of Electrical and Electronics Engineers
- IoT Internet of Things
- IP Intellectual Property
- IR-UWB Impulse Radio Ultra Wideband
- ISM Industrial, Scientific, and Medical
- ISO International Organization for Standardization
- ITRS International Technology Roadmap for Semiconductors
- ITU International Telecommunication Union
- LEC Logic Equivalence Checking
- LDPC Low Density Parity-Check Codes
- LLC Logical Link Control
- LOS Line of Sight
- MAC Medium Access Control
- MAP Managed Access Phase
- MB-OFDM Multi-band OFDM
- MICS Medical Implant Communication Service
- MPDU MAC Protocol Data Unit
- MSDU MAC Service Data Unit
- M2M Machine to Machine
- MSA Min-Sum Algorithm
- NB Narrowband
- NLOS Non-Line of Sight
- PHY Physical Layer
- PLCP Physical Layer Convergence Protocol
- PLL Phase Locked Loop
- PPDU Physical Protocol Data Unit
- PPM Pulse Position Modulation
- PRF Pulse Repetition Frequency
- PSD Power Spectral Density
- PSDU Physical Service Data Unit
- PSWF Prolate Spheroidal Wave Function(s)
- OFDM Orthogonal Frequency Division Multiplexing
- OOK On-Off Keying (modulation)
- OSI Open Systems Interconnection
- QoS Quality of Service
- QPSK Quadrature Phase-Shift Keying
- RAP Random Access Phase
- RC Resistance and Capacitance
- RF Radio Frequency
- RFID Radio Frequency Identification
- RS Reed-Solomon Code
- RTL Register Transfer Level
- RX/Rx Reception or Receiver
- SAP Service Access Point
- SDC Synopsys Design Constraints
- SDF Standard Delay Format
- SINR Signal to Interference Plus Noise Ratio
- SNR Signal to Noise Ratio
- SPA Sum-Product Algorithm
- SoC System on a Chip
- TCL Tool Command Language
- TX/Tx Transmission or Transmitter
- TOA Time of Arrival
- UWB Ultra Wideband
- VHDL VHSIC Hardware Description Language
- VNs Variable Nodes
- VHSIC Very High Speed Integrated Circuit
- VLSI Very Large-Scale Integration
- WBAN Wireless Body Area Network
- WET Wireless Energy Transfer
- WMTS Wireless Medical Telemetry Services
- WSN Wireless Sensor Networks
- WUR Wake-up Radio

## **CONTENTS**

| Section | Title |
|-------|---------------------------------------------------------------------|
| 1 | INTRODUCTION |
| 1.1 | Motivation |
| 1.2 | Objectives |
| 1.3 | Relevant Issues and Contributions |
| 1.4 | Thesis Organization - Outline |
| 2 | CONTEXTUALIZATION AND THE STATE OF ART |
| 2.1 | IR-UWB Communication |
| 2.1.1 | UWB regulation |
| 2.1.2 | IR-UWB spectral management |
| 2.1.3 | UWB PHY Specifications |
| 2.2 | Standard Overview (IEEE Std 802.15.6) |
| 2.2.1 | Data Communication in UWB |
| 2.2.2 | Comparison of the Power in Transceivers for Data Communication |
| 2.3 | Digital Integrated Circuit Design based on Standard-Cells |
| 2.3.1 | Some Low-Power Techniques |
| 2.3.2 | Digital Circuit Design Method: the Standard-cell Approach and Tools |
| 2.4 | Related Works |
| 2.4.1 | Cross-Layer Reference Work |
| 2.4.2 | BCH Reference Work |
| 2.4.3 | A Low-power IR-UWB CMOS Transceiver |
| 2.4.4 | Initial Specification Summary |
| 2.5 | Summary |
| 3 | UWB PULSE SHAPING AND MODULATION |
| 3.1 | Pulse Shaping |
| 3.2 | Pulse Shape Analysis |
| 3.2.1 | PSWF – Frequency vs. Time Design |
| 3.2.2 | Pulse Generation Hardware |
| 3.2.3 | Pulse Reference Values |
| 3.2.4 | Synchronization Problem |
| 3.3 | Modulation and Coding Evaluation for Energy Efficiency |
| 3.3.1 | Modulation Techniques Comparison |
| 3.3.2 | Specification Summary (Cont'd) |
| 3.4 | Summary |
| 4 | FORWARD ERROR CORRECTION (FEC) DECODING |
| 4.1 | FEC Review |
| 4.1.1 | BCH and LDPC Theory |
| 4.1.2 | FEC Code Performance |
| 4.1.3 | Reconfiguration Process |
| 4.2 | FEC Architectures |
| 4.2.1 | Forward Error Correcting Codes |
| 4.3 | Power Comparison Related to the Modules of the FECs |
| 4.4 | Complementary Comparison of Hardware Energy |
| 4.5 | Summary |
| 5 | CROSS-LAYER ENERGY MODELING |
| 5.1 | Communication Aspects of PHY Layer in WBANs |
| 5.1.1 | |
| 5.2 | Channel Coding for WBANs |
| 5.2.1 | Energy Impact of the Protocols in MAC Level |
| 5.2.2 | Cross-Layer Energy Efficiency Model |
| 5.3 | Link Budget Influences |
| 5.3.1 | Link for the WBAN (Distance vs. TX/RX Power) |
| 5.4 | Benchmarking and Roadmapping |
| 5.5 | Enhanced Cross-Layer Proposal |
| 5.5.1 | Energy Efficiency Model (PHY Level) |
| 5.6 | Summary |
| 6 | CONCLUSIONS AND FUTURE WORK |
| 6.1 | |
| 6.1.1 | |
| 6.1.2 | |
| 6.1.3 | |
| 6.2 | Future Works |
| | REFERENCES |

#### 1 INTRODUCTION

A new stage of technological resources is emerging in the first decades of the 21st century.
Big Data and data analytics, new buzzwords that bring together several concepts, have a central role in a system where information unfolds over several domains, such as the Internet of Things (IoT), healthcare system integration, data fusion, and advanced sensors spread over smart cities or smart physical infrastructures. All these topics are interconnected to some degree. In parallel with this macro-context, technical advances in microelectronics, e.g., three-dimensional (3-D) integrated circuits (ICs) (VERHELST; DEHAENE, 2009), FinFET transistors below 10 nm, and many others, are available to drive new ways of processing information and to modify key aspects of the power consumption of the most advanced hardware, ICs in particular. Microelectronics has an essential role in this scenario: it develops key components of the devices that sense the available data, gather information over wireless and wired networks that operate seamlessly, and select, store and process it (GAMBINI et al., 2012). Although every single part of the whole system spends a very small amount of energy, the total energy required to keep such a widely distributed system operating without interruption reaches a considerable amount. Thus, ultra-low-power hardware components and the corresponding low-power communication systems are increasingly in demand.

Moreover, humanity has always strived for a better quality of life. A multidisciplinary workforce becomes more active every day across the academic and industrial worlds, in some cases with government support. One of the areas of greatest commitment is the preservation of human health, an area in which financial resources are heavily spent and to which biomedical systems belong. This enables new, challenging and deep research in engineering, towards modern and smart healthcare systems.
Recent advances towards better biomedical systems are under study, as addressed in many academic surveys (CHEN et al., 2011), (ULLAH et al., 2012), (MOVASSAGHI et al., 2014). As a result, the establishment of standards and amendments by the IEEE is synchronized with current research efforts. New wireless technologies applied specifically to human interfaces, medical communication systems, and interaction possibilities stimulate research in several related areas and the exploitation of new energy efficiency techniques. Based on these IEEE standards, each work contributes to the effective deployment of human healthcare and associated activities.

The reduction of energy consumption can be studied at different levels, starting at the source that generates the information in the human body, up to the information sink, and including all paths and communication channels through which data propagates. The self-reconfigurable features of the communication system aim at an adaptation based on feedback from the physical communication channel regarding its energy, which affects the quality of the information sent or the demand for better quality of service (QoS). This feedback signal alerts the transceivers to the need for changes, which include variations in data rates, in channel coding schemes, and in selectable filter settings, all of which may change the communication throughput and quality. This awareness of the RF channel is directed at finding the best operating point under such variations and, at some moments, at fulfilling immediate needs.

Keeping these aspects in mind, this thesis analyzes the energy efficiency of parts of a communication system that could be used in the biomedical domain. A few energy optimization opportunities are the subject of the research. The transceiver block diagram of Figure 1.1 represents the architecture of the interface between the link and the physical layers.
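The feedback-driven reconfiguration described above can be illustrated with a minimal sketch. It assumes the receiver reports an SNR estimate and the transceiver picks the least redundant channel-coding setting that still meets a threshold. The names `LinkConfig`, `CONFIGS`, `select_config` and `relative_airtime`, as well as every SNR threshold, are hypothetical and are not taken from the thesis or from IEEE 802.15.6. BCH(63,51) appears in the thesis; BCH(63,45) and BCH(63,39) are other standard length-63 BCH codes used here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkConfig:
    """One selectable channel-coding setting of the transceiver."""
    name: str
    code_rate: float   # k / n: useful bits per transmitted bit
    min_snr_db: float  # hypothetical lowest SNR at which this setting is acceptable


# Ordered from least redundant (cheapest) to most redundant (most robust).
# The SNR thresholds are illustrative placeholders, not values from the thesis.
CONFIGS = (
    LinkConfig("uncoded", 1.0, 12.0),
    LinkConfig("BCH(63,51)", 51 / 63, 9.0),
    LinkConfig("BCH(63,45)", 45 / 63, 7.0),
    LinkConfig("BCH(63,39)", 39 / 63, 5.0),
)


def select_config(snr_db: float) -> LinkConfig:
    """Return the least redundant setting whose threshold the reported SNR meets."""
    for cfg in CONFIGS:
        if snr_db >= cfg.min_snr_db:
            return cfg
    return CONFIGS[-1]  # channel worse than every threshold: use the most robust code


def relative_airtime(cfg: LinkConfig) -> float:
    """Transmitted bits per useful bit, a rough proxy for relative TX energy."""
    return 1.0 / cfg.code_rate


if __name__ == "__main__":
    for snr in (15.0, 10.0, 6.0, 2.0):
        cfg = select_config(snr)
        print(f"SNR {snr:4.1f} dB -> {cfg.name}, airtime x{relative_airtime(cfg):.2f}")
```

The sketch deliberately ignores decoder power and retransmissions, which a cross-layer model would have to weigh against the extra airtime of a stronger code.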
In this research we focus on wireless body area networks (WBANs), following the IEEE 802.15.6 standard (IEEE, 2012). The research concerns the PHY layer integrated with the MAC sublayer, to exploit the benefits of such a cross-layer model, considering applications in the biomedical field.

Figure 1.1 – Transceiver Block Diagram. Source: (ROCHOL, 2012), modified by the author.

In Fig. 1.1, the LLC, MAC, CC, and CD blocks stand for, respectively, the Logical Link Control, the Medium Access Control, the Channel Coding, and the Channel Decoding. In particular, the channel coding and decoding (CC/CD) blocks will be addressed at the logic circuit design level in this thesis, to model their energy efficiency.

## 1.1 Motivation

Currently, energy efficiency is a very relevant topic for many new technologies. Low-power operation in RF circuits has always been pursued, and it has now reached an inflection point characterized by the move into the ultra-low-power range. The new advances in wireless communication systems, combined with the constant evolution and challenges of integrated circuit (IC) design, provide the motivation for the topic of this Thesis, in which the characteristics of the IR-UWB system and of its circuits will be assessed and researched. An example is the evolution and scalability of digital CMOS designs, which benefit directly from the geometry scaling that has marked successive CMOS generations over the years. More recently, low-power CMOS design techniques became an indispensable feature of these designs. Moreover, the increased demand for digital signal processing integrated into a single chip, or system-on-a-chip (SoC), has resulted in a great need for new designs and research on applying low-power CMOS to communication systems. Even bio-engineering technology has benefited from the low-power characteristics of CMOS circuits.
For energy-efficient system design, it is desirable to develop and master a model that represents the energy consumption across the first two layers (ISO, 1988) and that translates the impact of any microelectronic technology node into the system. Such a model needs parameterization and scalability, and this motivates the Thesis to tackle energy consumption at different levels of the IR-UWB communication system. A particular application scenario is also needed to feed critical model parameters. For this reason, medical system applications are in the background of the motivation for the global objective of this work.

The UWB frequency range and the intermittent nature of the communication, as well as its desired low power, are good properties for a system applied in biomedical sensor instrumentation. The high capability of UWB to penetrate biological tissues, which allows non-invasive methods, is also advantageous. Impulse Radio (IR) can provide the necessary efficiency and simplicity in the use of and access to the spectrum, and under some circumstances it is the best option currently in use (LECOINTRE; DRAGOMIRESCU; PLANA, 2008). In the IR technique, very short pulses are transmitted, spending relatively little energy. IR is inherently simple in the design of the TX, since it is carrier-less. The challenge in IR communication is that the RX has to perform a non-synchronous detection of such short pulses, which presents timing and multi-path effects that have to be dealt with in the RX front-end design. A significant number of academic and industrial investigations on UWB use this technology, usually known by the term IR-UWB (OPPERMANN; HAMALAINEN; IINATTI, 2004). From the year 2000 onwards, much research has been devoted to the development of IR-UWB circuits and systems.
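To make the claim that very short pulses spend relatively little energy more concrete, the following sketch estimates the energy of one pulse and the resulting average radiated-signal power at a given pulse repetition frequency. It is only an illustration under stated assumptions: the pulse is a Gaussian monocycle (first derivative of a Gaussian), and the 1 V peak, 50 Ω load, 100 ps width parameter and 1 MHz repetition rate are arbitrary values chosen for the example, not figures from the thesis. All function names are hypothetical.

```python
import math


def monocycle(t, sigma, amplitude=1.0):
    """Gaussian monocycle scaled so that its peak magnitude equals `amplitude` (volts)."""
    return amplitude * (t / sigma) * math.exp(0.5 - t * t / (2.0 * sigma * sigma))


def pulse_energy(sigma, amplitude=1.0, load_ohm=50.0, n=4001, span=8.0):
    """Energy (joules) of one pulse into a resistive load, by trapezoidal integration."""
    t0 = -span * sigma
    dt = 2.0 * span * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        v = monocycle(t0 + i * dt, sigma, amplitude)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * v * v
    return total * dt / load_ohm


def average_power(energy_per_pulse, prf_hz):
    """Average signal power (watts) when one pulse is sent every 1/prf_hz seconds."""
    return energy_per_pulse * prf_hz


def duty_cycle(pulse_width_s, prf_hz):
    """Fraction of time during which a pulse is actually on the air."""
    return pulse_width_s * prf_hz


if __name__ == "__main__":
    sigma = 100e-12   # assumed pulse width parameter (100 ps)
    prf = 1e6         # assumed pulse repetition frequency (1 MHz)
    e_pulse = pulse_energy(sigma)
    print(f"energy per pulse: {e_pulse:.3e} J")
    print(f"average power:    {average_power(e_pulse, prf):.3e} W")
    print(f"duty cycle:       {duty_cycle(6 * sigma, prf):.1e}")
```

For this pulse shape the integral has the closed form $A^2 e \sqrt{\pi}\,\sigma / (2R)$, which can be used to check the numerical result. The figure covers only the signal energy delivered to the load, not the power drawn by the pulse generator or the rest of the transmitter.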
Furthermore, there are several open optimization possibilities in the literature, ranging from a simple digital circuit design up to a complex organization of intra-layer blocks, bringing to the surface all the concepts that offer advantages in terms of ultra-low power use. These advantages can be illustrated by a practical case, such as the delayed replacement of pacemakers in patients owing to the extended lifetime of their batteries, as a result of the use of energy-efficient circuits.

Although the application addressed in this thesis is the WBAN using UWB, the energy model is also intended to be suitable for non-medical areas (*e.g.*, in the military area). Therefore, this wireless sensor network paradigm serves a variety of medical and non-medical applications with the respective adjustments, but only IR-UWB for WBAN is addressed in depth here.

As a case study, Fig. 1.2 shows an example of an outdoor application, where the architecture follows a star topology of nodes (devices on the runner and the smartphone), monitoring in real time and directing the information to a medical monitoring center. The devices coupled to the runner include sensors and RF transceivers, which have a convergence point to receive and coordinate all data (the Hub).

Figure 1.2 – Example of WBAN's architecture – network for human signal monitoring. Source: the author.

The collected information travels through the networks until arriving at the agent that will carry out the analysis, which in this example could be a medical monitoring center. The remote individual information, such as Oxygenation $(SpO_2)$, Electrocardiogram (ECG), Electroencephalogram (EEG), Electromyogram (EMG), and Blood Pressure, is then available to the monitoring center anywhere on the globe.

Resource sharing minimizes or even avoids the traffic overload generated by applications in a complex network, such as smart surveillance using Internet of Things sensors, as shown in (KUNST et al., 2018).
To deal with a heterogeneous network scenario (*e.g.*, smart cities and military surveillance such as border security), some requirements are essential: for instance, keeping delay, interference, and jitter within their thresholds, ensuring the QoS of real-time video applications, using a low-overhead control mechanism, balancing the cost-benefit of spectrum allocation, and respecting the time constraints of handover operations. Figure 1.3 represents an example.

The IEEE 802.15.6 standard deals with body area networks, or wireless communication for personal networks, with the purpose of short-range monitoring and communicating actions to actuators, if and when necessary. This standard mainly regulates the techniques for medical applications involving the most critical physiological signals, as a way to get better results in diagnosing diseases and alterations. Although not restricted to biomedical applications, in a broader sense a WBAN can be seen as a wireless sensor network (WSN). Moreover, WBAN research is a current topic, whether to achieve reliability, better energy efficiency over a long time of operation, or even to meet security requirements. Thus, new concepts can be leveraged for the purpose of this research. Finally, these bidirectional and interconnected topics bring by themselves more motivational strength.

Figure 1.3 – Network for a Smart City with IoT sensors around. Source: the author.

## 1.2 Objectives

The establishment of high-level energy models is one of the objectives of this work. This will be done assuming UWB communication for the WBAN.
Several aspects are addressed, and these include: i) the effects of the UWB pulse shapes generated at the TX in terms of energy efficiency, as an impulse radio (IR-UWB); ii) the hardware of the FEC decoder analysed at the CMOS logic design abstraction; iii) the evaluation of the Wake-Up Radios (WUR), the interaction between the PHY-MAC protocols, and the Front-End power consumption; iv) the link budget and its assessment over the MAC performance. The two main layers (PHY and MAC) of the cross-layer architecture of the WBAN are addressed for such purpose. Additionally, the peer-to-peer communication made by IR-UWB baseband transceivers in a star topology (IEEE, 2012) is the basis to configure the communication system context for a body-area environment.

As the thesis addresses issues of digital communications and links them to microelectronics design, the focus is on the PHY level of the system. The FEC hardware is where CMOS digital logic design is applied, from the FEC algorithm down to the FEC logic netlist, where the corresponding power equations and estimated values can be determined with high precision by the EDA tools. In the application scenario, the thesis considers the energy model of an IR-UWB transmitter. Furthermore, an overview of the algorithms and their protocols for energy efficiency at the MAC level is presented in this work as a part of the system, as well as the power consumption at the operation level, necessary for communication and information exchange. A set of energy benchmarks and related works from the literature are assessed as references. The final power budget of the whole system is obtained through theoretical and analytical evaluation.

In summary, the research questions which comprise the body of the results in this thesis are:

1. Determine which waveforms and modulation variations for the UWB transmission have the best performance, analyzing parameters such as the coding efficiency, payload impact, throughput, and the gain from the use of theoretical prolate spheroidal wave functions (PSWF).
2. Investigate the forward error correction (FEC) block, working with FEC coding (PHY layer) and VHDL synthesis possibilities to estimate the power for different code types and FEC parametric variations.
3. Verify the energy consumption of the baseband block using data from a synthesis of the VHDL code (checking the usual energy per bit/pulse). The resulting circuits will impact the global energy consumption model.
4. Model the communication between two transceivers, considering the channel and the link budget, leading to a new Energy Efficiency Model ($E_{eff}$-M) for the network.
5. Additionally, obtain the specifications for the energy efficiency model, based on MAC protocols and mainly on the operation duty cycles.

#### 1.3 Relevant Issues and Contributions

The main contribution is to define a new model based on the existing ones (KARVONEN; IINATTI; HÄMÄLÄINEN, 2015), replacing the instruction cycle components by the power consumption information obtained from the benchmarking and from the decoder hardware of the FEC. The contributions are aligned to the respective objectives, and the microelectronics design aspects are emphasized in the FEC hardware design for low power consumption:

- UWB waveforms and modulation techniques are explored and optimized for energy, contributing to an adjustable level of energy efficiency at both the TX and RX.
- Different implementations of UWB transceiver design are analyzed, with a comparison of their features, mainly the power, IC area, and operational frequency.
- The work is based on the WBAN architecture, thus the UWB transmitter is planned to be suitable for such a network and to consider the pros and cons due to the cross-layer interaction.
- The problem formulation is made to improve the energy model, where an assessment of the best synchronization between nodes and the hub is necessary.

Taking advantage of the technology scale independence of digital design, which allows it to be mapped onto different microelectronics technology nodes, some designs were explored in nodes such as 180 nm, 65 nm, or smaller. The FEC coding schemes and the IR-UWB perspectives are reviewed. The proposal places the IR-UWB as the core communication system for this kind of network. At this point, the Energy Efficiency Model begins to be modeled toward a cross-layer approach, considering the MAC and PHY interdependencies.

Improvements are investigated in a cross-layer energy efficiency model for the IR-UWB transmitter; applications in WBANs are modeled with the PHY layer as the main focus, aiming to maximize the energy efficiency. The adaptation of the IR-UWB to the spectrum and environment conditions is necessary for the communication to achieve the best results. Then, the energy evaluation for the system is directed to the hardware design, for the codecs designed in CMOS VLSI, and to the top level analysis.

## 1.4 Thesis Organization - Outline

This thesis is organized as follows: Chapter 2 presents an introduction to Impulse Radio (UWB context and regulation) and an overview of the IEEE 802.15.6 Standard, regarding wireless body area networks. To bridge the text towards those interested only in the data communication field, an introductory text about microelectronics and its design techniques is included in this chapter.

In Chapter 3, at the bottom level of the model, a Pulse Shape Analysis is made, and some digital designs exemplify the PHY layer. Possible energy gains due to specific UWB signal waveforms and time-frequency domains are considered. The circuit of the transmitter is detailed, and the proposed design is intended to be compatible with the modulation modes and techniques.
The concepts and theoretical basis for the IR-UWB are also placed in this chapter.

Forward Error Correction for decoding mechanisms is addressed in Chapter 4, as well as the conceptual review of the forward error correction methods for BCH and LDPC. Both FEC algorithms are considered in this chapter, two different FEC designs are addressed, and a power comparison among these designs is presented.

The convergence of results is presented in Chapter 5, highlighting the link budget influences and the communication aspects of the PHY layer and channel coding for WBANs, such as the power values from the benchmarking and the cross-layer energy protocols (some MAC layer procedures are addressed). At this point, the improvements on the energy model are presented, highlighting the impact on energy efficiency.

Finally, Chapter 6 presents the conclusions of this Thesis and lists possible future works.

#### 2 CONTEXTUALIZATION AND THE STATE OF THE ART

The IEEE working group for the standardization of WBANs published the IEEE 802.15.6 standard (IEEE, 2012) in February 2012. The purpose was to regulate networks composed of low-power devices: a short-range WSN applied to the human body area, taking signals to a single hub, which is, in other words, the coordinator or the sink of a network with several nodes. This chapter presents a quick overview of some aspects of this standard, as a way to view the problem addressed in this thesis.

The IEEE 802.15.6 Standard gives the possibility to choose between Narrowband (NB), Ultra Wideband (UWB), and Human Body Communications (HBC) as the PHY layer to be used in the WBAN. They are used to carry out the communication and determine how the access mechanism will work. The author chose UWB as the scope of this research, where microwave frequencies are employed with ultra-fast pulses, in a range from 3.1 GHz up to 10.6 GHz. The properties of these pulses can be exploited for the proposed IR transmitter design.
One of the advantages of this spectrum is the possibility of use in other applications, not limited to human body scanning. UWB was considered because, in the first decade of the 2000s, it was proposed as a better choice to deliver Industrial, Scientific, and Medical (ISM) signals. It has a complete set of work experiences spanning more than one decade. The present scenario is favorable to merging and consolidating knowledge on UWB (ULLAH et al., 2012). Other commercial technologies evolved over the years to compete for short-range communications, all based on narrowband communications, like Bluetooth Low Energy (BLE) and ZigBee, which today have a more significant impact in real-world applications than the proposed 802.15.6 with a UWB PHY.

Some potential pros and cons of UWB for communication are:

- IR-UWB has a simple implementation when compared with other types of radio hardware, especially at the RF front-ends.
- UWB has considerable multipath resistance, but it is susceptible to interference.
- UWB is considered by some as ideal for short-range communications, although such a claim has not led UWB communications to surpass the consumer-market penetration experienced by Bluetooth and 802.11 WiFi, for example.
- The pulsed nature of the UWB TX/RX can achieve both high and low (for power savings) data rates, i.e., throughput is easily adjustable.
- The duty-cycled nature of IR-UWB can cause synchronization issues, but they are easy to handle.
- The simplicity of UWB hardware implies a simpler RF front-end and subsequent signal processing, with the consequent reduction in energy consumption.

Good examples of applications using UWB are given in the literature, and include automotive guidance radar, RF Identification (RFID), penetration radar imaging, telemetry and location devices, surveillance and guidance systems, target tracking, and particularly others used in the military context.
Some other examples of ISM usage of UWB are satellite communications, such as global navigation (ANTREICH; NOSSEK, 2011), radar systems based on UWB signals, and soldier location during campaign arrangement and maneuvers, among others (GAMBINI et al., 2012).

Specifically, the concerns of this research converge on the use of IR-UWB in a WBAN context. In medical applications, the collection of biological signals and all of their data through monitoring devices allows viewing and controlling user conditions in specific scenarios. Networks dealing with biomedical signaling of human health conditions will be increasingly present outside hospital facilities as well. Thus, human-machine interfaces or even Machine-to-Machine (M2M) interfaces tend to be always used in the next generation of operational designs.

As a specific case of WSN, the WBAN scenarios follow a star topology, distributing the elements from an initial network configuration to slave nodes through the master node (the hub or coordinator). The correct coordination of message exchanges in the communication system keeps it running. Thus, direct frame exchanges occur between the hub and the nodes, and, eventually, a relay-capable node can be added. In this way another level of communication can work, and enlarging the network extension becomes possible using this two-hop star WBAN topology.

As mentioned before, in the global model of the IR-UWB baseband transmitter this work focuses on two layers: the PHY layer and the MAC sub-layer (of the Link Layer). Both are addressed in a cross-layer model. For that reason, the required MAC definitions will follow IEEE 802.15.6.
Notably, there are other standards that can be used as references in addition to IEEE 802.15.6, for instance: IEEE 802.15.4j-2013, an alternative physical layer extension to support medical body area network (MBAN) services operating in the 2,360 MHz – 2,400 MHz band, and IEEE 802.15.4k, physical layer specifications for low energy, critical infrastructure monitoring (LECIM) networks. Summaries of these WBAN standards exist in some works of the literature, *e.g.*, (BARRAS, 2010).

The specification for low data rate, low consumption, and very low hardware and architecture complexity was set by the IEEE 802.15.4 standards, developed for Wireless Sensor Networks. It is a low-power version of IEEE 802.11 for WLANs and the precursor of IEEE 802.15.6. There are also other sets of frequencies for biomedical engineering applications, *e.g.*, the 402-405 MHz range used for implants in the medical implant communication service (MICS). Therefore, these standards represent opportunities to be exploited in this field.

## 2.1 IR-UWB Communication

UWB radio frequencies are spread over a 7.5 GHz bandwidth, ranging from 3.1 GHz to 10.6 GHz. UWB has been considered since the early 2000s as one of the most useful choices for short-range communications. Consequently, ultra-low power transceiver design for UWB is a current topic in scientific studies, where shaping pulses of sub-nanosecond duration is a challenging circuit design task. Impulse radio for UWB (IR-UWB) transmission is carrier-less and has human body permeability as one of its properties, which makes it attractive in terms of energy consumption, versatility, and potential for short-range applications. It demands low-power and simple hardware implementations while imposing a fine time resolution, achievable in advanced complementary metal oxide semiconductor (CMOS) integrated circuits.
Ensuring low power consumption in IR-UWB is of paramount importance in current technology (GHASEMPOUR et al., 2012). Improved energy efficiency ($E_{eff}$) is attainable with a transmitter design which allows transmission parameters to self-reconfigure (OTT; EISNER; EIBERT, 2012). Traditional setups of such parameters use Gaussian- and square-like pulses. Throughout this work, the exploration of waveforms is extended by the use of Prolate Spheroidal Wave Functions (PSWFs) (NEVES et al., 2012), (SLEPIAN; POLLAK, 1961), in addition to the former ones. The block that produces PSWFs is typically linked to the modulation block in the transmitter, where the modulation techniques, e.g., pulse position modulation (PPM), on-off keying (OOK), or binary phase shift keying (BPSK), also aim to improve energy savings (NIEMELA; HÄMÄLÄINEN; IINATTI, 2013).

Therefore, the IR-UWB analysis will evaluate three basic classes of waveform implementations: (I) the Gaussian pulse and its derivatives, (II) PSWF functions/formats, and (III) the square pulse. The comparison metric will be the impact caused by each class of waveform on the $E_{eff}$ of the IR-UWB system.

As mentioned before, implementations of this UWB system are found in satellite communications, such as global navigation, radar systems, medical imaging, vehicle-embedded systems, surveillance and monitoring areas, radio frequency identification (RFID), and sports medicine, among others (GAMBINI et al., 2012). The energy concepts used for wireless body area network (WBAN) communications are extendable to related wireless communication systems, such as wireless sensor networks and electronic vests to monitor physiological activities, for example, the electrocardiogram (ECG) (CHÉTELAT et al., 2016).
Applications of IR-UWB in the context of WBAN, specified in the IEEE 802.15.6 standard (IEEE, 2012), deal with 250 kbps or 487.5 kbps, as is the case of e-health systems that are used to monitor human conditions through their biological and behavioral signals. It is essential to keep both values in mind for the simulations described in some of the chapters ahead.

## 2.1.1 UWB regulation

The regulations of the USA Federal Communications Commission (FCC, 2002) and of the European association for standardizing information and communication systems (ECMA organization), which issued the standards (ECMA, 2008a) and (ECMA, 2008b), are essential references for this subject. The FCC regulation provides a spectrum mask for UWB channels, and ECMA deals with the High Rate Ultra Wideband PHY and MAC Standard, including its interface. How to perform the most efficient integration among OSI layers is an issue left to the actual system implementation. The understanding of RF transmitter blocks contributes to such purpose.

A large number of documents dealing with UWB reproduce the mask determined by (FCC, 2002). It is specified in dBm per MHz, with the corresponding limit for each frequency. Figure 2.1 shows the limits of this mask, both indoor and outdoor. In most of the mask, the indoor limit overlaps the outdoor one, indicating that the effective isotropic radiated power is the same for both.

Figure 2.1 – FCC Mask and its limits (indoor in red color, outdoor in green color). Source: (FCC, 2002), modified by the author.

The development of the circuits, as mentioned in (WENTZLOFF, 2007), should then focus on the energy efficiency aspects.
The advantages need to be evaluated in terms of the energy consumption of the several IR-UWB implementations existing in the literature. New constraints can also bring new opportunities: for example, the recent 5G boundaries at 5.25 or 6 GHz fall inside the UWB range, which invalidates some implementations, while open issues in IR-UWB, such as interference and coexistence, remain to be solved.

On the other hand, differently from the FCC, the mask provided by the European ECC regulation is located in two separate bands, with operating frequencies of 3.1-4.8 GHz and 6-8.5 GHz. ECMA originally stood for "European Computer Manufacturers Association" and is now seen as an international body (BARRAS, 2010). In 2005, ECMA released two proposals for high-speed UWB standardization. The ECMA-368 standard (ECMA, 2008a), entitled "High Rate Ultra Wideband PHY and MAC Standard", was approved as the ISO/IEC 26907 International Standard in March 2007, specifying a distributed medium access control (MAC) sublayer and a physical layer (PHY) for wireless networks, suitable for high data rate communications between a diverse set of mobile and fixed electronic devices. It was followed by the ECMA-369 standard, "MAC-PHY Interface for ECMA-368" (ECMA, 2008b), later approved as ISO/IEC 26908, which specified the MAC-PHY interface for a high rate ultra-wideband wireless transceiver.

The European Telecommunications Standards Institute (ETSI, 2015) also uses two bands (from 3.1 GHz to 4.8 GHz, and from 6.0 GHz to 8.5 GHz), which differs from the FCC specification. The International Telecommunication Union - Radiocommunication Sector (ITU-R) (SECTION, 2015) is another body involved in these regulations.

## 2.1.2 IR-UWB spectral management

Short-range wireless communication working collaboratively in an *ad hoc* sensor network is a promising technology.
A way to implement that is to use, in the PHY layer, the IR-UWB or another UWB technique, like Multi-band Orthogonal Frequency Division Multiplexing (MB-OFDM), as shown in Appendix D, where the spectrum is divided into up to 14 separate bands of 528 MHz each. Meanwhile, UWB can be defined as a direct-to-antenna transmission for which the emitted signal bandwidth (BW) is around 500 MHz (the signal bandwidth being defined at a -10 dB reduction with respect to the peak of its power spectral density), or for which the signal has a 20% fractional bandwidth (SECTION, 2015) and (ETSI, 2015), which is the -10 dB BW divided by the band central frequency. UWB is also associated with radio equipment employing signals with a bandwidth higher than 1.5 GHz or, as an alternative, whose signals have a -10 dB intensity range spread over at least 25% of the spectrum.

The spectrum spread by the IR-UWB has its bandwidth located from 3.1 GHz up to 10.6 GHz, which reflects a major challenge: generating at the TX antenna ultra-fast pulses of sub-nanosecond duration, in a sequence as in Equation 2.1. This transmission mode has other features which make it attractive in terms of energy, versatility, and potential use. For instance, the IR-UWB does not use a carrier, the transmission can employ low power and simple hardware (without an RF mixer), and it has a very fine time resolution. The digital modulations in this context can be OOK (on-off keying) or PPM (pulse position modulation), considered later in this study.

$$\text{UWB}_{signal}(t) = \sum_{k=-\infty}^{\infty} \text{amplitude}_k \cdot \text{signal}_{waveform}(t - \text{delay}_k)$$ (2.1)

A summary of techniques for UWB signals is as follows. The IR-UWB can be divided into two different multiple access techniques, the Direct Sequence Ultra-Wideband (DS-UWB) and the Time Hopping Ultra-Wideband (TH-UWB). In MB-OFDM the spectrum has sub-bands with a bandwidth of 528 MHz each (ECMA-369).
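The bandwidth criteria quoted in this subsection can be checked numerically. The following is a minimal sketch, assuming the 500 MHz and 20% thresholds mentioned in the text and treating the band edges as the -10 dB points; the function names `fractional_bandwidth` and `is_uwb` are hypothetical and used only for illustration.

```python
def fractional_bandwidth(f_low_hz, f_high_hz):
    """Return the -10 dB bandwidth divided by the band central frequency."""
    bandwidth = f_high_hz - f_low_hz
    centre = (f_high_hz + f_low_hz) / 2.0
    return bandwidth / centre


def is_uwb(f_low_hz, f_high_hz, min_bw_hz=500e6, min_fractional=0.20):
    """Return True if either criterion is met.

    Assumption: thresholds of 500 MHz absolute bandwidth or 20% fractional
    bandwidth, with f_low_hz and f_high_hz taken as the -10 dB band edges.
    """
    bandwidth = f_high_hz - f_low_hz
    return (bandwidth >= min_bw_hz
            or fractional_bandwidth(f_low_hz, f_high_hz) >= min_fractional)


if __name__ == "__main__":
    # Full UWB allocation, 3.1 GHz to 10.6 GHz.
    print(fractional_bandwidth(3.1e9, 10.6e9), is_uwb(3.1e9, 10.6e9))
    # A narrowband ISM-like allocation, 2.4 GHz to 2.5 GHz, for contrast.
    print(fractional_bandwidth(2.4e9, 2.5e9), is_uwb(2.4e9, 2.5e9))
```

Under these assumptions, the full 3.1-10.6 GHz allocation satisfies both criteria, whereas a 100 MHz narrowband allocation around 2.45 GHz satisfies neither.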
The Orthogonal Frequency Division Multiplexing (OFDM) technique is used with both spread spectrum schemes, Direct Sequence Spread Spectrum (DSSS) and Time Hopping Spread Spectrum (THSS), which are generated by continuous sinusoidal waves modulated with a fixed carrier frequency. In contrast, the DS-UWB and TH-UWB are typically baseband signals and are composed of narrow UWB pulses. For standard purposes, the DS-UWB (the shaded row) is the main interest for the WBAN design, with bandwidths set at 499.2 MHz. Some characteristics of such signals are summarized in Tables 2.1 and 2.2.

Table 2.1 – Time Hopping Spread Spectrum and IR-UWB Characteristics.

| | Type | Characteristics | Bandwidth |
|-----------------|--------|--------------------------------------------------------------------------|----------------------------|
| Spread Spectrum | DSSS | Sinusoidal continuous waves are modulated with a fixed carrier frequency | >= 500 MHz |
| | THSS | | |
| IR-UWB | DS-UWB | Baseband signals and narrow UWB pulses | < 500 MHz (typ. 499.2 MHz) |
| | TH-UWB | | |

Similarities: 1) Advantageous use of a very large bandwidth. 2) Avoid effects of interference from signal sources.

Source: (HERINGER, 2007), complemented by the author.

Table 2.2 – Characteristics of UWB systems.
| Specifications | MB-OFDM | IR (DS-UWB) |
|------------------------|-----------------------------------|----------------------------------------|
| Number of Sub-bands | 3 mandatory, up to 14 | 2 (3.1 – 4.85; 6.2 – 9.7) GHz |
| Sub-band Bandwidth | 528 MHz | 1.75 GHz (lower band) |
| Number of Sub-carriers | 122 | Baseband Signals |
| Spreading Factor | 1, 2 | 1-24 |
| Data rates (Mbps) | 53.3, 80, 110, 160, 200, 320, 480 | 28, 55, 110, 220, 500, 660, 1000, 1320 |
| Modulation | QPSK | BPM (mandatory), MBOK |
| Multiple Access | Based on time-freq codes | Based on PN codes |

Source: adapted from (HERINGER, 2007).

Fig. 2.2 shows the trend of the bit error rate (BER) in the transmission according to the channel quality (i.e., the signal-to-noise ratio, SNR, at the RX), which indicates the behavior of the wideband system for two of the modulation techniques presented in Table 2.1 (DS-UWB and DSSS). The trend in the figure is for a single user. For multiple users, the UWB tends to keep the same behavior, and the DSSS presents improvements in terms of error overruns. The SNR can be translated further in terms of the energy that must be coupled to the TX antenna to achieve a given BER in the communication.

The European Commission (EC), through the European Telecommunications Standards Institute (ETSI), sets a limit of -41.3 dBm/MHz in the frequency bands 4.2 to 4.8 GHz and 6.0 to 8.5 GHz, extendable at the lower frequencies down to 3.1 GHz and up to 9 GHz in the higher band. Singapore, for instance, to stimulate the study and development of UWB usage, established a UWB friendly zone, relaxing the effective isotropic radiated power limit to -35.3 dBm/MHz from 2.2 GHz to 10.6 GHz.

The IEEE 802.15.3a proposal for high data rates has led to the multi-band MB-OFDM (14 bands of 528 MHz) UWB, a carrier-based communication protocol that divides the 3.1-10.6 GHz UWB spectrum.
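To give a sense of scale for the spectral density limits quoted above, the sketch below converts a limit in dBm/MHz into the total radiated power over one channel. It assumes, as an upper bound, a perfectly flat spectrum sitting exactly at the limit across a 499.2 MHz channel; the function names are hypothetical.

```python
import math


def band_eirp_dbm(psd_dbm_per_mhz, bandwidth_mhz):
    """Total EIRP in dBm for a flat PSD (assumption) over the given bandwidth."""
    return psd_dbm_per_mhz + 10.0 * math.log10(bandwidth_mhz)


def dbm_to_microwatt(power_dbm):
    """Convert a power level from dBm to microwatts."""
    return 10.0 ** (power_dbm / 10.0) * 1000.0


if __name__ == "__main__":
    channel_bw_mhz = 499.2  # channel bandwidth used for the WBAN design
    for label, limit in (("-41.3 dBm/MHz limit", -41.3),
                         ("-35.3 dBm/MHz relaxed limit", -35.3)):
        total_dbm = band_eirp_dbm(limit, channel_bw_mhz)
        print(label, round(total_dbm, 2), "dBm",
              round(dbm_to_microwatt(total_dbm), 1), "uW")
```

Under this flat-spectrum assumption, the -41.3 dBm/MHz limit corresponds to roughly -14.3 dBm (about 37 µW) over a 499.2 MHz channel, and the 6 dB relaxation raises this bound by about a factor of four. Real pulse spectra are not flat, so the usable power is lower.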
The 802.15.3a task group also produced a Direct Sequence (DS) UWB standard, supported by the UWB Forum, using a very narrow pulse (from 100 ps to 1 ns) and considering a low band between 3.1 and 5.15 GHz and a high-frequency band between 5.825 and 10.6 GHz (FERNANDES; WENTZLOFF, 2010).

Figure 2.2 – BER vs. SNR of wideband systems DSSS and UWB for a single user. Source: (GHAVAMI; MICHAEL; KOHNO, 2004).

The OFDM technique has been widely used in several types of wireless systems. Biomedical monitoring is being increasingly investigated, primarily in the WBAN context. An example of a digital modulator VLSI circuit intended to use these techniques is presented in Appendix D.

A vast number of competing technologies are in use in the worldwide ISM band around 2.4 to 2.5 GHz, in this case for narrowband transmissions. In this frequency management scenario two other important bands exist: the Medical Implant Communications Service (MICS) and the Wireless Medical Telemetry Services (WMTSs). The former has the frequency range of 402-405 MHz and is used for implant communications, whereas the latter is used for medical telemetry systems, from 420-870 MHz, also in the narrowband spectrum. It is worth mentioning that, in general, the communication authorities of a country are capable of regulating the available frequencies for WBANs.

## 2.1.3 UWB PHY Specifications

There are two modes of operation in the WBAN according to the 802.15.6 standard: the default mode and the high QoS mode; the latter is used only in high-priority medical applications. IR-UWB is mandatory in both default and high QoS modes, and FM-UWB is optional as PHY in the default mode. The PSDU for transmission is built with one scrambler, one interleaver, and the FEC encoder, adding pad bits to complete the block.
The Bose, Chaudhuri, and Hocquenghem (BCH) FEC encoder allows, in default mode, the use of a message with a length of 51 bits, generating an encoded message with a length of 63 bits; the maximum number of corrected bits in that case is 2, i.e., BCH(63,51,2). In the high quality, or simply high QoS, mode, the encoded message has a length of 126 bits for a 64-bit message, with the capability of correcting up to 10 bit-flips, ensuring higher robustness, i.e., BCH(126,64,10). The pad bits are added to align the block to a symbol boundary. Equation 2.2 gives the number of pad bits according to the cardinality of the constellation (M) of a given modulation scheme, according to the standard. $N_{CW}$ is the number of codewords, and each "N" refers to the corresponding number of bits in a packet. "n" and "k" are taken from the BCH(n,k,t) code, representing the lengths of the encoded message and of the original one, respectively.

$$N_{pad} = \log_2(M) \left\lceil \frac{N_{PSDU} + (n-k)N_{CW}}{\log_2(M)} \right\rceil - \left[ N_{PSDU} + (n-k)N_{CW} \right]$$ (2.2)

To proceed with the transmission, the PPDU bits are transformed into RF signals. For the high QoS mode, the BPSK/QPSK modulation schemes are chosen, and OOK is preferred for the default mode. The modulation scheme that is not the priority in each mode becomes optional in that mode. For both modes, wideband frequency modulation (FM-UWB) is optional. The number of pulses transmitted in one second is the pulse repetition frequency (PRF) parameter. The PRF and other timing parameters are set to produce the expected data rates for the OOK and DBPSK/DQPSK modulations. As an example of parameter settings that influence the data rate, the FEC code rate for high QoS must be set to 0.5; otherwise it is set to 0.81. Likewise, the number of possible pulse positions can be 32, and the waveform pulse duration can vary in the nanosecond range (2.003 ns - 64.103 ns).

Table 2.3 contains the low and high bands according to the IEEE 802.15.6 standard; a UWB device shall be able to transmit in at least one of the specified band groups.
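The pad-bit computation of Equation 2.2 can be illustrated with a short sketch. It reads the first bracket of the equation as a ceiling (otherwise the result would always be zero) and assumes that the number of codewords is $N_{CW} = \lceil N_{PSDU}/k \rceil$, which is not stated in the text above; the function names and the example PSDU length are hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def pad_bits(n_psdu, n, k, m):
    """Number of pad bits per Equation 2.2.

    n_psdu: PSDU length in bits; (n, k): BCH code lengths;
    m: constellation cardinality M, assumed to be a power of two.
    Assumption: N_CW = ceil(N_PSDU / k).
    """
    bits_per_symbol = m.bit_length() - 1  # log2(M) when M is a power of two
    if m < 2 or (1 << bits_per_symbol) != m:
        raise ValueError("M must be a power of two greater than 1")
    n_cw = ceil_div(n_psdu, k)
    coded_bits = n_psdu + (n - k) * n_cw  # payload plus parity bits
    return bits_per_symbol * ceil_div(coded_bits, bits_per_symbol) - coded_bits


if __name__ == "__main__":
    # Default mode code BCH(63,51), hypothetical 101-bit PSDU, 4-ary symbols.
    print(pad_bits(101, 63, 51, 4))
    # Code rates k/n of the two modes quoted in the text.
    print(round(51 / 63, 2), round(64 / 126, 2))
```

With a binary constellation (M = 2) the padding is always zero, since every bit count is already aligned to a symbol boundary; padding only appears for higher-order constellations.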
Figure 2.3 represents the scenarios in which short-range data communication systems are grouped in broad regions (in the energy vs. data rate graph) (MOVASSAGHI et al., 2014). It is to be noted that systems that comply with the IEEE 802.15.6 and 802.15.4 standards are very promising for sub-mW power and long battery life (months to years), in applications that can operate at very low data rates (below hundreds of kbps). The medical applications and the WBAN considered in this thesis require the communication system to stay in the bottom left extreme of Figure 2.3. This is the energy constraint that is the focus of this work.

Some authors (SAPUTRA, 2012) mention that energy efficiency improvements in autonomous applications such as implantable biosensors are required to compete with other radio schemes (*e.g.*, wake-up receiver or super regenerative transceiver). Fig. 2.4 depicts the context of power (in mW) versus the throughput of the application (bit/s). In this case, in contrast to the one shown in Fig. 2.3, several technologies are presented separately.

Table 2.3 – UWB operating frequency bands.

| Channel | Band | $f_{central}$ (MHz) | $f_{min}$ (MHz) | $f_{max}$ (MHz) |
|---------|------|---------------------|-----------------|-----------------|
| 0 | Low | 3494.4 | 3244.8 | 3744 |
| 1 | Low | 3993.6 | 3744 | 4243.2 |
| 2 | Low | 4492.8 | 4243.2 | 4742.4 |
| 3 | High | 6489.6 | 6240 | 6739.2 |
| 4 | High | 6988.8 | 6739.2 | 7238.4 |
| 5 | High | 7488.0 | 7238.4 | 7737.6 |
| 6 | High | 7987.2 | 7737.6 | 8236.8 |
| 7 | High | 8486.4 | 8236.8 | 8736 |
| 8 | High | 8985.6 | 8736 | 9235.2 |
| 9 | High | 9484.8 | 9235.2 | 9734.4 |
| 10 | High | 9984.0 | 9734.4 | 10234 |

Source: IEEE 802.15.6, modified by the author.

Figure 2.3 – Data Rate versus Energy/Battery Lifetime. Source: (MOVASSAGHI et al., 2014), modified by the author.

Figure 2.4 – Energy efficiency of various wireless (a) transmitter and (b) receiver.
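The regular structure of Table 2.3 can be reproduced from a few constants, as in the sketch below. It assumes a uniform 499.2 MHz channel bandwidth and takes the first central frequency of each band group from the table; the constant and function names are hypothetical.

```python
CHANNEL_BANDWIDTH_MHZ = 499.2
LOW_BAND_FIRST_CENTRE_MHZ = 3494.4   # channel 0 in Table 2.3
HIGH_BAND_FIRST_CENTRE_MHZ = 6489.6  # channel 3 in Table 2.3


def uwb_channel(channel):
    """Return (band, f_central, f_min, f_max) in MHz for channels 0 to 10."""
    if 0 <= channel <= 2:
        band = "Low"
        centre = LOW_BAND_FIRST_CENTRE_MHZ + CHANNEL_BANDWIDTH_MHZ * channel
    elif 3 <= channel <= 10:
        band = "High"
        centre = HIGH_BAND_FIRST_CENTRE_MHZ + CHANNEL_BANDWIDTH_MHZ * (channel - 3)
    else:
        raise ValueError("channel must be in the range 0..10")
    half = CHANNEL_BANDWIDTH_MHZ / 2.0
    # Round to 0.1 MHz to hide floating-point noise.
    return band, round(centre, 1), round(centre - half, 1), round(centre + half, 1)


if __name__ == "__main__":
    for ch in range(11):
        print(ch, *uwb_channel(ch))
```

The computed upper edge of channel 10 is 10233.6 MHz, which appears rounded to 10234 MHz in Table 2.3.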
## 2.2 Standard Overview (IEEE Std 802.15.6)

The communications regulated by IEEE Std 802.15.6 occur in the vicinity of, or inside, a human body. The standard is titled IEEE Standard for Local and metropolitan area networks - Part 15.6: Wireless Body Area Networks. However, because of the multitude of possible components and devices, the standard does not specify their type, their format, or how they operate. The requirements of each application and engineering development guide that choice (it could be a sensor application, *e.g.*, for IoT), as well as other key communication system specifications, for example the selected spectrum. The standard details some essential aspects: the general framework elements, the MAC frame formats and functions, and the three PHY specifications, with which the security service is intrinsically associated. This section highlights some of these aspects, as an overview focused on UWB, ISM, and MICS. The standard serves the cross-layer modeling purpose of this work well, because the first two layers of the network (PHY and MAC) are the ones covered by the proposed analysis. The three PHY layers already mentioned are supported by the MAC level according to this standard.

Increasing the useful life of each node in a WBAN is a significant goal currently pursued by researchers, because it directly impacts the energy required for the operation of the whole system. Moreover, battery life is key to many WBAN systems in health monitoring. Thus, new low-power techniques for the energy-constrained sensor nodes are extremely desirable. The standard contributes in this direction by formatting the packets into unique frames while guiding the access mechanism. The relationship between energy and data packet length is direct. A low-overhead protocol is important for low-power communications. The power used is proportional to the number of bits per second required for communication following the standard.
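The direct relationship between energy and packet length can be sketched with a first-order model in which the radio draws a constant power while it transmits, so that the energy of a frame is the power multiplied by the frame airtime. The model, the function names, and the numbers used below are illustrative assumptions, not values from the standard.

```python
def frame_energy_joules(payload_bits: int, overhead_bits: int,
                        data_rate_bps: float, power_watts: float) -> float:
    """Energy to send one frame: constant power multiplied by the frame airtime."""
    airtime_s = (payload_bits + overhead_bits) / data_rate_bps
    return power_watts * airtime_s


def energy_per_payload_bit(payload_bits: int, overhead_bits: int,
                           data_rate_bps: float, power_watts: float) -> float:
    """Energy charged to each useful bit; it grows with the protocol overhead."""
    total = frame_energy_joules(payload_bits, overhead_bits,
                                data_rate_bps, power_watts)
    return total / payload_bits


if __name__ == "__main__":
    # Hypothetical link: 1 Mb/s, 2 mW while transmitting, 1000-bit payload
    for overhead in (0, 100, 500):
        e = energy_per_payload_bit(1000, overhead, 1e6, 2e-3)
        print(f"overhead={overhead:4d} bits -> {e * 1e9:.2f} nJ per payload bit")
```

Under these assumptions, every overhead bit costs as much energy as a payload bit, which is why a low-overhead frame format matters for low-power operation.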
Therefore, for each PHY layer, it is essential to define frame formats, functions, modulations, and security services as a general specification. All of them are detailed by default within certain limits, and the structure of IEEE Std 802.15.6 is characterized by logically organized data packets that allow hierarchical, nested information to be built. Because of the direct relationship between dissipated energy and the organization of data units in a communication environment, the WBAN standard proposes a hierarchical, self-contained packet structure, with the following data packet names and acronyms: MAC protocol data unit (MPDU), MAC service data unit (MSDU), physical-layer protocol data unit (PPDU), physical service data unit (PSDU), UWB physical protocol data unit (UWB PPDU), start frame delimiter (SFD), clear channel assessment (CCA), physical layer convergence procedures (PLCP), the physical header (PHR), and synchronization header (SHR) (IEEE, 2012). This arrangement of MAC structures determines how the useful data payload is communicated, and such data framing has a substantial impact on energy consumption.

The standard defines the PPDU and the MPDU as data structures that allow communication in the network, where the PPDU includes the MPDU binary frames as well as information obtained from RF signals. Appendix C details the assembly of the packet formats as they are used throughout this work, for simulations, and to obtain the results. The medium access protocols used in the MAC for the WBAN offer three channel access modes: beacon mode with superframes (the beacon periods of equal length bound the superframes), non-beacon mode with superframes, and non-beacon mode without superframes. In general, since less energy is spent with fewer transmissions and data retransmissions, the third option is the most energy efficient.
The wireless communication addressed in this thesis is characterized by its short range, in the vicinity of, inside, or over the human body, but it is not limited to this environment (other possible targets include isolated IoT sensors or multiple-sensor networks). It uses existing ISM bands as well as frequency bands approved by the authorities of each country. In addition, regulation and the standard can be adjusted with respect to the main priority levels (eight in all). The data rates depend on the technique: $57.5 \sim 971.4$ Kb/s (NB); $202.5 \sim 15,600$ Kb/s (UWB); and $164 \sim 1,312.5$ Kb/s (HBC). Other aspects addressed are QoS and, as a special case, an extremely low power operation mode, which is highly desired in the applications targeted in this work. Table 2.4 presents how the standard organizes traffic according to its importance over the communication channel. Users with the highest priority level have a higher probability of occurrence and of success in data transfer. Algorithms exist, *e.g.*, slotted-Aloha, that tend to guarantee the data flow according to the use probability. Finally, the structure presented by the standard shows the arrangement of entities (Hub and Nodes throughout the network) responsible for the communication and data transfer. Fig. 2.5 illustrates part of this context. The protocol is designed to support QoS, reliability, very low power at low data rates, data rates up to 10 Mbps, and non-interference requirements.
The protocol details are beyond the scope of this work, although it is noted that this network needs to work with small error rates to avoid unnecessary power consumption caused by repeated retransmissions. Also, some protocols can be designed to coordinate the communication between network entities, *e.g.*, Node(s) and Hub via PHY layer convergence protocol (PLCP) or logical link control (LLC), as a way to optimize energy consumption.

Table 2.4 – User Priority in WBAN.

| User Priority | Traffic designation | Probability |
|---------------|---------------------------------|----------------|
| 0 (lowest) | Background (BK) signaling | 0.0625 - 0.125 |
| 1 | Best effort (BE) | 0.0937 - 0.125 |
| 2 | Excellent effort (EE) | 0.0937 - 0.25 |
| 3 | Controlled load (CL) | 0.125 - 0.25 |
| 4 | Video (VI) | 0.125 - 0.375 |
| 5 | Voice (VO) | 0.1875 - 0.375 |
| 6 | Medical data or network control | 0.1875 - 0.5 |
| 7 (highest) | Emergency or medical event | 0.25 - 1 |

Source: IEEE Std 802.15.6 (IEEE, 2012).

Figure 2.5 – Hub and Node WBAN Protocol. Source: IEEE Std 802.15.6 (IEEE, 2012) and (ROCHOL, 2012), modified by the author. (The figure shows the Node and Hub stacks side by side, each with MAC SAP, LLC, MAC sublayer, PHY SAP, PLCP, and PHY sublayer with NB, UWB, and HBC options, together with a management entity; peer layers are linked by the LLC, MAC, PLCP, and PHY protocols.)

#### 2.2.1 Data Communication in UWB

The modulation schemes supported by the IEEE 802.15.6 IR-UWB are on-off keying (OOK), differential binary phase shift keying (DBPSK), and differential quadrature phase shift keying (DQPSK). Moreover, the standard supports three operational PHYs, of which the UWB and human body communication PHYs are mandatory, while the narrowband PHY is optional. Hence, the modulation schemes used in this work focus on UWB only; in a broader investigation, modulation techniques different from those proposed by IEEE Std 802.15.6 could be considered to improve the overall energy efficiency.
#### 2.2.2 Comparison of the Power in Transceivers for Data Communication

In a low energy network, for instance in IoT, the primary requirement is energy autonomy over long periods of time, and the use of low data rates for monitoring is therefore recommended. Bluetooth Low-Energy (BLE) is the technology most commonly used by the industry today to meet this requirement, and intensive research is devoted to developing dedicated BLE TX-RX chips for this purpose. Power values of the front-end modules of a BLE receiver are shown in Table 2.5, taking as reference a particular chip design in 28 nm CMOS. The total of 2.75 mW is far larger than the power expected to be consumed by the digital processing, especially in the baseband radio partitions.

Table 2.5 – Power Consumption in a BLE receiver front-end components.

| Receiver Module | Power (mW) |
|-----------------------|------------|
| DCO, including Buffer | 0.4 |
| Buffer, Divider | 0.4 |
| Divider | 0.2 |
| AD Converter | 0.25 |
| DT IF | 0.8 |
| LNA | 0.7 |
| Total | 2.75 |

Source: (KUO et al., 2016).

An example of the power consumption at 2.4 GHz of a multi-mode BLE transceiver for WBAN is given in Table 2.6. It shows a transceiver for biotelemetry applications working under a 1 V supply and consuming 5 mA. The chip was fabricated in a 130 nm CMOS technology, with a die area of $5.9 \ mm^2$. There are differences with respect to IR-UWB, e.g., in the modulation and in the circuit implementation. The reference power values for this WBAN design, with DPSK modulation, around $5.9 \ mW$ and $12 \ mW$ (for RX and TX mode, respectively), are useful for comparative purposes throughout this work.

Table 2.6 – Power Consumption of a BLE Transceiver.
| Mode | Frequency (GHz) | Condition / Modulation | Power (mW) |
|------|-----------------|------------------------|-----------------|
| RX | 2.4 | without (with) ADC | 4.8 (6.5) |
| TX | 2.4 WBAN | DPSK | $5.9 \sim 12.3$ |
| TX | 2.4 BLE | GFSK | $4.6 \sim 14.5$ |
| TX | 0.9 | FSK | $1.7 \sim 2.5$ |

Source: (WONG et al., 2013).

## 2.3 Digital Integrated Circuit Design based on Standard-Cells

The design of the digital parts of the receiver and transmitter follows industry-standard design methods. The design of the FEC decoders for the receiver is an example of the application of CMOS digital standard-cell design. The analog and RF parts of the receiver front-end follow other methods, specific to analog and RF CMOS design. The complex CMOS system-on-chip (SoC) design methods needed to integrate the RF front-end, the baseband circuits, and the baseband processor in the same chip are beyond the scope of this work. In all parts of this microelectronic system, power and performance trade-offs are design objectives under constant evolution in research and development, all dealing with the design of nanoscale on-chip components. Integrated circuits are therefore closely linked to what can be achieved in power reduction, and system designers need to map the state-of-the-art ICs in order to explore the system design space that complies with the communication system requirements. In this thesis, this link is investigated through an energy model that can be applied to the IR-UWB system as a whole. The constant evolution of CMOS processes is heading toward dimensions of a few nanometers, using FinFET transistors, e.g., 14 nm, 10 nm, and, even more recently, breaking the barrier down to 7 nm or 5 nm CMOS.
These advanced technologies, based on FinFETs, are being used for the digital parts of the design and are key for current products such as memories (DRAM and NVM), multi-core CPUs, FPGAs, and complex, very high performance baseband processors for 4G and 5G public wireless networks. RF front-ends, and especially IoT devices that require extremely low cost, tend to be designed in more mature technologies, with minimum linewidths larger than those of the aforementioned FinFETs. Some examples of submicron CMOS foundry processes available in the market are the following nodes: 180 nm, 130 nm, 65 nm, 45 nm, 32 nm and, in some cases, 28 nm or 22 nm CMOS. For the design of the FEC decoders of the UWB receiver, this work adopts a 65 nm CMOS technology, as it appears to be a good compromise between the low-cost requirement of IoT devices, the technical requirements of the RF TX-RX front-end, and the logic density and low power consumption required from the digital parts of the SoC.

As previously mentioned, microelectronic devices (*e.g.*, sensors, IR-UWB transceivers, etc.) are used in a WBAN to monitor biological signal activity in the vicinity of or over the body, or through implanted devices. Continued operation of those components over long periods is a necessary feature for these particular targets. As the CMOS circuitry becomes more power-efficient, the longevity of these devices increases. Low-power digital techniques help to achieve this purpose.

#### • Low Power Techniques:

The device density increases due to CMOS scaling and to novel fabrication and design techniques. In general, higher densities improve the IC performance, whereas the power consumption reaches higher levels as more logic gates are integrated in the system. Dynamic and static power consumption in the CMOS circuitry are the strategic points on which to act for energy optimization.
The switching activity of the signals at each internal circuit node (e.g., at diode or transistor terminals) is responsible for the dynamic power consumption. This power increases with the number of gates in use, and it increases linearly with the frequency at which a particular node switches over time. The capacitive component of dynamic power increases quadratically with the DC supply voltage. Advanced CMOS nodes (below 65 nm) use supplies at or under 1.2 V, mostly even below 1.0 V. The static leakage, in turn, has a strong temperature dependence; it depends on the technology node, through its geometry and how short the MOSFET channel length is; and it depends linearly on the supply voltage. Decreasing the MOSFET widths by design reduces static power (or off-state leakage), but the leakage power becomes proportionally higher from one CMOS node to a finer node, especially in the digital logic portion of the SoC.

In summary, a list of design methods is incorporated in CMOS digital design, so that designers rely on a complex toolbox of design steps to achieve lower static and dynamic power consumption for a given function. Some of those design techniques, e.g., substrate biasing (for leakage reduction), clock gating (for dynamic power reduction), dynamic supply voltage control, and many others, can be very sophisticated, and a full explanation of them is beyond the scope of this work. Lower power consumption is also enabled by IC fabrication techniques used in CMOS, which require modifications in the process steps and in the dielectric and metal film characteristics.

#### **2.3.1** Some Low-Power Techniques

In the digital designs developed and explained in Chapter 4 and in Appendix D, different low-power CMOS technologies are used. The FEC decoder is designed with 65 nm transistors, and the OFDM block of Appendix D with 180 nm transistors.
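The dependencies of dynamic power on frequency and supply voltage discussed in this section are commonly summarized by the first-order switching-power expression $P_{dyn} = \alpha C V_{DD}^2 f$, where $\alpha$ is the switching activity, $C$ the switched capacitance, $V_{DD}$ the supply voltage, and $f$ the clock frequency. The sketch below evaluates this expression. The symbol names and all numerical values are hypothetical, chosen only to expose the linear dependence on frequency and the quadratic dependence on supply voltage; they do not describe the circuits designed in this work.

```python
def dynamic_power(activity: float, capacitance_f: float,
                  vdd_v: float, freq_hz: float) -> float:
    """First-order CMOS switching power in watts: alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


if __name__ == "__main__":
    # Hypothetical block: 10% activity, 1 nF switched capacitance, 1.2 V, 100 MHz
    base = dynamic_power(0.1, 1e-9, 1.2, 100e6)
    half_supply = dynamic_power(0.1, 1e-9, 0.6, 100e6)
    double_clock = dynamic_power(0.1, 1e-9, 1.2, 200e6)
    print(f"baseline     : {base * 1e3:.2f} mW")
    print(f"half supply  : {half_supply * 1e3:.2f} mW")
    print(f"double clock : {double_clock * 1e3:.2f} mW")
```

In this model, halving the supply voltage cuts the switching power to one quarter, while doubling the clock frequency only doubles it, which is why supply scaling is such an effective low-power lever.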
A MOSFET transistor, the most important active device in a semiconductor wafer, has its layout defined by the spatial distribution of the gate, source, and drain, over a bulk or over a doped well on top of a semiconductor substrate. The primary design relation of this layout is the ratio between "W" (width) and "L" (length) of each transistor. More precisely, the effective electrical channel length "L" is given by the actual distance between the lateral diffusions of the source and drain doped regions, measured after fabrication. The physical "W" after fabrication can differ slightly from the CAD layout view, due to lithography effects, to encroachment of the isolating field oxide (RABAEY; CHANDRAKASAN; NIKOLIC, 2008), or to other trench isolation effects related to the actual chip fabrication. A key aspect to notice in each process is the minimum length allowed in the fabrication of the MOSFET transistors. The transition frequency, a measure of how quickly transistors can switch, improves as the transistor L decreases. Faster switching is also related to the reduction of the capacitive loads. However, drain-to-source subthreshold leakage currents also increase as the channel lengths become smaller.

LEAKAGE CONTROL - The switching power depends on the charging and discharging of capacitances and on current switching in inductive loads. The static dissipation, or leakage power dissipation, depends on three factors: (a) reverse-biased diodes, (b) source-to-drain subthreshold currents in the MOSFET channels, and (c) tunneling of carriers from the semiconductor surface to the gate electrodes under DC bias. Since the leakage occurs mainly because of the gate and channel constitution, scaling influences and determines the static power consumption in MOS circuits. Notably in CMOS nodes below 90 nm, the subthreshold and tunneling currents become a design concern.
In these cases, the gate tunneling leakage current increases as the gate oxides become thinner. To reduce this tunneling, all CMOS processes below 45 nm use a high-k gate dielectric, i.e., one with a dielectric constant higher than that of silicon dioxide.

LOW-POWER LIBRARIES - The use of low-power logic cell libraries in digital CMOS design is a method to reduce the power consumption. Those libraries use transistors with threshold adjustment and with channel lengths slightly tuned to reduce the DC power consumption, namely the so-called leakage power. Specific libraries for each foundry, and full-custom designs with low-power methods, can minimize the overall power consumption at the same performance levels expected from the SoC. The designer can choose which libraries will be used in the design, e.g., the standard library with standard logic cells; the low-power optimized logic cells, which have a higher threshold voltage or a thicker gate oxide; or the high performance cells, which have higher current capability, are faster, and consume more power than circuits designed with the former libraries. Full-custom designs, not entirely based on standard-cells, can explore the speed and power trade-offs further, at the cost of a longer design cycle. All designs done in this work used the low-power standard-cell libraries provided either by the foundries or by design companies under foundry-specific licenses.

CLOCK GATING - In a combinational circuit, switching activity is caused by input changes, which occur less often than the switching in a sequential circuit. The flip-flops are responsible for this difference, because under normal conditions the clock is always enabled, even when no change needs to propagate to the cascaded circuit. The logic path of a sequential circuit therefore shows more intense switching and consumes more power. Other components of the clock distribution network with the highest toggle rates are the buffers, which carry a high current flow to minimize clock delay.
A solution is provided by the clock gating technique, used at the architecture level or at the digital gate level of the design, in which the transmission of the clock signal to parts of the circuit that do not need it is avoided, reducing in particular the power lost in the flip-flops and buffers of the clock distribution network of certain hardware blocks. The idea is to selectively shut off the clock circuits during periods of inactivity of the logic blocks, under a logic condition that can be identified by the system and logic designers themselves, or even identified automatically by the logic synthesis tool that takes the RTL description of the circuit as input.

SUBSTRATE BIASING - In cutoff mode, the current from the drain to the source of a MOSFET device is minimal. In deca-nanometer devices, however, as the channel length is very small, the sub-threshold leakage current makes a significant contribution (of the order of a few to tens of nano-amperes per transistor) to the DC dissipation. With this technique, the well of a PMOS transistor biases its body to a voltage higher than Vdd, and the substrate of an NMOS transistor biases its body to a voltage lower than Vss. Briefly, the well voltage controls (reduces) the subthreshold leakage currents and, consequently, the required DC power. Operation at sub-threshold voltages has also been studied in research such as (STANGHERLIN, 2013) (supply voltages in the range of 200 mV to 400 mV). This technique reduces the power consumption by exploiting the trade-off between power (both DC and dynamic) and performance.

#### 2.3.2 Digital Circuit Design Method: the Standard-cell Approach and Tools

From a top-level or system view, file management is necessary to design correctly with the digital tools in the synthesis processes of the design flow. The designer has to control the development versions and to select the useful commands of the input scripts, the order of tool processing, and the organization of the design directories or folders.
In Chapter 4 the modules were designed at the gate level for implementation. The physical flow for producing the ASIC (Application-Specific Integrated Circuit) is outside the scope of this work, but it is included in the overview in this Section to give a complete view of the design steps necessary to reach functional silicon. Several hardware architectures have been developed to produce IPs (Intellectual Property) for FEC encoding and decoding, which are addressed later in Chapter 4.

The design method reviewed here is based on standard-cells (Std-cells), in which the gates are defined beforehand (grouped in technology libraries). Such Std-cells facilitate the logical and physical implementation of the ASIC according to a digital design flow, which is defined by a sequence of design steps based on description files, scripts, and subsequent design analysis with EDA tools. Electronic Design Automation (EDA) tools comprise software applications that facilitate the designs, follow the pre-defined rules, and incorporate the model files specific to the CMOS technology used. Some design flows explore specific features such as low-power CMOS, others analyze the integrity of electrical signals under operation, and others verify the design against the likelihood of defects during fabrication (i.e., they verify the Design for Manufacturability).

The methodology to implement a microelectronic design is tied to the tool and to the technological procedures, which depend on the Process Design Kit (PDK) or the library. The specification files of the whole process can be grouped into four categories:

- *Register Transfer Level (RTL)*: contains the HDL code (VHDL, .vhd, or Verilog, .v); it is a high-level description that determines how the circuit works and from which the std-cells (logic gates) are organized in the netlist files.
- *Constraints (SDC)*: this file ("Synopsys Design Constraints") bounds the timing and some design intent aspects used for power analysis, such as the wire load model.
For sequential circuits, the maximum target clock frequency is defined using the .sdc file.
- *Simulation (SIM)*: the simulation files comprise the RTL file list (.f), the testbench used (.vhd), and the scripts for some requirements, *e.g.*, net delays (.sdf/.cmd).
- *Synthesis (TCL)*: these files (in "Tool Command Language") can be used in both logical and physical synthesis. The script (.tcl) contains a set of commands for the tool to synthesize the design, automating the procedures executed in the EDA tools to achieve the best results.

The files used in a design flow, independently of the tool or company that provides it, have the purpose of describing the physical characteristics and specifying the area, power, and timing of the std-cells. The group of files related to a PDK for an ASIC is:

- *LIB (.lib)*: this file has the values of area, power, timing, temperature, and voltage for each of the std-cells.
- *Tech files (.tf / .tch and .ict)*: characterization of the interconnects, with capacitance and resistance tables. The ".ict" file is generated from the ".tch" file; it has the same information, but in a different format. The ".ict" file generates the capacitance table file (.captbl), and the ".tch" file is used in the power-rail analysis \*.
- *Std-cells HDL (.v or .vhdl)*: file that contains the description of the digital elements in the Verilog or VHDL language. Generally, all the circuits of the CMOS design are described in a set of files, with scalability (technology node independence) as the main advantage of this description at the register-transfer level (RTL).
- *LEF (.lef)*: this is the Library Exchange Format file, which includes the physical characteristics used for the synthesis, mainly resistance and capacitance (RC). The geometric specification of the cells of the layout (outline and thickness) is also present in this file.
- *CapTable (.captbl)*: the purpose of this file is the accurate modeling of RC parasitics and interconnections, based on the ".lef" and ".ict" files.
- *GDSII (.gds)*: the Graphic Design System file contains all the technical information needed to manufacture the design.
- *LibGen files (.cl)*: they describe the physical characteristics of the technology for Rail Analysis\* in a specific format.
- *CDB (.cdb)*: this file is used during the physical implementation for Signal Integrity Analysis\*\*, to verify the crosstalk effects.
- *Spice Netlist (.sp)*: a file describing the Std-cells for Layout versus Schematic comparison (LVS), used by the Spice tool.

The cross-talk effect occurs when there is capacitive or inductive coupling (or both) between wires, degrading the signal propagation.

<sup>\*</sup> Rail Analysis is the verification of the quality of the distribution of the supply voltage through the wires and vias of the circuit.

<sup>\*\*</sup> Signal Integrity Analysis is performed to verify the timing and possible cross-talk between signal paths in the final circuit layout.

Fig. 2.6 presents the steps for logic synthesis in a design flow. The tool reads the input files and, after each step with design verification, generates the logic netlist (listing only std-cells or macroblocks), the accurate simulation reports, and the files needed to proceed later with the physical synthesis.

Figure 2.6 – Design Flow for the Logic Synthesis. Source: the author. (Blocks shown in the figure: Read Files; Elaborate the Design; Constraints; Synthesize to Generic; Synthesize to Mapped; Insert DFT; Optimize, with Clock Gating optimizations; Output Files Generation. Inputs: Library (.lib, .lef, .cptbl), HDL (.v, .vhd), Tool Scripts, and Constraints (.sdc). Outputs: reports on time, area, power, and gates, and the .sdf, .spef, .load, and .v files.)

Fig. 2.7 represents the flow followed by the physical synthesis to implement the design in a file that contains all the information for manufacturing (GDSII file).
Therefore, the design flow follows these steps:

- *Floorplanning*: in this stage, the dimensions are distributed and adjusted in the layout; placing the macros or sub-modules of the project in the right positions is mandatory. Floorplanning takes into account some geometrical constraints of the design. Conceptually, the macros are hardware Intellectual Properties or pre-designed hardware elements that are available for the layout. Similarly, the pads are placed in the layout. All these procedures aim at better timing results and an optimized integration between the electronic elements of the IC;
- *Powerplanning*: the distribution of the power supply lines in the layout is planned. Metal rings (VDD and VSS) are created around the core of the layout. The lines of metal Layer 1 are distributed in the layout to connect the VDD and VSS pins of the std-cells. As a consequence, and in favor of a homogeneous distribution across the circuit, the necessary horizontal and vertical metal stripes are created along the entire core of the layout;
- *Placement*: in the placement stage, the components are arranged in the layout. The logical interconnections between the std-cells determine the placement, in order to achieve timing results that respect the frequency constraints;
- *Clock Tree Synthesis (CTS)*: in this stage, the clock signals are routed to all the flip-flops of the circuit core. The insertion of buffers and/or inverters is necessary in order to reduce the skew or, at least, to balance it. The skew effect occurs when a clock signal from the same source arrives at different times at the components of the IC, causing signal differences and delay problems.
- *Routing*: this is the stage where metal wires and vias are placed in the best positions to connect the existing elements.
After CTS, the logical connections defined in the netlist file determine the paths for the electrical interconnection of the std-cells, the I/O pins, and the other macros of the design; the metal wires and vias implement exactly these paths.
- *Metal Fill Insertion*: some unused spaces in the layout are filled with dummy blocks in the physical design stage. This technique reduces the variation of the dielectric thickness, improving the uniformity of the metal density and the planarity of the final product. The capacitance must remain unaltered after this process.
- *RC Extraction*: the parasitic extraction verifies how much the layout of the project affects the electrical characteristics of the circuit. The layout introduces parasitic elements, such as capacitances and resistances between components, which generate diverse and often undesirable effects.

The Verification Flow, also in the right part of Fig. 2.7, is organized as follows:

- *Timing Analysis*: this analysis verifies whether all timing constraints have been respected by the synthesis tool. Typically, it runs in parallel with the physical synthesis stages, and the tool allows the results to be consulted at any moment.
- *Signal Integrity Analysis*: this verification checks for interference between adjacent components, i.e., how the electrical signal in one path can modify the signal of an adjacent path (crosstalk effects), sometimes interfering with the timing and violating the specifications of the design.
- *DRC/LVS Verification*: the Design Rule Check (DRC) verifies whether the design rules have been respected, according to the technology node used by the fabrication process. The Layout versus Schematic (LVS) analysis validates whether the netlist obtained after logic synthesis corresponds schematically to the final layout, with 100% of the connections made.
- *Power/Rail Analysis*: this verification is a power analysis that checks with precision the physical positions and interconnections of the elements in the circuit layout. The powerplanning generated by the previous process is also verified by the Rail Analysis in this stage.
- *Formal Verification*: the netlist generated from the final layout undergoes the same verifications applied to the netlist obtained from the logic synthesis. In formal verification, the device under test (the designed IC) is checked exhaustively against the rules, within the limits imposed by its size, since the complexity grows exponentially with the size of the netlist.
- *Functional Verification*: similarly to formal verification, the verifications performed on the netlist from logic synthesis are repeated. Different devices (for example, transistors, gates, and state elements of the circuit) are verified regarding their functional operation, following the netlists (the analysis abstracts a logically correct gate and a state primitive model). The RTL model is also compared to the logic model for functional verification.
- *Signoff*: this is the final verification before the tapeout (when the GDSII file is sent to manufacturing). It repeats the previous checks and fills in a checklist to assure a solution without rule violations and a layout that meets all specifications and requirements.

Figure 2.7 – Physical Synthesis Flow of the Design. Source: the author.

As a complement to this work, the digital design flow was exercised in the design of an OFDM modem. This design is described in Appendix D, as it relates to a modulation scheme adopted in relevant communication systems nowadays and is an option in the standard focused on here. Appendix F provides a quick survey and overview of the most current and advanced CMOS technology, based on FinFET transistors, used for silicon IC design in the most advanced products such as memories and processors.
#### 2.4 Related Works

This section presents a selection and overview of current works regarding the communication system of interest, the IR-UWB. The goal is to provide a specification for the developments in the following chapters. These works introduce cross-layer modeling and FEC concepts, and they serve as references for the main topics of this thesis. This review supports the cross-layer energy model of IR-UWB, which must still meet the requirements of the standard of interest.

#### 2.4.1 **Cross-Layer Reference Work**

The need to gather and process sensor signals is increasing very rapidly, ushering in the era of IoT systems. The system monitoring resources have to be reliable when applied to WBAN systems that deal with health monitoring or critical infrastructure monitoring. Today, the expansion of technology for human health is receiving great research attention, and it is a key application for IoT growth. One of the main topics is the energy necessary to keep such a system working. Novel technologies in sensors, microelectronics, optical networks, and other fields now allow forms of data gathering, processing, and transmission that were unthinkable years ago, often while fulfilling the energy consumption requirements set by the regulations.

One of the primary references for modeling two layers of the communication system is (KARVONEN, 2015). This author presented useful data, and the underlying philosophy is aligned with the approach sought in this thesis. Some of the data used to derive results from the cross-layer models, in this work and in the reference, are also based on the standard studied. Table 2.7 contains some communication parameters used by Karvonen *et al.* (KARVONEN; IINATTI; HÄMÄLÄINEN, 2015) in their work. They serve as the reference to constrain the parameters of this study, leading to new formulations and to a new model appropriate for the applications focused on herein.
Table 2.7 – Key Parameters for an IR-UWB WBAN system.

| Parameter       | Description                             | Value  | Unit |
|-----------------|-----------------------------------------|--------|------|
| $BW$            | bandwidth                               | 499.2  | MHz  |
| $f_c$           | central frequency                       | 3993.6 | MHz  |
| $N_{cpb}$       | number of chips per burst               | 16     | -    |
| $T_p$           | pulse duration                          | 2      | ns   |
|                 | integration time per pulse              | 2      | ns   |
| $R$             | uncoded data rate                       | 0.975  | Mbps |
| $P_{tx,circ}$   | transmitter circuitry power consumption | 2      | mW   |
|                 | receiver power consumption              | 20     | mW   |

Source: (KARVONEN; IINATTI; HÄMÄLÄINEN, 2015).

Table 2.7 lists a transmitter circuitry power consumption (2 mW) that differs from the transmitter power consumption for the RF signal (37 $\mu$W). These parameters are defined by the authors closely following the corresponding ones in IEEE Standards 802.15.4 and 802.15.4a. The circuitry power takes into account the internal processing, I/O and memory control, and other baseband components of the equipment. The receiver power is higher than the transmitter power because the reception is continuously active.

#### 2.4.2 BCH Reference Work

The BCH FEC algorithm was the first selection in the standard. This subsection gives an overview of key works on BCH FEC. A specific tool is introduced by (JAMRO, 1997), which generates the gate-level description of any BCH codec with a correction capability from 3 to 10 bits. This approach resembles a parameterized High Level Synthesis (HLS) process, yielding an RTL description. The tool operates over finite fields of the form GF(2<sup>m</sup>) to generate the VHDL scripts corresponding to the BCH codes. The decoding process is broken down into three separate steps: syndrome calculation, the Berlekamp-Massey algorithm (BMA), and the Chien search. For the BMA, the serial option with inversion and the parallel inversion-less option are considered. The VHDL templates are produced from a "C-like Program" after choosing the design parameters and the structure, aiming at a subsequent synthesis.
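As a small illustration of the first of the three decoding steps named above (syndrome calculation), the sketch below evaluates the syndromes of a received word over GF(2<sup>4</sup>). It is not the tool of (JAMRO, 1997) and not the code adopted by the standard: the toy BCH(15,7) code with a 2-bit correction capability, the primitive polynomial $x^4 + x + 1$, and all function names are assumptions chosen only to keep the example short.

```python
# Toy syndrome calculation for a binary BCH(15,7) code, t = 2.
# Assumed (illustrative) parameters: GF(2^4) built from x^4 + x + 1,
# generator polynomial g(x) = x^8 + x^7 + x^6 + x^4 + 1.

PRIM_POLY = 0b10011      # x^4 + x + 1
N = 15                   # code length, 2^4 - 1
T = 2                    # correction capability in bits
GEN_POLY = 0b111010001   # g(x); bit i holds the coefficient of x^i


def build_exp_table():
    """Return [alpha^0, ..., alpha^14] as 4-bit integers."""
    table, value = [], 1
    for _ in range(N):
        table.append(value)
        value <<= 1                  # multiply by alpha
        if value & 0b10000:          # reduce modulo the primitive polynomial
            value ^= PRIM_POLY
    return table


EXP = build_exp_table()


def syndromes(received):
    """Evaluate r(alpha^j) for j = 1..2t; `received` is a 15-bit integer."""
    result = []
    for j in range(1, 2 * T + 1):
        s = 0
        for i in range(N):
            if (received >> i) & 1:
                s ^= EXP[(i * j) % N]   # add alpha^(i*j); addition in GF(2^m) is XOR
        result.append(s)
    return result


codeword = GEN_POLY                  # g(x) itself is a valid codeword
corrupted = codeword ^ (1 << 5)      # flip the bit at position 5

print(syndromes(codeword))           # all zeros for a valid codeword
print(syndromes(corrupted))          # non-zero values signal an error
```

For a single error at position $i$, the first syndrome equals $\alpha^i$; the subsequent steps (BMA and Chien search) operate on these syndromes to locate the errors.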
The main goals of the proposed tool are to manipulate input and output files, to save design time, and to obtain hardware as efficient as hand-made circuits. That tool approach is used throughout this thesis to produce the BCH RTL descriptions for the synthesis results.

A novel multi-channel BCH decoder optimized for common error cases was the topic of (DILL, 2015), and the aim of that architecture is to reduce the hardware area. The BCH hardware employs shared decoding blocks, which increases the overall performance while still saving area. The area reduction ranged from 47% to 71% when compared with a traditional multi-channel implementation, and the dynamic power consumption was reduced by 44% to 59%. The throughput and the lifetime of NANDs are increased using this approach, reducing the number of errors to correct. The straightforward extension of that work is to create reduced Chien solver units, in addition to increasing the complexity of the decoder pipeline stages due to the additional arbitrators. A fully configurable decoder architecture still needs further investigation, considering the Chien search, the root solver, and the error polynomial generator proposed.

## 2.4.3 A Low-power IR-UWB CMOS Transceiver

The development of a standalone wireless integrated IR-UWB transceiver was presented by (BARRAS, 2010), which uses a diversity strategy for reliable communication with frequency multiple access (carrier-based IR-UWB, a hybrid model). A 0.18 $\mu$m CMOS technology is employed in that work to build the transceiver integrated circuit. The UWB frequency ranges covered by this radio device are centered at 3 GHz and 5 GHz, with a precision of around 20 MHz. The main components assessed by the author (BARRAS, 2010) are the voltage-controlled oscillator, the phase-locked loop and, in the front-end of the receiver, the Low-Noise Amplifier (LNA).
All of these components, among others, target a low-power CMOS-based IR-UWB radio-frequency transceiver.

UWB signals can be classified as Impulse-radio (IR) or Carrier-based (CR). The former is transmitted in the baseband and, usually, the transmitter generates short electrical impulses that are coupled to the antenna without resorting to frequency conversion (or mixing). As a carrier-less method, it requires a very low duty cycle, and the shape of the short pulses is a challenge to control. The IR-UWB has great potential for ultra low-power and low data-rate communication. The latter, carrier-based, has the characteristics of heterodyne radios, with multiple carriers to modulate, and requires better spectral accuracy.

The work presented in (BARRAS, 2010) highlights some features desired for their IR-UWB transceiver, such as bit rates between 1 and 10 Mbps, low power consumption at maximum data rate (less than 50 mW), robustness and reliability in dense multipath environments, robustness against strong narrowband interference (*e.g.*, Bluetooth interferers), and a very small form factor (simplifying the use with a battery and a single antenna).

From (BARRAS, 2010), Table 2.8 shows the power consumption of the transmitter in terms of current; the unused blocks are not taken into account in the power budget. Two signalling schemes are used in the TX. In the LRPM (low-rate pulsed modulation) signalling scheme, isolated short rectangular pulses are generated in the baseband, filtered, and up-converted to RF, with BFSK applied to them. The PMCW is the phase modulated continuous wave, where the phase modulation is applied directly to the CW RF signal, which reduces the energy per bit by about 10% for a lower data rate (1 Mbps vs. 5 Mbps).

Table 2.8 – TX Power consumption overview.
|                 | LRPM                         | PMCW                   |
|-----------------|------------------------------|------------------------|
|                 | 5 Mbps                       | 1 Mbps in Burst        |
|                 |                              | $f_0 = 4.5$ GHz        |
| PLL             | 23 mA (100 %)                | **2.8 mA** (1.2 %)     |
| Baseband        | 5.8 mA                       | 2.2 mA                 |
| - input stage   | 1.7 mA                       | -                      |
| - shaping filt. | 1.9 mA                       | -                      |
| - buffers       | 1.5 mA                       | 1.5 mA                 |
| - bias+other    | 0.7 mA                       | 0.7 mA                 |
| Modulator &     | 1 mA (10 %)                  | **0.35 mA** (3.2 %)    |
| output amp.     | ($\approx$ 17 mA pk)         | ($\approx$ 17 mA pk)   |
| Total TX        | 29.8 mA                      | 5.35 mA                |
|                 | $\approx$ 10.7 nJ/b          | $\approx$ 9.7 nJ/b     |
|                 | 29.8 mA x 1.8 V = 53.64 mW   | **9.63** mW            |

Source: (BARRAS, 2010).

The total current of the transmitter and the approximate energy per bit in Table 2.8 are given for a supply voltage of 1.8 V. The current consumed by each block of the device is listed, and the PLL for carrier generation presents by far the highest power consumption. As a result, the power consumption of the circuit reaches 53.6 mW at peak, and with a duty cycle of 16 % the power consumption was 15.1 mW. Those figures are for a 5 Mbps data rate. For this work, which was among the state of the art until the year 2009, some aspects taken into consideration were the topology, architecture, CMOS technology, supply voltage, as well as modulation scheme, carrier specs, power amplifier used, and compliance (with FCC or ECC).

# 2.4.4 Initial Specification Summary

Table 2.9 presents a first approach to the desired specification of a low-power transceiver to be used in WBAN communications, based on (IEEE, 2012) and on literature surveys.

Table 2.9 – IR-UWB WBAN specifications.
| Parameter           | Range               | Unit   |
|---------------------|---------------------|--------|
| Distance            | up to 10            | m      |
| Throughput          | up to 10            | Mbps   |
| Latency             | 10                  | ms     |
| Scalable packets    | from 10 to 1        | Kbits  |
| Effective BER       | less than $10^{-5}$ | -      |
| Precision           | 10                  | cm     |
| Energy per bit (TX) | less than 1         | nJ/bit |
| Energy per bit (RX) | less than 10        | nJ/bit |

Source: (IEEE, 2012) and Surveys.

Energy values below 1 nJ/bit (TX) and 10 nJ/bit (RX) are very difficult to achieve, even in state-of-the-art ICs.

## 2.5 Summary

This Chapter presented a brief overview of UWB communications, followed by the highlights of the standard, which selects UWB as one possible implementation strategy for the PHY. The review pointed to some characteristics that deserve consideration for understanding the work in the next chapters, linking the IEEE 802.15.6 Std and the developments proposed in the thesis.

Briefly, the purpose of this Chapter was not to reproduce the standard in detail, but to highlight key aspects of the MAC sublayer and the PHY layer that must be considered in a new energy-efficient model. Other points of these layers will be covered in the next chapters to lay the basis for an IR-UWB for WBANs. Key aspects were presented, as they are useful in terms of energy savings and opportunities to be exploited according to the standard.

In other words, this Chapter introduced Impulse Radio communications, as well as the WBAN standard, as background information. The Integrated Circuit design field, the design method, and the digital flows were briefly explained, aiming to show the energy savings possible in the implementation technologies: the CMOS circuits and the low-power design techniques used in this work. Also, some reference works on IR-UWB TX-RX were addressed, indicating the direction in which this research was developed.
#### 3 UWB PULSE SHAPING AND MODULATION

This Chapter discusses the performance analysis of short-pulse waveforms, which is a crucial element for energy analysis in a communications system, and also investigates the modulation schemes that are relevant in the context of the IR-UWB transceiver to be modeled. The energy efficiency is evaluated considering the multitude of pulse shapes for IR-UWB in WBANs, as well as the resources and techniques used for such purpose (DE ÁVILA et al., 2016).

# 3.1 Pulse Shaping

Pulse generation in IR-UWB mostly aims at compliance with the FCC mask in the UWB frequencies. Several pulse shapes can be analyzed for suitability to this mask, from the basic square pulse, to Gaussian pulses, to more complex waveforms such as the Gaussian derivatives. Table 3.1 shows the equations of such pulses in the time and frequency domains. In the equations, $\sigma^2$ is the variance parameter of the pulse, "t" is the time and "f" the frequency.

Table 3.1 – The first five derivatives of Gaussian pulse

| n | g(t) | G(f) |
|---|------|------|
| 1 | $-\frac{t}{\sqrt{2.\pi}.\sigma^3}.e^{-\frac{t^2}{2.\sigma^2}}$ | $2.j.\pi.f.e^{-2.(\pi.\sigma.f)^2}$ |
| 2 | $-\frac{1 - \frac{t^2}{\sigma^2}}{\sqrt{2.\pi}.\sigma^3}.e^{-\frac{t^2}{2.\sigma^2}}$ | $-4.(\pi.f)^2.e^{-2.(\pi.\sigma.f)^2}$ |
| 3 | $\frac{3.t - \frac{t^3}{\sigma^2}}{\sqrt{2.\pi}.\sigma^5}.e^{-\frac{t^2}{2.\sigma^2}}$ | $8.(j.\pi.f)^3.e^{-2.(\pi.\sigma.f)^2}$ |
| 4 | $\frac{3 - 6.\frac{t^2}{\sigma^2} + \frac{t^4}{\sigma^4}}{\sqrt{2.\pi}.\sigma^5}.e^{-\frac{t^2}{2.\sigma^2}}$ | $16.(\pi.f)^4.e^{-2.(\pi.\sigma.f)^2}$ |
| 5 | $\frac{-15.t + 10.\frac{t^3}{\sigma^2} - \frac{t^5}{\sigma^4}}{\sqrt{2.\pi}.\sigma^7}.e^{-\frac{t^2}{2.\sigma^2}}$ | $32.j.(\pi.f)^5.e^{-2.(\pi.\sigma.f)^2}$ |

Source: (EMAMI, 2013).
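The closed forms of Table 3.1 can be cross-checked numerically. The sketch below is illustrative only: it assumes that g(t) in the table denotes the n-th time derivative of the unit-area Gaussian $\frac{1}{\sqrt{2.\pi}.\sigma}.e^{-\frac{t^2}{2.\sigma^2}}$, and it relies on the identity $g_n(t) = (-1)^n.He_n(t/\sigma).\sigma^{-n}.g_0(t)$, where $He_n$ is the probabilists' Hermite polynomial. The function names and the value of $\sigma$ are hypothetical.

```python
import math


def hermite_e(n, x):
    """Probabilists' Hermite polynomial He_n(x) via the three-term recurrence."""
    prev, curr = 1.0, x                      # He_0 and He_1
    if n == 0:
        return prev
    for k in range(1, n):
        # He_{k+1}(x) = x * He_k(x) - k * He_{k-1}(x)
        prev, curr = curr, x * curr - k * prev
    return curr


def gaussian_derivative(n, t, sigma):
    """n-th time derivative of the unit-area Gaussian pulse."""
    g0 = math.exp(-t * t / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)
    return (-1) ** n * hermite_e(n, t / sigma) / sigma ** n * g0


def table_row_1(t, sigma):
    """First derivative, written as in row n = 1 of Table 3.1."""
    envelope = math.exp(-t * t / (2.0 * sigma ** 2))
    return -t / (math.sqrt(2.0 * math.pi) * sigma ** 3) * envelope


def table_row_4(t, sigma):
    """Fourth derivative, written as in row n = 4 of Table 3.1."""
    envelope = math.exp(-t * t / (2.0 * sigma ** 2))
    poly = 3.0 - 6.0 * t ** 2 / sigma ** 2 + t ** 4 / sigma ** 4
    return poly / (math.sqrt(2.0 * math.pi) * sigma ** 5) * envelope


SIGMA = 0.5e-9   # assumed value (0.5 ns), for illustration only
for t in (-1.0e-9, 0.2e-9, 0.7e-9):
    print(t, gaussian_derivative(1, t, SIGMA), table_row_1(t, SIGMA))
    print(t, gaussian_derivative(4, t, SIGMA), table_row_4(t, SIGMA))
```

Each printed pair should agree to within floating-point precision; the same helper can generate higher orders for comparison against a spectral mask.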
Figure 3.1 – Example of spread spectrum on the IR-UWB. Source: (EMAMI, 2013)

Equation 3.1 corresponds to a UWB signal that is the sum of impulses over the whole duration of the communication. "$T_s$" represents the sampling (or repetition) time between pulses, and "$\tau_0$" is the delay with respect to the reference time of each pulse. This equation allows the data set to be correlated with the generated waveform coding. "$a_n$" is the amplitude of the n-th pulse, which varies from pulse to pulse and, in a practical sense, can be expressed in volts. The instantaneous power and the energy of each impulse are proportional to the square of the amplitude. The power spectrum curves generated by this equation are shown in Fig 3.1 for the Gaussian pulses (GP); in that plot, "n" corresponds to the derivative order of the GP, from "1" to "4".

$$UWB_{signal}(t) = \sum_{n=-\infty}^{\infty} a_n.pulse(t - n.T_s - \tau_0)$$ (3.1)

Equation 3.1 is well known and appears, with variations, in many works, for example in Ott *et al.* (OTT; EISNER; EIBERT, 2012). That work analyzes modulation schemes, impulse shapes, and data rates, based on an IR-UWB variable topology, and obtains Power Spectral Density (PSD) curves that fit the theoretical model for a 4-ary PPM. Some innovative aspects of that work are the switching scheme to choose the sub-band and the analog filter modeling using its own antenna.

The power spectral density (PSD) of a UWB system is generally considered to be extremely low. The PSD, Eq. 3.2, is the ratio between "P", the transmitted power in watts (W), and the bandwidth (BW) of the signal in hertz (Hz); it is expressed in watts per hertz (W/Hz). The BW of a UWB system is very wide compared to the BW of a narrowband signal. As a typical UWB example, for a bandwidth of 7.5 GHz and a transmitted power of 1 mW, the corresponding PSD is approximately $1.33 \times 10^{-13}$ W/Hz (GHAVAMI; MICHAEL; KOHNO, 2004).
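The ratio of Eq. 3.2 is simple to evaluate numerically. The short sketch below uses the 1 mW and 7.5 GHz figures quoted above; the function names are hypothetical, and the conversion to dBm/MHz is added only because regulatory masks are usually expressed in that unit.

```python
import math


def psd_w_per_hz(power_w, bandwidth_hz):
    """Average power spectral density, PSD = P / BW, in W/Hz."""
    return power_w / bandwidth_hz


def psd_dbm_per_mhz(power_w, bandwidth_hz):
    """Same quantity expressed in dBm/MHz."""
    mw_per_mhz = (power_w * 1e3) / (bandwidth_hz / 1e6)
    return 10.0 * math.log10(mw_per_mhz)


P_TX = 1e-3    # transmitted power: 1 mW
BW = 7.5e9     # bandwidth: 7.5 GHz

print(f"{psd_w_per_hz(P_TX, BW):.3e} W/Hz")
print(f"{psd_dbm_per_mhz(P_TX, BW):.2f} dBm/MHz")
```

Repeating the calculation with the same power and a narrowband bandwidth of a few MHz shows how the wide UWB bandwidth lowers the density by several orders of magnitude.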
$$PSD = \frac{P}{BW} \tag{3.2}$$

The parameters of the Gaussian pulse can be varied, so that each pulse shows a very different temporal shape, from narrow to wide, and can be delayed around a reference point in time, as shown in Fig. 3.2. The Gaussian pulse waveforms are depicted with variations over normalized time and amplitude.

Figure 3.2 – Gaussian pulse with variations. Source: the author.

A very narrow pulse can also be called an impulse, since it concentrates all the energy within a limited time interval, or even within a limited portion of the spectrum and time (according to Parseval's theorem). The two terms are treated as interchangeable here for convenience, since Impulse-Radio is already established in the technical literature. A mathematically ideal impulse is not achievable in a real IR-UWB system, and several works (cited in the previous Chapter) increasingly assume that CMOS circuits can generate narrow pulses, in terms of energy amount, with durations of tens or hundreds of picoseconds.

The trade-off between time and frequency is an attractive research challenge that can be approached with a myriad of new techniques. For example, discovering how current CMOS technology can produce picosecond impulses while reaching the optimal point for the application is a good research challenge, especially considering the wideband antennas that need to be integrated in the same chip package and the related power losses at the TX.

Another example of the Gaussian pulse and its spectral correspondence is shown in (GHASEMPOUR et al., 2012). The variables are component terms used for the generation of such waveforms. The first order derivative can also be expressed as follows.

Figure 3.3 – Plots of square wave and its spectrum. Source: the author.

Other authors work with
higher order derivatives; for example, (EMAMI, 2013) presents the $5^{th}$ order derivative, which can be exploited to match the FCC mask (FCC, 2002). In the same direction, the $7^{th}$ order derivative is proposed in equation (3.3), with its frequency-domain counterpart in (3.4). The equations are obtained by successive differentiation, according to the derivations presented in Appendix A.

$$g(t) = \frac{-t^7 + 21.t^5.\sigma^2 - 105.t^3.\sigma^4 + 105.t.\sigma^6}{\sqrt{2.\pi}.\sigma^{15}} e^{-\frac{t^2}{2.\sigma^2}}$$ (3.3)

$$G(f) = -128.j.(\pi \cdot f)^{7} \cdot e^{-2.(\pi \cdot \sigma \cdot f)^{2}}$$ (3.4)

Among the several derivatives of the Gaussian pulse, the work of Tuan-Anh *et al.* (TUAN-ANH et al., 2007) generated a $7^{th}$ order derivative with an ultra low-power circuit that produces the mono-cycle pulse for IR-UWB applications, fitting the FCC mask (a = 500 mV and $T_p = 800$ ps). The dynamic energy consumption was 4.7 picojoules (pJ) per pulse in their case, considering a 1.5 V supply. The CMOS process used is TSMC 0.18 $\mu$m. This energy value is considerable, and its reduction requires future CMOS designs.

Regarding the waveforms over time and frequency, some illustrative plots are shown. For example, Figures 3.3 and 3.4 contain plots of the square and Gaussian (GP) waveforms over time and their respective harmonic decomposition in frequency. The sinc behavior of the PSD of the square pulse is a disadvantage compared with the more confined PSD of the GP. In Fig. 3.3 a normalized square pulse is plotted, from which it is possible to work with the features that best suit each design. The figure also shows the corresponding sinc (sin f / f) spectrum of a square pulse, spread in normalized frequency.

A Prolate Spheroidal Wave Function (PSWF) pulse presents a great advantage in terms of power spectrum confinement and, consequently, power efficiency. The PSWF is presented normalized in time and amplitude. The two plots of Fig.
3.4 show the results obtained from equation 3.1, as its kernel function. The figure represents multiple squares in the frequency domain as PSDs corresponding to PSWF signals in time, and vice versa.

Figure 3.4 – PSWF variations in time and their respective PSD squares in frequency. Source: the author.

Figure 3.5 – A PSWF pulse over time. Source: the author.

# 3.2 Pulse Shape Analysis

This section presents a complementary investigation for an energy efficiency model applied to UWB wireless communication systems, as the power budget definition is one of the aims of Cross-Layer optimization. The different equations for pulse generation and their spectral correspondence presented throughout this chapter aim to clarify how to match the FCC mask. The variation of pulse forms and shapes determines how the spectral profiles are accommodated, according to each occupied band, with bandwidths complying with the masks.

The analysis begins with the theoretical study of some basic waveforms and their impact on the energy consumption. These waveforms support the simulations of energy efficiency related to modulation and coding, where the total energy consumption is evaluated by varying the signal modulations and the applicable WBAN payload. The results compare both aspects as a method to choose the better energy efficiency, given the assumptions on some design parameters and the correct choice of the generic configuration. The trade-off between time-limited and frequency-limited signals is a direction for future improvements.

As an example of the Gaussian pulse and its spectral correspondence, Eq. 3.5 shows the time-domain expression used in the generation of such waveforms and its frequency-domain counterpart. The higher derivative orders can be found in (EMAMI, 2013).

$$g(t) = -\frac{t}{\sqrt{2.\pi}\sigma^3} e^{-\frac{t^2}{2.\sigma^2}} \to G(f) = 2.j.\pi.f.e^{-2.(\pi.\sigma.f)^2}$$ (3.5)

In equation 3.5, $\sigma^2$ represents the variance parameter of the pulse.
A practical pulse duration is defined as $T_p = \sqrt{\pi}.\sigma$, even though the Gaussian pulse theoretically extends over an infinite time (t) span.

## 3.2.1 PSWF – Frequency vs. Time Design

This section reviews some works, from the transceiver design up to the high-level aspects of the UWB communication system (bottom-up approach), focusing on the contributions of the Prolate Spheroidal Wave Function (PSWF). As shown in the previous subsection, the PSWF has the potential to reach a better energy efficiency with a specific self-reconfigurable transmitter design (OTT; EISNER; EIBERT, 2012), (NEVES et al., 2012), given the excellent flatness and confinement of the PSWF's PSD. Nonetheless, authors normally propose the use of Gaussian or square pulses in the transmitter (TX).

The block that produces the PSWF is followed by other TX blocks, some of which deal with the modulation techniques (*e.g.*, PPM, OOK, or BPSK), applying them to manage the power. Thus, investigating the whole set, considering the complete transceiver, is a worthwhile path to energy optimization in this kind of communication. It is important to note that the energy concepts used for WBAN communications are also applicable in related areas, such as WSN. The concern with such optimization is valid to obtain better energy efficiency over a long time of operation. In a broader sense, energy conservation per unit of the WSN is highly desirable, since the energy savings are reaped throughout the whole WSN system, which may comprise hundreds or thousands of such components.

The PSWF properties (ANTREICH; NOSSEK, 2011) can reinforce the alternatives for best spectrum access and exploration. The concern in that work is to obtain an optimum pulse shape for better time synchronization. Using the autocorrelation of the delays as a metric, the methodology proposes to measure the information in a concentrated time (chip pulse shapes) with time accuracy.
Obtaining PSWFs is also the aim in the implementation of particular filters. Neves *et al.* (NEVES et al., 2012) proposed a methodology to numerically approximate the PSWF for UWB impulse generation, from a square pulse to be transmitted. The authors aimed to avoid interference by respecting the FCC UWB limits.

A PSWF pulse can be normalized in time and amplitude, and Fig. 3.5 shows the results obtained from the kernel function of equation 3.6. Some properties of PSWF wave shapes, regarding the pulse duration and bandwidth, are the following: (1) it has exactly the same XXXX for all its orders; (2) it is double-orthogonal; (3) the PSWF has no DC component; and (4) it can be varied simultaneously over time and frequency.

$$h(t) = a.f_u.sinc(2.f_u.t) - a.f_l.sinc(2.f_l.t)$$ (3.6)

The fact that various square waveforms can be obtained in the spectrum from the respective PSWFs deserves attention, as it is a useful characteristic for low-power IR-UWB, as shown in the equivalence plot in Fig. 3.4. Sending pulses serially, one after another, can cause interference between them at remote receivers. Some encoding and orthogonal techniques can help, but at the cost of a lower data rate and other drawbacks, as expressed by (GHAVAMI; MICHAEL; KOHNO, 2004). A good IR-UWB transmitter design can provide, besides hardware simplicity and multiple pulse generation, embedded orthogonality possibilities, high data rates, robustness to SINR and attenuations, and content awareness for dynamic adaptations.

A multi-pulse generator for four different PSWF pulses, based on a source signal, is introduced by (GHAVAMI; MICHAEL; KOHNO, 2004). The hardware is designed to produce pulses of different shapes for a desired number of pulse orders, adjustable to the pulse trains resulting from the input data. A variety of pulse generators, selectors, and front-ends can also be adapted to the PSWF method.
As a way to approach these questions, the authors state the PSWF formulation (SLEPIAN; POLLAK, 1961) for future use in the UWB context, as follows:

$$\int_{-T/2}^{T/2} \psi_n(\tau) \frac{\sin \Omega(t-\tau)}{\pi(t-\tau)} d\tau = \lambda_n \psi_n(t)$$ (3.7)

Eq. 3.8 presents the differential equation form for the same PSWF of order "n" $(\psi_n(t))$, where $\chi_n$ is the eigenvalue of $\psi_n(t)$. $\lambda_n$ is the power effectiveness of the pulse with respect to minimizing the time-domain inter-pulse interference.

$$\frac{d}{dt}\left[(1-t^2) \cdot \frac{d\psi_n(t)}{dt}\right] + (\chi_n - c^2 \cdot t^2) \cdot \psi_n(t) = 0$$ (3.8)

The "c" in Eq. 3.8 is a constant given by 3.9, as required by the Parseval Theorem, where $\Omega$ is the bandwidth and T is the pulse duration:

$$c = \frac{\Omega.T}{2} \tag{3.9}$$

For the purpose presented in this work, the fraction of the PSWF signal energy contained in the interval from $\frac{-T}{2}$ to $\frac{T}{2}$ is given by Eq. 3.10, whose values range from "0" to "1".

$$\lambda_n = \frac{\int_{-T/2}^{T/2} |\psi_n(t)|^2 dt}{\int_{-\infty}^{\infty} |\psi_n(t)|^2 dt}$$ (3.10)

On the other hand, at the PHY level, the IR-UWB circuit can be made simple, and several approaches can be used to implement it. Faster pulses are easier to obtain in nanoscale CMOS, and circuit timing is critical for UWB pulses at central frequencies of 5 GHz and above. For instance, a 5 picosecond pulse may be required, but older CMOS technology nodes make such timing impractical. UWB light pulses, with durations from picoseconds down to femtoseconds, are used today in a totally different technology context, which is not possible in low-cost integrated CMOS circuits.

An example toward the hardware synthesis of PSWF is given by the design of (GHAVAMI; MICHAEL; KOHNO, 2004), in which the PSWF equations are used as a model in the communication process. Fig. 3.8 shows these concepts as a schematic block diagram of a proposed implementation.
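To make Eqs. 3.7, 3.9 and 3.10 more concrete, the sketch below estimates the largest eigenvalue $\lambda_0$ of the integral equation (3.7) for a given $c = \Omega.T/2$. It is a minimal numerical illustration and not the method of the cited authors: the midpoint discretization of the kernel, the power iteration, the normalization T = 2, and every function name are assumptions made here.

```python
import math


def pswf_lambda0(c, n_points=100, iterations=100):
    """Estimate lambda_0 of Eq. 3.7 for c = Omega*T/2 (Eq. 3.9)."""
    T = 2.0                          # normalized pulse duration (assumption)
    omega = 2.0 * c / T              # bandwidth parameter obtained from Eq. 3.9
    h = T / n_points                 # midpoint-rule step
    t = [-T / 2.0 + (i + 0.5) * h for i in range(n_points)]

    def kernel(u):
        # sin(Omega*u) / (pi*u), with its limit Omega/pi at u = 0
        return omega / math.pi if u == 0.0 else math.sin(omega * u) / (math.pi * u)

    # Discretized integral operator on [-T/2, T/2]
    K = [[kernel(ti - tj) * h for tj in t] for ti in t]

    def apply(v):
        return [sum(k * x for k, x in zip(row, v)) for row in K]

    v = [1.0] * n_points             # even start vector, overlaps with psi_0
    for _ in range(iterations):      # power iteration towards the dominant eigenvector
        w = apply(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    w = apply(v)
    return sum(a * b for a, b in zip(v, w))   # Rayleigh quotient (v has unit norm)


for c in (0.5, 1.0, 2.0, 4.0):
    print(c, round(pswf_lambda0(c), 4))
```

Larger values of c, i.e., a larger time-bandwidth product, push $\lambda_0$ closer to 1, which is the behaviour Eq. 3.10 expresses as the fraction of energy inside the interval.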
## 3.2.2 Pulse Generation Hardware

According to (HU et al., 2009), there are three general methods to generate UWB and very short pulses, depicted in Fig. 3.6. These pulses are necessary for the IR-UWB. The first method uses a high-frequency PLL and a mixer circuit, which usually consumes considerable power, as shown in (BARRAS, 2010): the PLL needs to run at very high (multi-GHz) frequencies. The second method uses pulse shaping filters, which can be implemented with digital CMOS gates and an appropriate near-antenna RLC filter. The third method is direct digital-to-analog synthesis, which requires essentially digital circuits, a fast digital-to-analog converter (DAC), and a reconstruction filter. The disadvantage of this technique is that a very fast DAC consumes much power. As CMOS scaling proceeded rapidly during the last years, the latter two methods became feasible, and the pulse shaping method gained more traction, as it can be designed in advanced CMOS to consume less power, and especially less energy per UWB pulse, than the other two techniques.

Figure 3.6 – Some methods for pulse generation.

In this subsection, two different circuits proposed in the literature for pulse shaping (Fig. 3.7 and Fig. 3.8) are presented. These works provide estimates of the energy per transmitted pulse.

The PSWF can also be generated as shown in Fig. 3.8, an example of digital pulse assemblage. It is important to note that the blocks of the whole circuit are planned to work as a digital design, without the use of filters, as the result of a differential equation shown in the book (GHAVAMI; MICHAEL; KOHNO, 2004) for multipulse generation. In this example, four different-order pulses are generated, adding only a little extra hardware compared to that necessary for one PSWF orthogonal pulse.
Elements of integration, differentiation, subtraction, sum, multiplication, and square pulse generation from a Matlab model are used to produce the final result. Circuit implementation is left out of the cited book.

The use of a finite impulse response (FIR) filter to create a $5^{th}$ order pulse response is a way to comply with the FCC mask bounds that is commonly employed in the literature (ZWIRELLO; WIESBECK; ZWICK, 2014). The equivalent circuit, implemented only with digital components, is shown in Fig 3.7. Each inverter represents a controlled delay added to the circuit to shape the final analog pulsed signal, which results from the response of the analog front-end (FE) circuit that follows the logic gates. The final composition of the pulses generates the $5^{th}$ derivative, which leads to FCC spectral mask compliance after the FE, which also filters out the DC at its inputs.

An alternative circuit to produce a PSWF pulse is presented by (GHAVAMI; MICHAEL; KOHNO, 2004). The authors show multiple (4) circuit generators for the pulses, but here it is simplified as shown in Fig. 3.8.

As an example of intrachip hardware, (GIMENO; FLANDRE; BOL, 2018) present an integrated wireless chip-to-chip communication transceiver for PCI-express links. The application context of this chip differs from WBAN, as it is geared toward line-of-sight data server communications at very high (multi-Gbps) data rates. The proposed transceiver uses IR-UWB communication at 10 GHz with TX output power below the EMC regulations. A pulse position modulation (PPM) scheme is used for communications up to 10 cm at a 2.5-Gb/s data rate. An energy efficiency better than 6 pJ/bit was reached in this architecture, effectively demonstrated on a chip.

Figure 3.7 – Circuit schematic for 5th derivative of Gaussian pulse. Source: (STOICA et al., 2004), modified by the author.

Figure 3.8 – Schematic diagram for one PSWF pulse generator. Source: modified from (GHAVAMI; MICHAEL; KOHNO, 2004).
#### 3.2.3 Pulse Reference Values

For an analysis in terms of energy, it is important to consider the variations in the shape of the wave, the coder efficiency, the payload impact, the throughput, and the gain given by the waveform, as well as the reference values. Each modulation technique depends on the waveforms, as they lead to different energy consumption points. Three classes of waveforms are compared: the Gaussian pulse, the PSWF pulse, and a square pulse, taken as reference. Figs. 3.2, 3.4, and 3.5 contain illustrative examples of these pulse shapes. Each pulse is shown in a different color. Each figure has only three pulse curves of the specific class, depending on their parametric variations.

Works using higher order derivatives can be found in the literature. For example, (EMAMI, 2013) presents the $5^{th}$ order, which can be exploited to match the FCC mask (FCC, 2002). Among the several derivatives of the Gaussian pulse, the work of Tuan-Anh *et al.* (TUAN-ANH et al., 2007) generated a $7^{th}$ order derivative with an ultra low-power circuit that produces the mono-cycle pulse for IR-UWB applications, fitting the FCC mask. The dynamic energy consumption was 4.7 picojoules (pJ) per pulse in their case, with a 1.5 V pulse amplitude (set by the supply). A 0.18 $\mu$m CMOS technology was used in their design.

Ghasempour *et al.* (GHASEMPOUR et al., 2012) used 0.3 pJ per pulse in a 90 nm CMOS technology in their simulations. The $4^{th}$ and $5^{th}$ derivatives were implemented in their transmitter. The total power consumption goes up to the $\mu$W level at a 100 MHz pulse repetition frequency (PRF).

Gambini *et al.* (GAMBINI et al., 2012) presented another low-power UWB design in 65 nm CMOS: a transmitter for short-range communications. At both system and circuit levels, the architecture is adaptable according to link and channel conditions.
The reference values achieved for their system were 290 pJ in the receiver and 25 pJ in the transmitter per pulse/bit (GAMBINI et al., 2012), using pulse shaping that approached the ideal PSWF waveform as closely as possible.

## 3.2.4 Synchronization Problem

PSWF techniques can be employed to model waveforms and the link (ANTREICH, 2011). The application using an IR-UWB transmitter is also addressed in terms of waveform implementations that bring intrinsic communication delays. This problem can be solved by using some reconfigurability features. Therefore, filter adjustment and other software-defined radio techniques are essential requirements for such a system. PSWF shaping is a possible way to contribute to this objective and, more broadly, to the research. Other signals are studied as a reference model (*e.g.*, the Gaussian pulse and its derivatives), but the PSWF belongs to a family of particular signals that offers one of the solutions for achieving the targeted synchronism.

Investigating some of the trade-offs between time-limited and frequency-limited relations can support the elaboration of the energy efficiency model applied to the wireless communications system. The PSWF are analyzed as a way to contribute to this objective, in the context of the IR-UWB transmitter.

In the Tx/Rx process, along with the signal waveforms, the way the signal propagates in the medium is relevant, and the jitter problem must be considered, especially when the external communication occurs in an asynchronous mode. Another critical point in such an analysis is the Time of Arrival (TOA) of the data (from Tx to Rx). In TOA estimation, the delays are implicit and must be treated. Inside the transmitter, a phase-locked loop (PLL) is the component that synchronizes the clock for impulse generation.
Using a PLL raises the circuit energy consumption, but because it keeps long-term synchronization during operation, it is useful for an energy-saving mode of the communication process. There are examples in the literature on how to implement the PLL (ALVARADO; BISTUÉ; ADÍN, 2011), and its basic blocks are:

- (i) Phase Frequency Detector (PFD);
- (ii) Voltage-Controlled Oscillator (VCO); and
- (iii) High-Frequency Divider (HFD).

"The VCO and HFD are crucial in the whole power consumption of this complex circuit as these have to work in the high-frequency bands of the application" (ALVARADO; BISTUÉ; ADÍN, 2011).

## 3.3 Modulation and Coding Evaluation for Energy Efficiency

Three classes of waveforms are compared in terms of spectral occupancy and power bandwidth: the Gaussian pulse, the PSWF pulse, and a square pulse. A pulse energy of 175 pJ was assumed as the reference value, corresponding to a normalized power of 0.0875 W over a 2 ns pulse duration. For comparison purposes, all pulses are considered to have the same maximum value in the symbol interval, which is the same for all waveforms. Parameters such as the normalized power, the energy of a single pulse, and a WBAN payload of 255 data bytes were considered in the simulations. The comparison shows that the PSWF pulse is narrower than the other pulses, according to Fig. 3.9. The results point to a better energy relationship for the minimum payload that represents the composition of the WBAN packets.

One of the major goals of this work is to find the best trade-off between the modulation schemes and a better pulse shape in terms of energy and spectrum (the PSWF pulse is more concentrated in spectrum than the others). The symbol intervals have to deal with other problems that may prevent higher-order M-ary modulation in the communication scenario, such as jitter, timing and phase delays, and intersymbol interference. The IEEE 802.15.6 standard (IEEE, 2012) defines BPSK/QPSK, FM-UWB, and OOK as possible alternatives for modulation.
Each alternative has a particular impact on the energy consumed for data communication. According to the standard, OOK is the default modulation configuration.

Figure 3.9 – Comparison of the spectral efficiency of the pulses. Source: the author.

Fig. 3.10 presents the impact on energy of the payload (255 bytes) of the WBAN for 10 different PRFs. The comparison shows the higher energy efficiency of PPM modulation. Appendix E reviews some basic modulation modes that are available for use in a transceiver for WSN communication. An example for PPM with a variable modulation index is also shown.

Figure 3.10 – Energy spent to transmit the 255 bytes. Source: the author.

The impacts of pulse shapes and modulation format on energy and power consumption for WBANs based on UWB are evaluated in this section. An impulse radio in UWB is addressed for this purpose. The basic waveforms, Gaussian pulse derivatives and Prolate Spheroidal Wave Functions (PSWF), are presented to study the trade-off between time- and frequency-limited signals as a possible solution to increase the energy efficiency. The total energy consumption is measured via simulations, considering the signal characteristics, the WBAN payload, and the pulse position modulation (PPM) and on-off keying (OOK) schemes.

The optimizations derive from a digital baseband block model, based on the techniques reviewed within the scope of this proposal and on the concepts of an IR-UWB system. In such a model, the main points for digital integrated circuit (IC) design are the waveforms and related modulations, and some aspects of the link itself and of the MAC sublayer, all in terms of power consumption. In this Section, these three points, whose optimization is expected, are described from some perspectives together with the acquired results.

# 3.3.1 Modulation Techniques Comparison

Variations on the PPM codification were verified and analyzed.
The principle that each modulation technique has a different energy consumption, and consequently a different power spent during operation, is the starting point for these evaluations. The goal of the design is to find signals that provide better energy estimates while their effects on the variability of the energy efficiency ($E_{eff}$) are considered. Based on existing WBANs and IR-UWB basic models (KARVONEN; IINATTI; HÄMÄLÄINEN, 2015) (AKYILDIZ; VURAN, 2009), the payload was selected as a metric to represent the amount of consumed energy. The remaining WBAN frame components were considered as energy constants.

The data structures of the transceiver have an impact on the energy consumption. The PPDU and MPDU allow communication between the hub (coordinator node) and the remaining nodes. The PPDU contains the information represented by the RF signals during the communication, under a given codification and modulation, where the payloads are inserted. Energy is effectively consumed when there are high-level signals in the transmission, *i.e.*, when a great number of "1's" is required to represent the message over time. To save energy, the basic idea is to reduce the average level to a minimum number of "1's". The "0's" are not necessarily transmitted within the synchronized period of IR-UWB, which reduces the amount of energy spent.

In the proposed approach, data transmission periods, varying from 0 to 255 bytes (as defined by (IEEE, 2012)), are modeled as trains of pulses, in which the binary data of the frames are represented according to both the energy level and the chosen coding scheme. For energy comparison purposes, one bit is equivalent to one pulse in PPM (M = 1) and in OOK coding.

## **3.3.2** Specification Summary (Cont'd)

For the specification parameters in Table 3.2, all waveforms have a maximum amplitude of 1 V, or a maximum instantaneous power of 1 W over a 1 $\Omega$ output resistor. All three pulses are equalized in time.
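The reference energy figures used in this section can be reproduced with simple arithmetic. The Python sketch below is only illustrative: it takes the reference values stated in this chapter (0.0875 W normalized power, 2 ns pulse duration, 255-byte payload, one pulse per bit), while the function names and the `ook_energy_j` helper, which counts only the "1" bits as emitted pulses, are assumptions introduced here and not part of the original C/C++ or Matlab simulations.

```python
import math

# Reference values stated in this chapter
PULSE_POWER_W = 0.0875     # normalized power of one pulse
PULSE_DURATION_S = 2e-9    # pulse duration (2 ns)
PAYLOAD_BYTES = 255        # maximum WBAN payload


def pulse_energy_j(power_w=PULSE_POWER_W, duration_s=PULSE_DURATION_S):
    """Energy of one pulse: E = P * T."""
    return power_w * duration_s


def payload_energy_j(n_bytes=PAYLOAD_BYTES, pulses_per_bit=1):
    """Energy to send a payload when every bit costs a fixed number of pulses."""
    return n_bytes * 8 * pulses_per_bit * pulse_energy_j()


def ook_energy_j(bits):
    """Hypothetical OOK estimate: only the '1' bits emit a pulse."""
    return sum(bits) * pulse_energy_j()


if __name__ == "__main__":
    print(f"pulse energy   : {pulse_energy_j() * 1e12:.1f} pJ")
    print(f"payload energy : {payload_energy_j() * 1e9:.1f} nJ")
```

With one pulse per bit, 2040 pulses of 175 pJ each give the 357 nJ per payload used as the reference total energy.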
Simulations developed in C/C++ and Matlab were performed considering these parameters. The simulations assumed random data for the pulse trains and repeated iteration cycles, so that result variations are averaged out and the curves become more stable.

Table 3.2 – Pulses and communication parameters

| Parameter (Gaussian, PSWF, and square pulses) | Value |
|-----------------------------------------------|----------------------------------------------|
| Sample Frequency (GHz) | 21.2 |
| Bandwidth (MHz) | 499.2 |
| Frequency Range (GHz) | 3.1 - 10.6 |
| dt (ps) | 0.4 |
| $E_b/N_0$ (dB) | 7.8 (BER = $10^{-4}$) |
| BER | $10^{-3} \sim 10^{-5}$ |
| $4^{th}$ Gaussian derivative: $\tau$ and $T_c$ | $0.93 \times 10^{-9}$ and $2 \times 10^{-9}$ |
| Normalized Power (W) | 0.0875 |
| Energy in 2 ns (pJ/pulse) | 175 |
| Total Energy (nJ/payload) | 357 |

Source: the author.

# 3.4 Summary

Some essential theoretical and design-related aspects of pulse-shaping analysis and pulse-shaping hardware synthesis were presented in this Chapter. The Prolate Spheroidal Wave Function was introduced with the intention of finding the best trade-off in terms of energy when compared with two other functions (square and Gaussian) normally used as design goals. The Chapter closed with the analysis of modulation and coding for energy efficiency, including a performance evaluation. The energy and power values obtained from the simulations are reference data for the next sections, in which a Cross-Layer Energy Efficiency Model is developed.

# 4 FORWARD ERROR CORRECTION (FEC) DECODING

One of the main purposes of this Chapter is to obtain area and power values for low data rates, based on the hardware of the encoders and decoders that are inside the RF transceivers. The main subject addressed here is the forward error correction mechanism, which ensures the reliability of the transmitted message by inserting redundant data according to a specific forward error correction encoding.
Two candidate FEC encoders are initially reviewed.

#### 4.1 FEC Review

The forward error correcting codes in focus in this work are the Bose-Chaudhuri-Hocquenghem (BCH) codes and their variations, and the Low-Density Parity-Check (LDPC) codes. Other codes such as Reed-Solomon (RS), Turbo, or Turbo-like codes are not considered in WBAN communications, and hence will be neither implemented nor discussed throughout this thesis.

# 4.1.1 BCH and LDPC Theory

BCH cyclic codes are important linear block codes, independently developed by Bose, Chaudhuri, and Hocquenghem, each of whom contributed a letter to the acronym. The theory was established in the sixties (MOREIRA; FARRELL, 2006). Their operation over finite fields (extensions of the binary Galois field GF(2)) allows a wide range of rates, modifying the throughput and the correction capability according to the application. Appendix B has a table with the finite fields and BCH basis considered. These codes are closely related to the Reed-Solomon (RS) codes, which follow the same principle over a non-binary Galois field (MOREIRA; FARRELL, 2006).

In BCH codes, each codeword contains the message to be transmitted and the calculated parity bits. For a given codeword of $n$ bits, there are $k$ message bits and $n-k$ parity bits, which ensure a correction capacity of up to $t$ errors in each codeword. The code characteristics are completely described by the notation BCH(n,k,t), and the code rate is given by the ratio $\frac{k}{n}$.

The message encoding is based on an algebraic approach, *i.e.*, it uses a generator polynomial to create the parity bits. The data to be transmitted enter sequentially into a linear feedback shift register (LFSR) whose connections are defined by the selected polynomial. For instance, the BCH(63,51,2) code has $g(x) = 1 + x^3 + x^4 + x^5 + x^8 + x^{10} + x^{12}$ as its generator polynomial.
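The LFSR-based encoding described above can be sketched in a few lines of Python. This is a minimal bit-serial model under stated assumptions, not the hardware designed in this thesis: the systematic layout (parity bits in the low-order positions, message bits in the high-order positions), the bit ordering, and the names `bch_encode` and `poly_mod_g` are choices made here for illustration. Only the generator polynomial of BCH(63,51,2) is taken from the text.

```python
# Exponents of g(x) = 1 + x^3 + x^4 + x^5 + x^8 + x^10 + x^12
G_EXPONENTS = (0, 3, 4, 5, 8, 10, 12)
N, K = 63, 51


def bch_encode(msg_bits):
    """Systematic encoding: parity(x) = x^(n-k) * m(x) mod g(x).

    msg_bits[i] is the coefficient of x^i in m(x). The 12-stage register
    models the LFSR; its feedback taps are the coefficients g_0..g_11.
    """
    if len(msg_bits) != K:
        raise ValueError("BCH(63,51) expects 51 message bits")
    taps = [1 if i in G_EXPONENTS else 0 for i in range(N - K)]
    reg = [0] * (N - K)
    for bit in reversed(msg_bits):          # highest-order coefficient first
        feedback = bit ^ reg[-1]
        for i in range(N - K - 1, 0, -1):   # shift, adding feedback at the taps
            reg[i] = reg[i - 1] ^ (feedback & taps[i])
        reg[0] = feedback & taps[0]
    return reg + list(msg_bits)             # c_0..c_11 parity, c_12..c_62 message


def poly_mod_g(bits):
    """Remainder of bits(x) divided by g(x) over GF(2), as an integer bit mask."""
    g = sum(1 << e for e in G_EXPONENTS)
    v = sum(b << i for i, b in enumerate(bits))
    while v.bit_length() >= g.bit_length():
        v ^= g << (v.bit_length() - g.bit_length())
    return v


if __name__ == "__main__":
    message = [1] + [0] * 50                # m(x) = 1
    codeword = bch_encode(message)
    print([i for i, b in enumerate(codeword) if b])
```

For $m(x) = 1$ the codeword is $g(x)$ itself, and every valid codeword leaves a zero remainder when divided by $g(x)$, which is the property the decoder relies on.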
As the code grows in size and correction capability, this polynomial includes more elements to calculate the parity bits.

Table 4.1 shows all BCH codes that were used in our circuit designs for the decoding. In this thesis the decoders were designed in CMOS. The focus on the decoder is due to its greater complexity when compared to the encoder, as it comprises several steps to decode the message. First, the decoder performs the syndrome calculation of the received codeword, so it can count the number of errors. If this error count is less than or equal to $t$, the error-location polynomial $\sigma(x)$ is determined. Next, the Chien search mechanism (CHIEN, 1964) calculates the roots of this polynomial over the finite field to determine which bits are flipped (the $x$ values from $\sigma$). Once these positions are known, the decoder can calculate the correct value of these bits and, hence, deliver the correct text message to the receiver baseband processor.

Table 4.1 – BCH code rates

| BCH type | Code Rate |
|----------------|-----------|
| BCH(127,64,10) | 0.5 |
| BCH(63,39,4) | 0.619 |
| BCH(63,45,3) | 0.714 |
| BCH(63,51,2) | 0.809 |

Source: the author.

#### 4.1.2 FEC Code Performance

The forward error correction block (FEC block) needs to be investigated to enable energy efficiency in the channel coder. It is part of the Channel Coder block, which also contains the interleaver and the scrambler, whose functions are to minimize binary repetition and add robustness to the communication system. These two blocks do not significantly change the power figures, but the analysis of the FEC is promising (KARVONEN, 2015). Depending on the type of FEC coding chosen (e.g., BCH, LDPC, RS, etc.), it is possible to obtain significant energy efficiency gains. Other types of FEC coding, such as convolutional or Turbo codes, can be added to the PHY protocols.
Turbo codes were introduced in 1993; until then, codes decoded with the Viterbi algorithm, sometimes concatenated with another code, predominated as powerful error-correcting codes.

Besides the purpose of saving energy, there are other reasons for the use of FEC. For instance, as the computational capabilities of a WSN or WBAN increase, the number of accompanying errors also increases.

The work of (KARVONEN, 2015) proposes taking more transceiver parameters into account to improve the cross-layer energy efficiency model so that it can deal with hard conditions, *e.g.*, by assuming imperfect synchronization or a more realistic channel. Comparisons between wake-up periods and their mechanisms are also proposed. Moreover, WBANs with adaptive algorithms that search for the optimal thresholds at run time would be desirable, according to the bit error probability of a target environment and the communication channel condition.

# **4.1.3** Reconfiguration Process

According to (VERHELST; DEHAENE, 2009), in an energy-driven design using a top-down approach, the power consumption of the resulting implementation should be taken into account in the selection of the algorithm, aiming at high performance. Moreover, an energy-optimized design of the selected algorithm will typically be subject to a significant performance degradation compared with its theoretically derived performance at the algorithmic level. At such a stage, how the reconfigurations are carried out through algorithms is vital for energy saving.

As mentioned in Section 5.2.1, the global aspects must take into account the implementations and hardware specifications. In that context, (OTT; EISNER; EIBERT, 2012) presents work that covers several UWB bands for an IR-UWB radio using reconfigurable software. The pulses are generated through digital baseband techniques on an FPGA, covering the frequency range from 6 GHz to 10 GHz with 1 GHz of bandwidth per channel.
The data rate goes up to 500 Mbit/s, with a 7 ns pulse duration, in a low-complexity front-end that uses the antenna itself as a bandpass filter. Additionally, that work addresses the PSD of a pulse train modulated with 4-ary PPM at a PRF of $1/T_s$. These specifications allow reconfigurable concepts to be used to build a flexible radio prototype, a recent model that serves as an example to follow.

# 4.1.3.1 Filtering and Front-End Information Interfacing

Other works implement the filtering in different ways; for instance, (NEVES et al., 2012) proposes PSWF filtering using Gm-C filters based on operational transconductance amplifiers (OTAs). A method of UWB pulse generation through numerical approximation is used for multiple access schemes. Gm-C integrators are used to implement a state-space optimization of the PSWF pulse technique, showing ways to achieve it in the analog domain.

#### 4.1.3.2 Operation Modes

As shown in previous sections, the three main modes in which IR-UWB radio equipment can operate, regarding coordination in the communication process, are synchronous, asynchronous, and wake-up based. The energy needed to operate the radio equipment is defined by each duty cycle and by how long the communication continues, together with the quality of the circuit used and the techniques employed, *e.g.*, the modulation and waveforms chosen for the task. In this sense, achieving better energy efficiency in radio equipment calls for a critical study of the transmission and reception process, of the internal processing modes, and of how useful the received top-level information is.

The process and control module can be translated into a digital circuit design. Operational modes can be programmed to execute some functions, including digital signal processing techniques, for instance to control the channel access considering some stochastic occurrence and the bit-error probability.
Fully successful communication corresponds to a probability of 100% ($P = 1$); anything less indicates communication problems ($p = 1 - n$, with $p$ below 100%), where $n$ is the probability of error. Thus, the operation modes can help in a reconfiguration process, and the energy efficiency algorithm varies according to the mode with the highest chance of success.

## 4.1.3.3 General Energy Manager (Control based on SystemC Module)

Reducing the number of procedures executed in a communication process is pursued as a way to obtain energy efficiency. A decreasing number of required re-transmissions is an example of how successful the communication is. One of the goals is to reduce the energy consumption by minimizing the active time. If the microsystem formed by the radio equipment has a general energy controller or manager (GEM), the chances of achieving better energy efficiency rise. Figure 4.1 shows an example of a supervisory and control system, which can be implemented in software, e.g., as SystemC embedded in C/C++, or in VHDL code. It is an example of cross-layer supervision and control algorithms working in favor of energy savings.

Figure 4.1 – Cross-Layer Figure Model with General Energy Manager. Source: the author.

The work of (SUHONEN et al., 2012) presents a general hardware architecture of a platform based on sensor nodes, with a communication subsystem enabling wireless communication, a computing subsystem allowing data processing and the management of node functionality, a sensing subsystem connecting the wireless sensor node to the outside world, and a power subsystem providing the system supply voltage.

Examples of algorithms to be implemented in the Link Layer are those for the LLC, channel access, and scheduling of the communication pipeline, which give information about the process to the general energy manager.
In the same way, the PHY layer elements can provide the GEM with feedback on how the front-end operations are proceeding and on the respective specification changes (for the F/E and channel).

#### 4.2 FEC Architectures

Noise is inherent to any communication system and, consequently, it is fundamental to improve the channel reliability through forward error correction (FEC) codes to reduce the amount of corrupted data. The aim of this section is to compare the hardware quality of results of two different FEC decoders applied to the IEEE 802.15.6 WBAN (wireless body-area network) standard. In this context, both BCH and Low-Density Parity-Check (LDPC) codes are considered, since the former is established in the WBAN standard and the latter offers capacity-approaching performance. As accurate power estimation can only be obtained through real circuit stimuli, the decoders processed messages created according to the WBAN packet specification and pre-transmission processing, as illustrated in Figure 4.2.

Figure 4.2 – FEC's Encoder and Decoder in the Transceiver (IEEE, 2012). Block labels: Scrambler, FEC Encoder, Pad Bit, Interleaver (PSDU); Descrambler, FEC Decoder, Pad Bit Out, Deinterleaver (receiver). Source: the author.

All decoder architectures were synthesized for a commercial 65 nm CMOS technology aiming at low power and low data rate communications. Results showed that BCH codes offer a lightweight, energy- and area-efficient implementation, especially for higher code rates. Despite their improved correction capabilities, the QC-LDPC decoders implemented in the same technology are at least $3.3 \times$ larger and consume $3.76 \times$ more than a BCH decoder for the same code and data rates.

A comparison of the BCH and LDPC decoders is presented, as the literature does not present such a comparison in the context of UWB 802.15.6 communication. Emphasis is given to the power dissipation results, since most sensor nodes are battery-operated or rely on energy harvesting.
In a broader sense, for WSNs it is crucial to assess the quasi-cyclic LDPC (QC-LDPC) architectures and algorithms as alternatives to minimize overall consumption in such networks. Two QC-LDPC architectures are described and implemented.

# 4.2.1 Forward Error Correcting Codes

The forward error correcting codes in focus in this work are the Bose-Chaudhuri-Hocquenghem (BCH) codes and their variations, and the Low-Density Parity-Check (LDPC) codes. Other codes such as Reed-Solomon (RS), Turbo, or Turbo-like codes are not considered in WBAN communications, and hence will be neither implemented nor discussed. The FEC decoder is inherently more complex than the encoder, as it tries to obtain the original message from a collection of bits that may or may not have errors. Hence, this work focuses only on these decoders to establish a fair and realistic comparison between their hardware performances.

The IEEE 802.15.6 standard defines a plethora of rules that shall be followed to obtain a fully compliant sensor. This work addresses only the medium access control (MAC) parameters for the ultra-wideband (UWB) physical implementation. Hence, Table 4.2 shows a summary of the data stream characteristics that these FEC architectures need to support.

Table 4.2 – WBAN Stream Parameters

| Parameter | Range |
|------------------------------|--------------|
| Throughput (kbps) | 250 to 487.5 |
| Payload length (bytes) | 0 to 255 |
| Minimum packet length (bits) | 4032 |

Source: the author.

It is worth noting that the minimum packet length considers both the PHY and MAC headers for UWB and no payload to be transmitted. The 802.15.6 standard supports higher data rates, but such rates are outside the scope of power-saving WSN or WBAN communications, in which low power and low data rates are key constraints. The FEC hardware power is analyzed for target rates below 500 kbps.

### 4.2.1.1 Bose, Chaudhuri, and Hocquenghem (BCH)

Table 4.1 shows some possible BCH codes.
The code rates 0.5 and 0.809 are present in the IEEE standard. The BCH architecture is based on a previous work (JAMRO, 1997); the architectures generated from its C/C++-based software, with some adjustments, are suitable for the BCH requirements. A great number of works aim to improve similar architectures, e.g., (DILL, 2015), and others deal specifically with some part of the codec algorithms (e.g., the Chien search procedure (YOO; LEE; PARK, 2016)). This work was concerned with implementing the simplest operational BCH architecture, as in (JAMRO, 1997).

The IEEE 802.15.6 standard provides, for instance, for the use of encoders and decoders such as those in Figures 4.3 and 4.4, where the number of registers (LFSR) and their structures vary according to the code rate employed (the ratio between the number of bits in the message and in the coded word). Once the message is received, the decoding algorithm follows these steps:

- Syndrome identification
- Minimal polynomial selection
- Recursive routine to improve the accuracy
- Chien search routine
- Error position identification in the polynomial
- Correction of the error

The BCH decoder shown in Fig. 4.4 represents these steps in its architecture, where the bit correction capacity depends on the selected code rate. Thus, if the number of required corrections is detected to exceed this capacity, a re-transmission is necessary. Examples of architectures can be seen in (YITBAREK, 2014) for a serial data throughput case.

According to the standard, the throughput of the WBAN is 250 kbps for the default conditions and 487.5 kbps for High Quality of Service; the latter represents a higher degree of reliability. A dedicated BCH architecture applied to WBAN follows this requirement. How the communication occurs and the amount of energy necessary to perform an operation in such a WBAN system are objects of study (KARVONEN; IINATTI; HÄMÄLÄINEN, 2015).
In a cross-layer approach, other aspects are considered to avoid energy consumption in the worst case. One scheme to be used when FEC fails is automatic repeat request (ARQ), which delivers the lost packets again. However, each new request implies the use of more energy. This scheme can be combined with the FEC mechanisms, which is known as H-ARQ, as mentioned in the standard.

Figure 4.3 – Encoder for BCH (63,51,2).

In a communication process, the received message $r(x)$ is the result of the transmitted message $c(x)$ with the probabilistic error $e(x)$ added to it. As shown in Figure 4.4, the number of syndromes ($S_1$, $S_2$, etc.) depends on the number of existing errors and the respective polynomial(s) $e(x)$. In the figure, the syndromes are represented as outputs of the $m$ blocks, which correspond to the minimal polynomials. The designs of this work follow the strategy presented in (JAMRO, 1997).

The recursive algorithm deployed in the decoder, with $t-1$ iterations and represented in Figure 4.4, is the Berlekamp-Massey algorithm with inversion (BMAi). The syndromes are the inputs of the BMAi block, where the degree of $\sigma(x)$ is successively incremented by one until the error location polynomial is found as the result of this process. Therefore, the degree and the roots of this final polynomial are a direct consequence of the transmission errors. The hardware follows the block diagram of Figure 4.5, which is based on (JAMRO, 1997) and (MASSEY, 1969).

Figure 4.4 – Example of decoder block diagram for BCH.

An alternative direct method, for the case of $t=2$, is to calculate the syndromes as the remainders of the polynomial divisions of $r(x)$ by the minimal polynomials $m_1(x)$, $m_2(x)$, e.g., using Peterson's method. For the resolution of $\sigma(x)$ when the *Code Rate* is equal to 0.81, the syndrome polynomials ($S_1$ and $S_3$) are used and two $\sigma$ terms are calculated.
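The direct $t = 2$ path can be illustrated in software. The Python sketch below is a behavioral model only, not the BMAi hardware of this work: it assumes that GF($2^6$) is generated by $1 + x + x^6$, computes $S_1 = r(\alpha)$ and $S_3 = r(\alpha^3)$, uses the Peterson relations $\sigma_1 = S_1$ and $\sigma_2 = (S_3 + S_1^3)/S_1$, and then runs a Chien-style search over the 63 positions. All function and variable names are hypothetical.

```python
N = 63
PRIM = 0b1000011                 # x^6 + x + 1, assumed field polynomial

# Exponential and logarithm tables of GF(2^6)
EXP = [0] * (2 * N)
LOG = [0] * (N + 1)
_v = 1
for _i in range(N):
    EXP[_i] = _v
    LOG[_v] = _i
    _v <<= 1
    if _v & 0b1000000:
        _v ^= PRIM
for _i in range(N, 2 * N):
    EXP[_i] = EXP[_i - N]


def gf_mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % N]


def syndrome(r, j):
    """S_j = r(alpha^j); r[i] is the coefficient of x^i."""
    s = 0
    for i, bit in enumerate(r):
        if bit:
            s ^= EXP[(i * j) % N]
    return s


def decode_t2(received):
    """Correct up to two errors; returns (codeword, positions) or (None, None)."""
    r = list(received)
    s1, s3 = syndrome(r, 1), syndrome(r, 3)
    if s1 == 0 and s3 == 0:
        return r, []                        # no error detected
    if s1 == 0:
        return None, None                   # beyond the t = 2 capability
    sigma1 = s1
    sigma2 = gf_div(s3 ^ gf_mul(s1, gf_mul(s1, s1)), s1)
    positions = []
    for i in range(N):                      # Chien-style search
        x_inv = EXP[(N - i) % N]            # alpha^(-i)
        value = 1 ^ gf_mul(sigma1, x_inv) ^ gf_mul(sigma2, gf_mul(x_inv, x_inv))
        if value == 0:
            positions.append(i)
    if len(positions) != (1 if sigma2 == 0 else 2):
        return None, None                   # inconsistent root count
    for i in positions:
        r[i] ^= 1
    return r, positions
```

A position $i$ is flagged when $\sigma(\alpha^{-i}) = 0$; if the number of roots does not match the degree of $\sigma(x)$, the sketch reports a failure, which in the system would trigger a re-transmission.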
For the WBAN's generator polynomial, the error location polynomial has two coefficients: $\sigma_1 = S_1$ and $\sigma_2 = ((S_3 + S_1^3)/S_1)$, where the syndromes $S_1$ and $S_3$ are obtained with the minimal polynomials $m_1(x) = 1 + x + x^6$ and $m_3(x) = 1 + x + x^2 + x^4 + x^6$, whose product is the generator polynomial. A division circuit (by $S_1$) for the $\sigma_2$ calculation is given in Figure 4.6, with six registers and two XOR gates.

The Chien search mechanism tracks the error(s) by comparing the values in the registers, starting from the initial $\sigma$ values (see Figure 4.7). An error position has been found when the evaluated error location polynomial is equal to zero. The bit values are delayed in a buffer and an XOR gate corrects each flipped bit in the sequence. On the next clock cycle each value in the register is updated, so a new iteration occurs until all bits have passed.

Figure 4.5 – Berlekamp-Massey Algorithm with inversion.

#### 4.2.1.2 Low-Density Parity Check (LDPC)

Low-Density Parity-Check (LDPC) codes are capacity-approaching forward error correction codes proposed by Gallager in the early 1960s (GALLAGER, 1962). These codes may be constructed either as block codes or as convolutional codes (WANG; CUI; SHA, 2011), although the focus here is only on the linear block approach due to the simplicity of both code construction and hardware implementation. In this case, an LDPC code $C$ is characterized by an $m \times n$ sparse parity-check matrix $\mathbf{H}$, where $n$ is the codeword length and $m$ is the number of parity-check equations. A valid codeword $c \in C$ lies in the null space of $\mathbf{H}$, i.e., for $c$ to be valid, all parity equations must be satisfied (LIN; COSTELLO, 2004). Hence, each codeword has $k = n - m$ message bits and $m$ parity bits.

A performance and complexity comparison between LDPC and turbo codes is provided by Thomas J. Richardson and Rüdiger L. Urbanke (RICHARDSON; URBANKE, 2001). They exploit the sparseness of the parity-check matrix to obtain efficient encoders, and show that linear-time encoding can be used for optimized codes.
Moreover, the encoding complexity is essentially quadratic, although the associated coefficient can be made quite small, so that encoding remains practical.

In (GALLAGER, 1962), LDPC is presented with an excellent decoding algorithm that allows high throughput and better error correction performance, operating close to the Shannon limit, but only from the 2000s onward did it become possible to implement it. One of the reasons is the need for a large storage capacity and large integrated memory. Consequently, energy must be provided to this part of the processing task, implying more energy consumption. Another significant fact is that some important standards now provide for the use of LDPC, such as IEEE 802.11n, 802.16e, 802.22, and so on (IEEE, 2012).

Figure 4.6 – Polynomial divisor.

Figure 4.7 – Chien's search circuit.

Further, the LDPC code $C$ can be represented graphically through a bipartite graph known as a Tanner graph (TANNER, 1981). This representation maps the codeword bits into variable nodes (VNs) and the parity-check equations into check nodes (CNs). Each entry of $\mathbf{H}$ equal to 1, in row $j$ and column $i$, represents an edge between $\mathrm{VN}_i$ and $\mathrm{CN}_j$. The figure illustrates the Tanner graph of the LDPC code described by the matrix in (4.1). Circles and squares represent the VNs and CNs, respectively.

$$\mathbf{H} = \begin{bmatrix} 1 & 0 & 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix} \tag{4.1}$$

In this work, only regular codes, which have constant column and row weights, are considered. This means that each column and each row of the matrix $\mathbf{H}$ has exactly $\omega_c$ and $\omega_r$ non-zero values, respectively. These parameters define the code rate, which is the ratio between the number of message bits and the codeword length, through the equation $R=1-\frac{\omega_c}{\omega_r}$. The matrix in (4.1) shows a (2,4)-regular code with rate 0.5 and 6-bit codeword length.
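The properties stated for the matrix in (4.1) can be checked with a short Python sketch. It is only illustrative: the helper names are hypothetical, and the value returned by `design_rate` is the rate given by $R = 1 - \omega_c/\omega_r$, which presumes that all rows of $\mathbf{H}$ are linearly independent.

```python
# Parity-check matrix of Equation (4.1): rows are check nodes, columns are variable nodes
H = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 1],
]


def column_weights(h):
    return [sum(row[i] for row in h) for i in range(len(h[0]))]


def row_weights(h):
    return [sum(row) for row in h]


def design_rate(h):
    """R = 1 - wc / wr, defined here only for regular codes."""
    wc, wr = set(column_weights(h)), set(row_weights(h))
    if len(wc) != 1 or len(wr) != 1:
        raise ValueError("matrix is not regular")
    return 1 - wc.pop() / wr.pop()


def is_codeword(h, c):
    """True when every parity-check equation is satisfied (H c^T = 0 over GF(2))."""
    return all(sum(a & b for a, b in zip(row, c)) % 2 == 0 for row in h)


def tanner_edges(h):
    """List of (variable node i, check node j) pairs, one per 1 in H."""
    return [(i, j) for j, row in enumerate(h) for i, bit in enumerate(row) if bit]


if __name__ == "__main__":
    print(column_weights(H), row_weights(H), design_rate(H))
    print(is_codeword(H, [1, 1, 1, 1, 1, 1]), is_codeword(H, [1, 0, 0, 0, 0, 0]))
```

The twelve edges returned by `tanner_edges` correspond to the twelve non-zero entries of the matrix, i.e., two edges per variable node and four per check node.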
Irregular codes generally offer better correction capabilities, but they often demand a more complex hardware implementation (OH, 2008).

#### 4.2.1.3 LDPC Code Construction

The original LDPC codes (GALLAGER, 1962) were based on random sparse matrices with near-capacity performance for additive white Gaussian noise (AWGN) channels. Later, (MACKAY, 1999) rediscovered these codes and proposed an efficient approach to generate them, which guarantees that all matrix rows are linearly independent and avoids short graph cycles that compromise the code performance. Random-based parity matrices lead to complex hardware implementations, especially in the encoder. Thus, structured codes based on cyclic or quasi-cyclic arrays (RYAN; LIN, 2009; HAN, 2007) offer very low error floors with simplified hardware, although they have limited code rate and length. Hence, the circulant permutation array (CPA) approach, which uses an array of permuted identity matrices, is used to generate the quasi-cyclic LDPC (QC-LDPC) codes. Figure 4.8 shows an example of a $3\times 6$ permutation array where each square represents an $m\times m$ weight-1 circulant. In this case, the permutation array size gives the number of submatrices that compose the code and, consequently, the row and column weights. The code length is equal to $\omega_r\times m$.

Figure 4.8 – Parity check matrix constructed using the CPA approach (HAN, 2007)

Table 4.3 shows a summary of the QC-LDPC codes considered in this work. All codes were constructed aiming for the lowest message length given the block sizes recommended in the WBAN standard for the BCH FEC, to achieve a fair comparison between the FECs.
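The CPA construction can be sketched in a few lines of Python. The shift values below (row index times column index, modulo $m$) are an assumption chosen only for illustration, since the shifts actually used are not listed here; `circulant` and `build_qc_ldpc` are hypothetical helper names.

```python
def circulant(m, shift):
    """m x m identity matrix with its ones cyclically shifted by `shift` columns."""
    return [[1 if (r + shift) % m == c else 0 for c in range(m)] for r in range(m)]


def build_qc_ldpc(shifts, m):
    """Assemble H from an array of shifts, one weight-1 circulant per entry."""
    h = []
    for shift_row in shifts:
        blocks = [circulant(m, s) for s in shift_row]
        for r in range(m):
            # Concatenate row r of every circulant in this block row.
            h.append([bit for block in blocks for bit in block[r]])
    return h


# Hypothetical 3 x 6 shift array with submatrix size m = 13.
M = 13
SHIFTS = [[(i * j) % M for j in range(6)] for i in range(3)]
H_QC = build_qc_ldpc(SHIFTS, M)
```

With a $3\times 6$ array and $m = 13$, the resulting matrix has 39 rows and 78 columns ($\omega_r \times m$), with column weight 3 and row weight 6, because every circulant contributes exactly one 1 to each of its rows and columns.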
Table 4.3 – QC-LDPC Codes

| $(\omega_c,\omega_r)$ | Code Rate | Submatrix Size | Message Size |
|-----------------------|-----------|----------------|--------------|
| (3,6) | 0.5 | 13 | 78 |
| (3,8) | 0.625 | 13 | 104 |
| (3,11) | 0.727 | 13 | 143 |

Source: research group of the author.

#### 4.2.1.4 Decoding algorithms

For each bit $c_i$ in the transmitted codeword $\mathbf{c} = [c_0 \ c_1 \ \dots \ c_{n-1}]$, the decoder calculates the a posteriori probability (APP) of that bit being 0 or 1. For numerical stability and arithmetic simplicity, such probabilities are normally described in terms of the log-likelihood ratio (RYAN; LIN, 2009), as shown in Equation 4.2. This quantity is known as intrinsic information, as it depends solely on the observed symbol value and the channel characteristics.

$$LLR(c_i) \triangleq \log \frac{Pr(c_i = 0|\mathbf{y})}{Pr(c_i = 1|\mathbf{y})} \tag{4.2}$$

where $\mathbf{y}$ is the received word at the decoder input.

All state-of-the-art LDPC decoding algorithms are based on the Belief Propagation (BP) algorithm, in which CNs and VNs in the Tanner graph exchange messages and update their relative probabilities (BIROLI; MARTINA; MASERA, 2012). This message-passing approach is repeated until a valid codeword is decoded or the decoder reaches the iteration limit. Each iteration has two phases: first, the variable nodes process the inputs coming from their neighboring nodes together with the intrinsic information, and then they send messages with the computed extrinsic information. Once all VNs have sent their messages, the check nodes process the inputs coming only from their neighboring nodes before sending the computed data. In each iteration, the decoded bit is obtained through a hard decision on the computed log-likelihood ratio in each node (WANG; CUI, 2007). The number of iterations impacts the communication bit error ratio (BER) (LI et al., 2005), as it is directly related to the convergence rate of the algorithms.
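The log-likelihood ratio of Equation 4.2 and the hard decision taken on it can be illustrated with a short Python sketch. The channel model is an assumption made only for this example: BPSK mapping (0 to +1, 1 to -1) over AWGN with known noise variance, for which the intrinsic LLR is $2y/\sigma^2$. The modulation and quantization actually used in the decoders may differ.

```python
import math


def llr_from_probability(p0):
    """LLR of a bit given Pr(c_i = 0 | y) = p0, following Equation 4.2."""
    return math.log(p0 / (1.0 - p0))


def channel_llr_bpsk_awgn(y, noise_variance):
    """Intrinsic LLR for BPSK (0 -> +1, 1 -> -1) over AWGN: 2 * y / sigma^2.

    Assumption for illustration only; not taken from the thesis.
    """
    return 2.0 * y / noise_variance


def hard_decision(llr):
    """A non-negative LLR favours bit 0, a negative LLR favours bit 1."""
    return 0 if llr >= 0 else 1
```

A probability of 0.5 maps to an LLR of zero (no information), and the magnitude of the LLR grows with the reliability of the observation, which is what the soft-input decoder exploits.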
There are two main decoding algorithms: the sum-product algorithm (SPA) and its reduced-complexity version, the min-sum algorithm (MSA), with its variants. These algorithms state how CNs and VNs handle the bit information and, consequently, they dictate the hardware implementation. Reference (WANG; CUI; SHA, 2011) presents an overview of these algorithms and some insights about their implementation.

## 4.2.1.5 Decoder Architecture for QC-LDPC codes

This research requires a comparison with an architecture comparable to that of BCH; QC-LDPC is a good candidate because it offers parallelism, hardware simplicity, and decoding efficiency. Each transmitted bit can experience a voltage change during the communication process, as a consequence of channel conditions, digital-to-analog and analog-to-digital conversion, circuit attenuation, and many other causes. The message loses reliability with such changes in electrical values, and a mechanism to correct each bit is a prerequisite for the FEC CODEC. Therefore, LDPC decoders follow a soft-input, hard-output data processing approach.

The QC-LDPC architectures used in this work have message sizes chosen to balance correction performance, circuit area, and power dissipation. The interface controller, the core controller, and the processing core are the three main components of the architecture used. The interface controller handles incoming and outgoing data communication and commands. The input interface is semi-parallel to reduce pin count without compromising the throughput. The output data bus is fully parallel to provide all the received data in only one clock cycle. The core controller initiates the decoding process and is responsible for managing all the signals affecting the decoder core, in addition to managing memory addressing during each step of the decoding algorithm. The decoding process runs up to the iteration limit.
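For reference, the check-node update rules that distinguish the two algorithm families can be contrasted in a short Python sketch. This is the textbook floating-point form of the SPA (tanh rule) and of the plain MSA, given as an illustration under that assumption; it is not the fixed-point datapath of the synthesized decoders, and the function names are hypothetical.

```python
import math


def check_node_spa(incoming):
    """Sum-product rule: 2 * atanh(product of tanh(L / 2) over the other edges)."""
    outgoing = []
    for k in range(len(incoming)):
        product = 1.0
        for j, llr in enumerate(incoming):
            if j != k:
                product *= math.tanh(llr / 2.0)
        outgoing.append(2.0 * math.atanh(product))
    return outgoing


def check_node_msa(incoming):
    """Min-sum rule: product of signs times minimum magnitude of the other edges."""
    outgoing = []
    for k in range(len(incoming)):
        others = [llr for j, llr in enumerate(incoming) if j != k]
        sign = 1.0
        for llr in others:
            if llr < 0:
                sign = -sign
        outgoing.append(sign * min(abs(llr) for llr in others))
    return outgoing
```

Both rules produce messages with the same sign; the min-sum magnitude is never smaller than the sum-product one, which is why MSA variants usually add a scaling or offset correction. Replacing the hyperbolic functions by comparisons is what makes the MSA hardware simpler.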
The decoding iteration processes several bits of the message, which means that the incoming data fill the memories and a tracking resource is necessary to index them. Registers are used in the ASIC implementation, instead of static random access memories, because the LDPC codes considered in this work have lengths of up to 400 bits. As an advantage, this choice simplifies both the physical and the logical design.

## 4.2.1.6 Synthesis Flow and Implementation Methodology

The randomly generated messages pass through the scrambler to obtain the PSDU from the MPDU. Then, each FEC encoder processes the PSDU to generate the codewords according to the code constructions. Since all considered codes are block codes, it may be necessary to generate several codewords to transmit the entire packet, depending on the codeword length. All encoded messages pass through a standard interleaver and are then sent through an additive white Gaussian noise (AWGN) channel with a high signal-to-noise ratio (SNR). This approach ensures that all received messages are error-free, a necessary condition in this context, as it simplifies both the decoder verification and the hardware energy efficiency comparison, since all decoders are subject to similar conditions. Then, these messages are used as input vectors for the simulation process, generating the circuit stimuli for precise power estimation. The simulation tool reads the synthesized hardware netlist and applies these vectors to the inputs, capturing all internal transitions. Additionally, it is necessary to include the Standard Delay Format (SDF) file generated by the synthesis tool, which carries the timing information needed to reproduce the glitches on all nets, as they have an impact on the power dissipation even though they are transparent to the final computation result. At the end of this process, the simulation tool generates a Value Change Dump (VCD) file which contains the switching activity of all nodes.
Therefore, the synthesis tool can read this file and consider all registered activity to perform a realistic power estimation. All decoders were synthesized with an ASIC-oriented logic synthesis flow using the Cadence solution. This tool supports physically-aware logic synthesis, which enables a realistic power estimation as it considers wiring resistance and capacitance as well as the standard cell preplacement for routing estimation. Such an approach requires several files from the targeted technology process: the Liberty files (.LIB), which contain all information about the standard cells such as power, area, and delays; the Library Exchange Format files (.LEF), which describe the physical characteristics of the technology such as capacitance between metal layers, standard cell pin positions, and interconnection resistance data; and the Capacitance Tables (.CapTbl), which contain fine-grained capacitance values.

## 4.3 Power Comparison Related to the Modules of the FECs

To conclude this section, Table 4.4 presents a complete comparison of the quality of results (QoR) of all described FEC decoders, which were synthesized in the logic synthesis step with a frequency target of 10 MHz. The decoders were tuned to operate at frequencies that achieve the two WBAN data rates, indicated as $F_{op}$ in the same table. Each actual operating frequency $F_{op}$ can be much lower than the synthesis target in order to save power; it is determined by analyzing the decoder latency and the throughput that each design has to sustain, to ensure a fair comparison. Due to the small message lengths, the pipelined serial architectures of the BCH decoders achieve the necessary throughput with low operating frequencies and a much smaller circuit area. Conversely, LDPC decoders consume a large area due to the amount of memory required to compute the extrinsic messages in the iterative decoding algorithm.
For channels with high SNR, the bit error rate (BER) tends to be low, hence FEC codes with high rate are preferred due to their small overhead. In the considered cases, the BCH codes defined in the IEEE 802.15.6 Standard offer the most suitable choice due to the reduced power and area of their decoder implementation. For instance, the MSA-based QC-LDPC decoder is $3.3\times$ larger and consumes $3.76\times$ more power when compared to the BCH decoder with the same code rate (0.5) operating at 250 kbps. One key difference between the decoders considered herein resides in their decoding approach. While all BCH architectures use hard-decision decoders, LDPC decoders use soft messages as inputs, which translates into a considerably larger datapath and memory storage. The rate 0.5 QC-LDPC decoders considered in this work require at least 1560 bits to store both intrinsic and extrinsic messages.

Table 4.4 – Area and Power Comparison for Low Data Rates

| Code Type | Code Rate | # gates | Cell Area ($\mu m^2$) | 250.0 kbps: Leak. Power ($\mu$W) | 250.0 kbps: Dyn. Power ($\mu$W) | 250.0 kbps: Total Power ($\mu$W) | 250.0 kbps: $F_{op}$ (kHz) | 487.5 kbps: Leak. Power ($\mu$W) | 487.5 kbps: Dyn. Power ($\mu$W) | 487.5 kbps: Total Power ($\mu$W) | 487.5 kbps: $F_{op}$ (kHz) |
|-------------|-------|-------|---------|-------|-------|-------|-------|-------|-------|-------|-------|
| BCH | 0.5 | 1676 | 10050.0 | 6.06 | 1.98 | 8.05 | 259.9 | 6.06 | 3.87 | 9.93 | 506.9 |
| BCH | 0.62 | 671 | 3879.7 | 2.39 | 0.82 | 3.21 | 259.9 | 2.39 | 1.60 | 3.99 | 506.9 |
| BCH | 0.71 | 551 | 3170.4 | 1.95 | 0.69 | 2.64 | 259.9 | 1.95 | 1.34 | 3.29 | 506.9 |
| BCH | 0.81 | 203 | 1187.7 | 0.82 | 0.40 | 1.22 | 259.9 | 0.82 | 0.78 | 1.57 | 506.9 |
| QC-LDPC SPA | 0.5 | 6228 | 34731.8 | 20.82 | 11.23 | 32.05 | 333.3 | 20.82 | 21.90 | 42.72 | 650.0 |
| QC-LDPC SPA | 0.625 | 8395 | 46568.0 | 28.01 | 11.59 | 39.60 | 250.0 | 28.01 | 22.60 | 50.61 | 487.5 |
| QC-LDPC SPA | 0.727 | 11504 | 64110.0 | 38.29 | 12.07 | 50.36 | 181.8 | 38.29 | 23.54 | 61.83 | 354.6 |
| QC-LDPC MSA | 0.5 | 5973 | 33505.7 | 20.21 | 10.06 | 30.27 | 333.3 | 20.21 | 19.62 | 39.83 | 650.0 |
| QC-LDPC MSA | 0.625 | 8164 | 45198.9 | 27.35 | 10.13 | 37.48 | 250.0 | 27.35 | 19.75 | 47.10 | 487.5 |
| QC-LDPC MSA | 0.727 | 11375 | 62596.0 | 37.92 | 10.22 | 48.13 | 181.8 | 37.92 | 19.92 | 57.84 | 354.6 |

Source: The author.

# 4.4 Complementary Comparison of Hardware Energy

A comparison of power in mW is presented in Fig. 4.9 for BCH CODECs synthesized in 45 nm and 65 nm. As an example, the energy values follow from the power of each circuit and the time needed to process a given payload (bits to be transmitted). If, for a bit rate of 100 Mbps (much higher than 250 or 487.5 kbps), the consumption is around 565 $\mu$W in the decoder module for the code rate of 0.5, then **5.65 pJ/bit** is spent. This indicates that, for this specific part of the circuit in 65 nm technology, BCH coding is very energy efficient.

Figure 4.9 – Comparison of the Power in a Coder/Decoder in 45 and 65 nm (100 Mbps). Source: the author.
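The conversion from power to energy per bit used in this example is simply power divided by bit rate. The minimal Python sketch below reproduces the 5.65 pJ/bit figure and applies the same conversion to the rate 0.5 totals of Table 4.4 at 250 kbps. Treating the nominal data rate as the bit rate seen by the decoder is an assumption of this sketch, and the dictionary names are illustrative.

```python
def energy_per_bit_pj(power_uw, bit_rate_bps):
    """Energy per bit in picojoules from power in microwatts and rate in bit/s."""
    return power_uw * 1e-6 / bit_rate_bps * 1e12


# Total power (microwatts) at 250 kbps for the rate 0.5 decoders, from Table 4.4.
TOTAL_POWER_UW_250K = {"BCH": 8.05, "QC-LDPC SPA": 32.05, "QC-LDPC MSA": 30.27}
# Cell area (square micrometres) for the rate 0.5 decoders, from Table 4.4.
CELL_AREA_UM2 = {"BCH": 10050.0, "QC-LDPC MSA": 33505.7}

for name, power_uw in TOTAL_POWER_UW_250K.items():
    print(name, round(energy_per_bit_pj(power_uw, 250e3), 1), "pJ/bit")

# Ratios quoted in Section 4.3 for the MSA decoder versus BCH at rate 0.5.
power_ratio = TOTAL_POWER_UW_250K["QC-LDPC MSA"] / TOTAL_POWER_UW_250K["BCH"]
area_ratio = CELL_AREA_UM2["QC-LDPC MSA"] / CELL_AREA_UM2["BCH"]
```

Under these assumptions the energy per bit at 250 kbps is noticeably higher than at 100 Mbps, since the leakage power listed in Table 4.4 is spread over far fewer bits per second.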
Figure 4.10 – Comparison of the Power in a Coder/Decoder in 45 and 65 nm (487.5 kbps). Source: the author.

Figure 4.11 – Comparison of FECs (at 250 and 487.5 kbps). Source: the author.

For the WBAN cases, Fig. 4.10 shows the power consumption for implementations in 45 and 65 nm at a throughput of 487.5 kbps. In comparison with the previous figure (4.9), the values in mW are significantly lower. As expected, the hardware for the lower code rate (0.5) involves more gates than that for the higher one (0.81), and its power consumption is correspondingly higher. Figure 4.11 shows the power consumption considering only the decoder hardware. As can be seen, due to its architecture the LDPC power consumption increases with the code rate, whereas the BCH power consumption decreases. Between the two throughputs there is only a small variation in power.

#### 4.5 Summary

This Chapter reviewed some concepts related to Forward Error Correction, focusing on the BCH and LDPC codes. Both are currently in use, and LDPC is a strong alternative for high data rates. The BCH decoder architecture was synthesized in CMOS in this study, and it proved to be more compact and to save power and energy compared with the more robust LDPC FEC architecture. BCH proved to be a better engineering choice for WBAN applications, especially for low data rates. The reference data sought in this Chapter, obtained through power analysis, are consistent with the FEC hardware. To this end, the circuits, synthesized in 65 nm CMOS, were compared for the most complex component (the decoder) of each FEC.

#### 5 CROSS-LAYER ENERGY MODELING

This Chapter presents a proposal for the cross-layer model of the IR-UWB communication and how to derive it, based on the system architecture and on the circuit parameters studied in the previous chapters.
To obtain such a model, an interaction between more than one layer of the OSI model must exist, provided that the intended application makes it possible to generate a functional model of the system. The main characteristics of the IR-UWB system that need to be modeled in this Thesis, and which are the assumptions adopted in this Chapter, are: i) the operation model of the RX is of the wake-up radio (WUR) type, since this is much more energy-efficient than duty-cycled radios or an always-on RX; ii) a star topology, as recommended by the WBAN standard 802.15.6, is assumed for the inter-node communication; iii) the interaction between the MAC and PHY layers, which includes the communication protocol, is taken into account; iv) the RF link and the propagation losses from TX to RX are modeled at a high level; and v) the RF power consumed by the front-ends at the TX and RX needs to be estimated and included in this model. To this end, this Chapter reviews the literature data on the RX/TX power and energy consumed by RF circuits implemented in CMOS and already published by other authors. There is a wide variety of architectures for these RF circuits, but this Chapter focuses on the published UWB circuits. The duty cycling and the message receiving process are also described and modeled in this Chapter, together with communication link budget results and benchmark parameters taken from the literature for the parameters most relevant to modeling the energy consumption of the IR-UWB communication. At the end of the Chapter, the proposed model is presented and results of its application are shown.

## 5.1 Communication Aspects of PHY Layer in WBANs

The transceiver is assumed in this work to operate on ultra-wideband frequencies, since UWB is one of the three PHY modes of operation allowed in the 802.15.6 standard. The reasons presented in the contextualization in Chapter 2 lead to this choice for the PHY layer.
Furthermore, there is a range of aspects to consider in an IR-UWB transceiver related to the PHY component of a cross-layer system, such as the corresponding frequency regulation, signal characterization, specification of the equipment and its parts, and the coding employed in the communication process, in addition to its architecture, internal behavior, and network-related issues. For example, UWB has a large bandwidth (7.5 GHz in the USA), although in other countries the conventional narrowband systems can divide the spectrum into two bands. Thus, the way UWB is managed or accessed by the TX/RX has implications for the final energy efficiency model.

#### 5.1.1 IR-UWB Transceiver Architecture

In the literature (AKYILDIZ; VURAN, 2010), there are three broad categories of activities associated with system energy consumption: sensing, processing, and communication (data transfer through an external medium). The first group comprises monitoring (sensing) and its counterpart process, the interaction, *i.e.*, data gathering or biological-level control. In sensing, the biological and physical values of the patient are monitored, with an appropriate sensor for each physical quantity. Actuators, on the other hand, act according to some external data, regulating control variables of the monitored system to the desired values.

Another cause of energy consumption during operation is the processing of the collected or received information. Not all data that arrives at the processing module is useful by itself or requires re-transmission. A certain amount of data traffic is for control, signalling from devices, or even data redundancy. For this reason, considerable energy is spent on filtering, calibration or adjustment, and data routing.
The transceiver (especially its radio-frequency Front-End) completes the set of energy-consuming components and activities; it is the device responsible for the effective low-level communication. The energy of the analog part of the F-E can be added to the energy budget of the processing block and of the sensor/actuator analog interfaces. Typically, the reception module consumes more than the transmitter when all parts, from the F-E to the baseband processing, are considered. This occurs because there is much more latency at the RX end, while the transmitter consumption depends on the transmission rate, and the transmitter can be put to "sleep" in a low-power mode during silent periods of the TX.

Figure 5.1 represents one possible architecture for a sensor node with its subsystems. From left to right: the sensor blocks, the processing units, the RF transceiver (where all analog and RF Front-End components are), and the overall energy control. This system could be aided by an additional wake-up circuit (not represented in the figure), which provides an activation signal when a communication operation is requested from the RX or by the TX.

The production and storage of energy, or both, must be integrated into the hardware. For technical reasons, these functions are much more difficult to integrate within the same chip as the node. The energy consumption over time is one of the main system concerns, since the battery is gradually depleted. Depending on the system application, the lifetime goal could be years, months, or days. The energy can be generated internally by a scavenging unit or may just be drawn gradually from a battery. In the case of biomedical implants, the batteries may have different lifetimes. The aim is for the batteries to remain in operation as long as possible, avoiding frequent replacement.
Appropriate power management circuits are additional blocks that control the supply current to the different circuit blocks shown in the generic example of Figure 5.2. Analog blocks or mixed analog-and-digital blocks are required in all ICs for communication systems. Examples are the digital-to-analog converters (DAC), analog-to-digital converters (ADC), amplifiers, filters, RF oscillators, and radio-frequency mixers for signal band translations. In the higher-level modeling context treated in this Chapter, such components are not addressed directly; even the interface that contains mixed-signal design techniques and components is outside the scope of detailed treatment. Overall power parameters that comply with the current technology suffice to develop an energy consumption model.

Figure 5.1 – General Hardware Architecture of a sensor node. Source: (SUHONEN et al., 2012), adapted by the author.

Figure 5.2 – Block Diagram Example of a Transmitter. Source: the author, inspired by (WENTZLOFF, 2007) and (ROCHOL, 2012).

[Figure 5.2 labels: location of the digital design of this thesis; RF transmitter blocks; example Front-End blocks: DAC, filter, power amplifier, oscillator.]

The Channel Coder, as part of the PHY Layer, has sub-blocks (Scrambler, FEC, and Interleaver). It is considered in this proposal that the FEC is the block with the most significant energy consumption, because it is mainly responsible for adding redundancy bits, which in turn require more energy for transmission. The WBAN standard proposes the use of BCH as the FEC encoding algorithm, but an analysis of other coding types is necessary to verify the performance. LDPC, Turbo Codes, Convolutional codes (e.g., with Viterbi decoding), Reed-Solomon, and different variations of BCH coding are candidates for such an evaluation. One of the concerns is to propose alternative FEC encoding/decoding schemes that validate the energy consumption model.
For this reason, in the previous Chapter an extensive design comparison was made between VLSI implementations of both BCH and LDPC for the range of low data rates (below 0.5 Mbps) and the system parameters required by the IR-UWB applications addressed in this work.

#### 5.2 Channel Coding for WBANs

The relationship between the complexity of error correction algorithms, their degree of efficiency, and the amount of energy spent to perform their tasks must be analyzed. This is a system design parameter that can also bring more reliability to the system and directly impact the overall energy consumption.

Reliability in data transmission is a key point for enhancing the efficiency of a communication system. This reliability depends on several factors, such as robustness against errors from noise or interference, the time during which communication continues without successive repetitions, and, primarily, the encoding algorithms used. If the FEC encoding combines these features with a reduced energy cost, a versatile and efficient system becomes possible.

The FEC decoder that was designed in Chapter 4 and fully implemented in 65 nm CMOS logic, optimized for low power and low energy consumption, is shown as one sub-block of the module (CD) in the block diagram of Fig. 5.3. The FEC decoder sub-block follows the deinterleaver and decodes the original payload message, which is processed next by the message descrambler. As described in Chapter 4, in the FEC block the transmitter adds redundancy to the message. The FEC may use one of several types of encoding solutions such as LDPC, Reed-Solomon, Turbo Codes, Convolutional codes (JOHANNESSON; ZIGANGIROV, 2015), BCH, or others (RICHARDSON; URBANKE, 2001). This choice affects the system performance, its effectiveness, and its energy consumption level. The results can be expressed in terms of BER vs. SINR.
Both vary over time, according to changes occurring in the communication channel. To obtain those results by simulation, it is also important to have meaningful models for controlled error injection in the communication process. The current IEEE Standard for WBAN aims at having an inherent FEC to operate in adverse conditions, thereby reducing the effects of errors, noise bursts, and AWGN. The variables that affect the communication system, causing excessive variations and unreliability, can be modeled well (KUNST, 2009). The channel through which the transmission and reception of information occur is wireless, and it is susceptible to several error types; in practice the channel behaves in a non-stationary way. Two common types of errors that appear in such a system are burst errors (caused by strong interferers producing noise over a certain duration) and Additive White Gaussian Noise (AWGN), which has stationary properties that can be modeled. The communication also fails due to the problem of hidden points in the terrain or dynamic changes in the channel, as well as amplitude variations at the receiver end caused by RF propagation effects.

Figure 5.3 – Channel Decoding Block. Source: (ROCHOL, 2012), modified by the author.

Some results that reflect the system performance make it possible to choose, by comparison, the most appropriate FEC for the desired communication parameters, for example the relation between BER and SINR combined with the corresponding energy efficiency, as a way to reach the correct point of operation. The impact of errors on communication performance can be determined. A comparison with the current parameters that allow the quiescent point to be achieved, highlighting the best performance, is also needed.
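As a reference point for such BER comparisons, the uncoded baseline can be computed in closed form. The sketch below assumes coherent BPSK over a pure AWGN channel with independent bit errors, for which $BER = Q(\sqrt{2E_b/N_0})$; this is a textbook baseline used only for illustration and is not a model of the IR-UWB link or of the coded system. The function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x), expressed through erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_bpsk_awgn(ebn0_db):
    """Uncoded coherent BPSK over AWGN: BER = Q(sqrt(2 * Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)  # convert from dB to a linear ratio
    return q_function(math.sqrt(2.0 * ebn0))


def packet_error_rate(ber, n_bits):
    """Probability that at least one of n_bits independent bits is wrong."""
    return 1.0 - (1.0 - ber) ** n_bits


for ebn0_db in (0, 4, 8):
    print(ebn0_db, "dB ->", ber_bpsk_awgn(ebn0_db))
```

A coded curve that reaches the same BER at a lower $E_b/N_0$ than this baseline shows a coding gain, which has to be weighed against the energy spent in the decoder itself.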
An additional contribution derived from a system-level model of energy efficiency, which can be exploited from the design time of such a system, is the fact that some key features that directly impact energy (like the PHY choice, the FEC, the modulation scheme, and many other choices) can be adjusted very early in the design, and especially before the costly hardware and software development cycle starts for the communication system implementation. This approach is particularly applicable in the biomedical area, in which the limits of energy consumption for the entire system can be expressed before the circuit design starts, as in the work by (CHANDRAKASAN; VERMA; DALY, 2008).

The work in (PARK, 2014) presents a proposal of a near-Shannon-capacity decoder based on channel coding for reliable communication and storage systems. As is known, codes like turbo codes and LDPC have advanced considerably, closely approaching the channel capacity given by Shannon's limit (RAPPAPORT, 2001) and (GHAVAMI; MICHAEL; KOHNO, 2004). One of the main goals of that work was energy reduction: the decoding energy should be as small as possible with near-capacity channel codes, including LDPC codes, non-binary LDPC codes, and polar codes. As an example cited by Park, Fig. 5.4 shows Shannon's limit (Eq. 5.1) and the effectiveness of coding in reducing errors in a noisy channel (SNR represented by $E_b/N_0$ in dB). The graph shows the difference in BER for the same signal-to-noise ratio, which translates into more energy spent to transmit a given payload.

$$C = W \log_2(1 + SNR) \tag{5.1}$$

In Equation 5.1, $C$ is the channel capacity, $W$ is the bandwidth, and $SNR$ is the ratio of signal power to noise power, expressed as a linear ratio rather than in dB.

Figure 5.4 – Bit error rate comparison between uncoded and encoded systems.
Source: author, modified from (PARK, 2014)

The turbo and turbo-like FECs dominate in terms of effectiveness, achieving better error-rate performance and closely approaching Shannon's theoretical channel capacity limit for a channel affected only by AWGN. Nowadays they are incorporated in a large variety of telecommunication standards and systems (GRACIE; HAMON, 2007). Relatively old works also address this topic; for instance, (ERKIP et al., 2001) provide an analysis of channel coding considering the energy consumption model for mobile transmissions. They conclude that the optimum distortion-energy operating points depend on the location, and that the communication strategy adopted is essential to save energy, since the source coder and the channel decoder spend a considerable amount of energy, and that compression can be relied upon to obtain a better peer-to-peer communication quality.

A crucial issue is how to configure the forward error correction (FEC) in WBAN applications, primarily to work with a considerable BER in the AWGN channel. In most cases, bit flipping changes the content of the message and degrades the process. Thus, research on the most economical solution in terms of energy and code efficiency is a valid topic. An example of code rate variations in comparison with non-encoded message transmission is given in Fig. 5.5, where the probability of error for different convolutional codes is plotted against the output transmission power in dBm, for a given channel setting. "R" is fractional because it represents the ratio of the information bits to the total number of bits of the encoded message, while "K" is the respective payload.

Figure 5.5 – Error Probability for Uncoded and Code Rate variations.
Source: author, modified from (AKYILDIZ; VURAN, 2010)

Code rate variation in BCH is an important choice (KARVONEN, 2015), since BCH is mandatory in the default mode of the WBAN standard. It has proved to be a good option for FEC use. Fig. 5.6 shows a code rate variation for BCH coding, where the relation between received signal quality and bit error occurrence is explicit, whereas implicitly more energy is consumed to handle the errors using a low code rate (due to the parity bits, which are more numerous in this case).

Figure 5.6 – Bit Error Probability versus Eb/N0 at the receiver. Source: (KARVONEN, 2015).

#### 5.2.1 Energy Impact of the Protocols in MAC Level

Energy efficiency is also obtained when some conditions generate gain throughout the operation of the system. This gain results in less power consumption and can be configured at both the MAC sublayer and the PHY layer, where the cross-layer interaction makes the best effort to reduce energy.

At the MAC level presented in the IEEE 802.15.6 WBAN standard, it is possible to obtain energy efficiency optimizations, since the self-reconfigurable link layer (LLC, FEC, and MAC) can be exploited to select a better operation mode, and also in the PHY layer, as mentioned before (Channel Coder and the Front-End). So, configuring the WBAN and the RF channel are important tasks to be aware of when such optimization is sought.

The energy efficiency associated with superframes and dependencies can be represented according to the OSI (ISO, 1988) data structure, see Fig. 5.7. It shows the PPDU and MPDU for the IR-UWB based data communication. It is related to the number of attributes inserted in a system, for example, increasing the security or the effort to achieve a better SINR. Again, the general guideline is that such increases lead to more energy being consumed.
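The guideline that added overhead costs energy can be made concrete with a first-order airtime estimate: for a fixed payload, the number of bits on air grows as the inverse of the code rate, and so does the transmit energy. The Python sketch below is only an illustration under stated assumptions: the transmit power, bit rate, payload size, and header length are hypothetical values, padding and shortening of the block code are ignored, and the function names are not taken from the standard.

```python
import math


def bits_on_air(payload_bits, code_rate, header_bits=0):
    """Coded payload plus header bits (padding and shortening ignored)."""
    return math.ceil(payload_bits / code_rate) + header_bits


def tx_energy_uj(payload_bits, code_rate, bit_rate_bps, tx_power_mw, header_bits=0):
    """First-order transmit energy in microjoules: power times airtime."""
    airtime_s = bits_on_air(payload_bits, code_rate, header_bits) / bit_rate_bps
    return tx_power_mw * 1e-3 * airtime_s * 1e6


# Hypothetical example: 1000 payload bits at 487.5 kbps with a 1 mW transmitter,
# evaluated for the BCH code rates listed in Table 4.4.
for rate in (0.5, 0.62, 0.71, 0.81):
    print(rate, round(tx_energy_uj(1000, rate, 487.5e3, 1.0), 2), "uJ")
```

The same structure extends to other per-packet overheads (preamble, security fields, retransmissions), each of which lengthens the airtime and therefore the energy per delivered payload bit.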
According to the description of the data structure given by the standard, the PSDU to transmit on the UWB PHY includes the BCH parity bits or the MAC Protocol Data Unit (MPDU) in the QoS mode, and this information is present in the system; or the MPDU and the BCH parity bits in the default mode. The MAC frame body has a length of 256 bytes.

The MAC protocol is closely linked with the energy efficiency of a wireless communication system. The IEEE 802.15.6 Standard defines Slotted Aloha as a MAC protocol for IR-UWB. It determines the channel access success probability. So, the number of trials until effective communication is reached translates directly into energy consumption. The standard sets these conditions according to the MAC protocols. The success probability defined by the standard for N nodes reaches its maximum when the channel competition participation probability for a specific traffic load is 100%, *i.e.*, when "N" is equal to "1" and so is the probability; in other words, there is no competition for channel access. Otherwise, the probability is less than "1". In the best scenario, Slotted Aloha works with an optimal traffic load, in which a variable slot length and the values of the channel competition probability are directly linked with the user priority. Then, if at least these two variables are fixed, the prospects for better parametric control of the energy efficiency performance of the WBAN improve, especially considering that the hub has the capability to coordinate all its traffic.

Figure 5.7 – Data Structure over MAC level and PHY layer. Source: IEEE Std 802.15.6 (IEEE, 2012).

Regarding modulation, in a top-down analysis, Verhelst and Dehaene (VERHELST; DEHAENE, 2009) mention two examples where the choice of the algorithm impacts the power estimation. The selection of the optimal algorithm and its consequent mapping onto an architecture are cited as a strategic solution of the cross-layer design.
This also allows energy efficiency to be defined at a higher abstraction level. This mapping includes the analog-digital and hardware-software partitioning. Supposing that the implementation aspects do not influence the choice of the algorithm, BPSK is the best modulation in terms of BER performance, compared with others such as PPM or OOK, if simulation and communication theory are considered. However, without including energy efficiency as a criterion for such a system design, the final implementation may not closely match the theoretically predicted performance. One key difference is that BPSK requires coherent reception, which spends more energy, whereas PPM and OOK do not and can potentially save energy. Another reason is that the phase noise of the oscillators interferes more with BPSK communication under coherent reception than with the others, according to (VERHELST; DEHAENE, 2009). Different design aspects that influence power and performance must be considered jointly at the algorithm and architecture levels of the DSE.

In conclusion, all the architecture and algorithm choices that affect the communication protocols have to be detailed in terms of acquisition, synchronization, and data detection procedures at system-design time. The impact of the MAC protocols must be evaluated before arriving at the implementation stage and before elaborating the complete algorithmic system choices. Further design choices, such as power gating at the hardware level or the end-of-preamble sequence, which bring an additional and expected power consumption, can be adjusted by subsequent refinements of the algorithms.

## 5.2.2 Cross-Layer Energy Efficiency Model

The Akyildiz and Karvonen models (KARVONEN, 2015), (AKYILDIZ; VURAN, 2009) were developed to represent energy consumption estimates for wireless networks, based on a set of predefined model parameters (such as TX power, RX power, etc.).
The purpose of this work is to identify and act on four points of the system that demonstrably and deterministically reflect the overall system power consumption in the application scenario investigated in this thesis. The energy efficiency model can also be divided into two classes. The first uses bottom-up modeling that starts from the actual circuit design (for example, from digital circuit energy estimation up to the interface with the MAC level). The second supplies a system-level model for cross-layer analysis, inspired by existing models published in the literature for different parts of the communication system, with new insights and a new approach.

It is important to highlight the main points where energy is applied or spent, distinguishing between sources and sinks or considering the whole system. The list of energy consumption components contains, for example, the communication equipment, the peripheral equipment, and the internal devices of the equipment, related to the ICs, as well as the dissipation in the power stage and at the antenna. Other losses are part of the system and must be considered, *e.g.*, line dissipation and the part consumed through the propagation medium.

Energy and power consumption differ in representation through the time component. Energy divided by time gives the power behavior over certain periods of time, and the system operating point may differ from one time interval to another. In some sense, time can carry the system information instead of signal amplitude, which is the better option for energy-limited devices (sensor nodes, for example). From this principle, the time to execute and finish a task can be managed for lower power consumption; there is a margin to exploit for power saving, at the cost of spending more time and delaying task completion.
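The relation between energy, power, and time described above can be illustrated with a minimal Python sketch. It assumes that the operation of a radio is split into intervals of constant power; the function name `energy_and_average_power` and the numeric values (1 mW active for 1 ms, 1 µW asleep for 99 ms) are hypothetical and are not measurements from this work.

```python
def energy_and_average_power(intervals):
    """Return (total energy in J, average power in W).

    `intervals` is a list of (power_w, duration_s) pairs, one per
    operating point; each interval is assumed to have constant power.
    """
    total_energy = sum(power * duration for power, duration in intervals)
    total_time = sum(duration for _, duration in intervals)
    if total_time <= 0:
        raise ValueError("total duration must be positive")
    return total_energy, total_energy / total_time


# Hypothetical duty-cycled node: short active burst, long sleep period.
schedule = [
    (1e-3, 1e-3),   # active: 1 mW during 1 ms
    (1e-6, 99e-3),  # sleep: 1 uW during 99 ms
]
energy_j, avg_power_w = energy_and_average_power(schedule)
print(f"energy = {energy_j:.3e} J, average power = {avg_power_w:.3e} W")
```

Under these assumed values the average power is dominated by the short active burst, which is why stretching the sleep interval lowers the average power while delaying task completion.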
Therefore, some assumptions can be generally addressed in such a wireless system:

- The implementation of an IR-UWB transmitter requires an energy efficiency (Eeff) power budget for the design.
- The customization of the cross-layer approach for a specific WBAN architecture using IR-UWB aims at energy savings as part of the solution.
- It is necessary to evaluate how the synchronism or asynchronism among the nodes and the hub affects the energy consumption.

In a particular case, considering an RF-powered transceiver for a wireless sensor network, it becomes possible to establish some scenarios intended to bring satisfactory answers to these questions. In that sense, two main considerations are the cost and the quality of the final product, or QoS. Both items can be expressed in terms of the quality of the communication signal and energy efficiency. The main problem tackled is to define the minimum amount of energy necessary for a typical communication system over time. Once this information is available, it becomes possible to apply it to modulation schemes and other components of the system and, finally, to make all the comparisons for energy efficiency.

When the approach investigates energy efficiency specifically in the nodes, the components of such a system must be identified according to their critical consumption points and the ways in which the consumption occurs. In the wireless network each node can be classified as a transmitter, a receiver, or both (a transceiver), depending on the function of the node and its position as master or slave in the framework. A transmitter for biomedical network applications, with wireless communication as an attribute and focused on a personal area, is most of the time dedicated to signal monitoring. When the features of a receiver are added, new possibilities to interact with the patient arise.
These go beyond monitoring to performing procedures, for example as part of biological control cycles or delivering medicines regularly. This device can be implantable.

Fig. 5.8 shows three ways in which the information can be fitted into the radio duty-cycling. Energy is spent during the active periods of communication only. If the transmitter and the receiver are synchronized, energy waste is avoided. Otherwise, in asynchronous mode, successive retransmissions may occur over extended periods of time, until the data period matches the reception window. Both synchronous and asynchronous receivers need to be active very frequently over time (several active listen periods in Rx), but the preamble is shorter in the synchronous mode. An effective way to solve this problem in current technology is the use of Wake-up Radios (WUR).

The duty-cycling of a WUR is appropriate for achieving energy efficiency and long battery life. Works such as Karvonen *et al.* (KARVONEN, 2015) conclude that, under some circumstances, asynchronous communication is more effective for saving energy (and power). This happens when the transmission occurs only at the precise moment it is supposed to. The WUR operation is based on the principle that minimum energy is used while the system is in sleep mode, where a watchdog-style circuit keeps monitoring the reception activity. Thus, when a transmission request arrives, all other necessary circuits of the radio receiver wake up as needed, dramatically reducing the average energy consumption over time. More energy may be used in the WUR when it is active, but the extreme reduction in power during sleep periods makes this architecture an attractive solution, supposing that frequent activation of the receiver is not necessary.

Figure 5.8 – Comparison between three forms of radio operation and duty-cycling. Source: Heikki Karvonen PhD Thesis (KARVONEN, 2015), modified by the author.

Another technique in the data link layer to avoid errors is the automatic repeat request (ARQ). Forward error correction (FEC) differs from ARQ in that the message is assumed, *a priori*, to be corrupted by noise, and the transmission is not discarded (MOREIRA; FARRELL, 2006). The proposed strategy defines the role of each layer, with the MAC detecting the errors and the PHY correcting them, and the IEEE 802.15.6 standard indicates this methodology as a good option. Moreover, the standard steers toward Type II HARQ, as depicted in the flow diagram of Fig. 5.9. The message reception mechanism is processed according to the represented flow. For the enumerated blocks of the figure, Table 5.1 relates the possible states to the actions described for the communication, aiming at error-free reception. A proposed algorithm (Algorithm 1) for setting up network communication is presented. At the implementation stage, it must conform to the procedures of Type II HARQ.

MAC operations for low power must be aware of the trade-off between duty cycle and performance. For instance, high QoS requirements need to be respected, even if the communication from some nodes adds inconvenient delays. Another example that affects the format of MAC protocols, from (CHEN et al., 2011), is the heterogeneity of the sensors, which have different storage capacity, power consumption, and QoS requirements. Moreover, adjustment at run time is necessary, because a low duty cycle means lower throughput and higher packet delay.

Figure 5.9 – Type II HARQ - block diagram. Source: modified by the author from (IEEE, 2012).

Several surveys report a diverse number of simulators in the literature.
Examples of proposals that act on MAC sublayer protocols, seeking the best advantages in terms of global energy, are BANET, Human++, and CICADA (LATRÉ et al., 2011). Because of the proximity of concepts, WSN protocols can often be used in part to simulate WBAN protocols (remembering that the WPAN standard is defined by IEEE 802.15.4, whose main implementations are short-range WSNs). However, some specificities must be considered. One example is body temperature, which changes some parameters of the sensors. Ullah et al. (ULLAH et al., 2012) cite examples of routing protocols in a schematic overview that classifies them as cross-layer, temperature-aware, and cluster-based. The case of cross-layer protocols complies with our approach, but initiatives are scarce. MAC protocol simulators are intended to provide results that validate the cross-layer aspect of this work. The Controlling Access with Distributed slot Assignment protocol (CICADA) is an example of a WBAN protocol used in a cross-layer context. Launched in 2007, it was the basis for several other works.

The impact of the communication protocols was not assessed in depth in this thesis, although the resulting implications deserve more attention and study. The focus was directed to the interrelationship of the MAC with the PHY layers. However, some algorithms are proposed as an initial approach.

Table 5.1 – Type II HARQ, Flowchart Description.

| State | Description |
|-------|-------------|
| 0 | BCH coding, data packet and parity bits are available. If FCS fails, store the data packet in the receiver; go to state (5). |
| 1 | FCS (frame check sequence) and BCH (or FEC) encoding of the data packet. |
| 2 | TX of the data $(D)$ and the FCS encoded data $(Q_D)$. |
| 3 | RX of the data and $Q_D$ modified by the channel. |
| 4 | Decoding FCS from the modified $Q_D$; go to the next step to verify whether it fails (No ACK) or not (error free). |
| 5 | No ACK, send the parity bits. If FCS and FEC decoding fail, then save the data received in state (3) and go to state (12). |
| 6 | FCS coding of the BCH encoded data. |
| 7 | Transmit the FEC encoded data and the FCS encoded data from state (6). |
| 8 | RX of the FEC encoded data and the FCS codification of the BCH encoded data, modified by the channel. |
| 9 | FCS decoding of the packet generated in state (6) after it is received. |
| 10 | If there is no error, retrieve the systematic bits $D$ after the BCH decoding. |
| 11 | In case of error, apply BCH decoding to the BCH data received and to the data saved in state (5). |
| 12 | No ACK, send the data packet. If FCS and FEC decoding fail and the number of retransmissions is less than the maximum number, save the data and go to state (5). If (number = max.), then go to state (0). |
| 13 | TX of the data packet (systematic bits) and the FCS encoded $D$. |
| 14 | RX of the data from state (13). |
| 15 | FCS decoding of the data from state (14). |
| 16 | BCH decoding of the $D$ received and of the BCH encoded data saved in state (12). |

Source: the author, based on (IEEE, 2012).

Simplified versions of the algorithms that process data transmission and reception were modeled as Algorithms 2 and 3, proposed by this author. They represent a sensor node communicating with the hub, which is a Wake-up Radio (WUR). Again, only the basics are described for simple operation, without considering the complexity of a complete network. Algorithm 4 seeks to define the yield associated with the chances of success in communication, that is, how effective the reception of the transmitted messages will be, which is reflected in the amount of energy lost in unsuccessful transmissions.
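The link between success probability and energy lost in unsuccessful transmissions can be sketched as follows. This is a hedged illustration, not the algorithm proposed in this work: it assumes that every attempt succeeds independently with the same probability `p_success` and that the sender gives up after `max_attempts` tries. All function names and the per-frame energy argument are hypothetical.

```python
def expected_attempts(p_success, max_attempts):
    """Mean number of transmissions when retries stop at max_attempts.

    Assumes independent attempts with equal success probability, which
    gives the truncated geometric mean (1 - q**N) / p with q = 1 - p.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be in [0, 1]")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    if p_success == 0.0:
        return float(max_attempts)  # every allowed attempt is used
    q = 1.0 - p_success
    return (1.0 - q ** max_attempts) / p_success


def delivery_probability(p_success, max_attempts):
    """Probability that the frame is delivered within the retry limit."""
    return 1.0 - (1.0 - p_success) ** max_attempts


def expected_energy(frame_energy_j, p_success, max_attempts):
    """Mean energy spent per frame, delivered or not (hypothetical model)."""
    return frame_energy_j * expected_attempts(p_success, max_attempts)


for p in (1.0, 0.9, 0.5):
    print(p, expected_attempts(p, 4), delivery_probability(p, 4))
```

With a perfect channel a single attempt is enough; as the success probability drops, the mean number of attempts grows toward the retry limit and so does the energy spent per frame.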
In this proposal, the four algorithms work in a coordinated way, so that the handshaking (using ACK and NACK signals) between the network nodes and the coordinator functions based on the transmission of valid messages. If there is a failure, there will be either a new transmission of packets or a retransmission of the whole message. The notation used in the algorithms is intended to be as intuitive as possible, with each name translating directly to what it refers to. At the next level of the work it would be necessary to merge them with the Type II HARQ algorithm from the standard (Fig. 5.9) to obtain a complete model. Although work on this subject is very important and deserves more extensive research, it was decided not to pursue it at this moment. Work with algorithms and duty-cycling requires specific attention, and although such research would undoubtedly result in energy optimization, it would overload the investigation. This may be addressed as future work to improve the model.

**Algorithm 1** Network Communication

```
procedure InitiateSetup
    sensorData ← false
    activeWUR ← false
Initiation:
    if sensorData == true then
        if longTime == false then
            activeWUR ← true
        else
            initiateTx ← true
    else
        remainSensing ← true
WURactivation:
    if activeWUR == true then
        Rx ← activeMode
    else
        Rx ← sleepMode
    if activeMode == true then
        Tx_ACK ← true
```

Source: the author.

**Algorithm 2** In the Transmitter (Sensor)

```
procedure InitiateTransmission
RepeatTxPacket:
    if End_of_Tx == false then
        if Rx_ACK == false then
            packetTx ← packetMemory
        else
            RPT_FRAME ← true
            INC_NSUCCESS ← true
        if Rx_NACK == true then
            if N_RPT_FRAME > N_SETTED then
                VERIFY_EFFICIENCY()
                RPT_TX
            else
                RPT_FRAME ← true    (increments N_RPT_FRAME)
        goto RepeatTxPacket
```

Source: the author.

**Algorithm 3** In the Receiver (WUR)

```
procedure ReceivingUntilSuccess
    if RX_MODE == true then
        PROCESS_MSG_RX ← true
Receiving:
    if (PROCESS_MSG_RX AND verifyFRAME) == true then
        Tx_NACK ← false
    else
        Tx_NACK ← true
ProcessMSG:
    if sizeMSG <= End_of_Tx then
        packetMemory ← packetRx
    if End_of_Tx == false then
        RX_MODE ← true
    else
        timeActive ← true until (X_ms)
    goto ProcessMSG
    if X_ms == settedTIME then
        sleepMODE ← true
    else
        activeMODE ← true
    goto Receiving
```

Source: the author.

**Algorithm 4** Compute Efficiency

```
procedure ToComputeSuccessProbability
    if N_TOTAL_FRAMES == N_MSG then
        successTx ← true
        EFFICIENCY ← 100%
    else
        if RPT_FRAME == true then
            EFFICIENCY ← (N_TOTAL_FRAMES / N_MSG)
        if RPT_MSG == true then
            EFFICIENCY ← (1 / N_RPT_MSG)
```

Source: the author.

The contribution of this thesis at the MAC level is a proposal of how to proceed to manage the physical level. The algorithms presented convey the notion of the energy impact to be transferred to the modeling of the link equation.

#### **5.3** Link Budget Influences

In a cross-layer approach, each level of the system (from top to bottom) is modeled, which allows the energy necessary for operation to be evaluated, with the link at the top of the model. *Link budget quality*, under the influence of the communication channel, is conditioned basically by both the error and the associated energy. Making communication efficient in the sense of lower energy consumption is very relevant for wireless sensor networks (WSN). This condition essentially depends on the amount of error, which affects the probability of success. In WBAN-based applications, a short-range link must be established with lossless communication, or at least with minimal losses.
The bit error rate (BER) and the packet error rate (PER) are functions that estimate the error according to the channel behavior. They are broadly investigated in the literature in terms of the complementary error function (*erfc*):

$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-u^{2}}\, du \tag{5.2}$$

The BER for a digital source with coherent BPSK/QPSK is expressed as:

$$BER = \frac{1}{2}\, \mathrm{erfc}\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{5.3}$$

For D-BPSK modulation, the non-coherent method used in the WBAN standard, the BER simplifies to:

$$BER = \frac{1}{2}\, e^{-\frac{E_b}{N_0}} \tag{5.4}$$

The error function is generated from the tail of the **normal distribution**, and *erfc* is its complementary form:

$$\mathrm{erfc}(x) \equiv 1 - \mathrm{erf}(x), \qquad P(X > x_0) = \int_{x_0}^{\infty} \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x-m)^2}{2\sigma^2}}\, dx = \frac{1}{2}\, \mathrm{erfc}\left(\frac{x_0 - m}{\sigma\sqrt{2}}\right) \tag{5.5}$$

Eq. 5.5 computes the probability from $x_0$ to infinity in the tail of the normal distribution, where $m$ is the mean and $\sigma$ is the standard deviation of the distribution.

Figure 5.10 depicts the effect of the signal-to-noise ratio in a WBAN environment simulation, in terms of the respective energy per bit. The comparison presents two types of FEC (convolutional and RS) and the uncoded version (used in (IEEE, 2012)).

Figure 5.10 shows three cases for the radio link. The first is the error response due to the channel quality for a modulation similar to that of the standard (DBPSK), without FEC coding. Compared with this, note the gain of the Reed-Solomon FEC (similar to BCH). Finally, the convolutional coding curve (one of the possible uses in WBAN networks) performs worst at low SNR, as shown by the green curve.

Figure 5.10 – FEC mechanisms versus SNR for WBAN specification. Source: the author.
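Equations 5.3 and 5.4 can be evaluated numerically with a short Python sketch, given here only as an illustration of the uncoded AWGN expressions. The function names and the chosen $E_b/N_0$ grid are hypothetical; the sketch does not reproduce the simulations behind Fig. 5.10.

```python
import math


def db_to_linear(value_db):
    """Convert a ratio expressed in dB to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def ber_coherent_bpsk(ebn0_db):
    """Eq. 5.3: BER = 0.5 * erfc(sqrt(Eb/N0)) for coherent BPSK/QPSK."""
    return 0.5 * math.erfc(math.sqrt(db_to_linear(ebn0_db)))


def ber_dbpsk(ebn0_db):
    """Eq. 5.4: BER = 0.5 * exp(-Eb/N0) for non-coherent D-BPSK."""
    return 0.5 * math.exp(-db_to_linear(ebn0_db))


# Illustrative grid of Eb/N0 values in dB (an assumption of this sketch).
for ebn0_db in range(0, 13, 2):
    print(f"{ebn0_db:2d} dB  BPSK={ber_coherent_bpsk(ebn0_db):.3e}  "
          f"DBPSK={ber_dbpsk(ebn0_db):.3e}")
```

For every positive $E_b/N_0$ the non-coherent expression gives a higher BER than the coherent one, which is the performance price paid for the simpler, lower-energy receiver.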
There is a growing number of research results due to the expansion of bioengineering and related multidisciplinary areas. In wireless systems, for example, there are several concerns about power consumption. As is known (BI; HO; ZHANG, 2015), battery replacement is a problem in bio-implants. Solutions come from magnetic coupling, wireless energy transfer (WET) technology, capacitive energy generation (IEEE, 2012), or charge points. Notably, the information collected is not recorded in the system but is passed to other entities that perform some processing and assessment. The protocol used by these entities combines energy and routing awareness in the cross-layer model, between the physical and data link layers (PHY and MAC) (AKYILDIZ; VURAN, 2010). Hence, a wireless network becomes a viable alternative for data transmission between the sensor elements and the decision module, but with associated power consumption and a managed stack.

Fig. 5.11 shows the simulation results for a complete data stream of 118,272 bits encoded in BCH, with the coding varied over four code rates. The channel is assumed to have AWGN noise. Fig. 5.12 is a synthesis of a large number of results, showing that the curves gradually settle, following a damping trend.

The curves show that the number of errors falls with increasing SNR. In the simulations, uncorrected errors could generate peaks representing a higher error rate in the communication process, and consequently they can cause packet retransmission. It is observed that, above 13 dB in the considered range, there are cases without errors (all errors are fixed) for a code rate equal to 0.5 (or BCH(62,51,2)).

Figure 5.11 – Simulation results with a SNR variable. Source: the author.

Figure 5.12 – BCH Code Rates for AWGN Channel (BER vs. $E_B/N_0$ ). Source: the author.

Table 5.2 – Hardware Power Consumption per Code Rate.
| Code Rate | 250 kbps | 487.5 kbps |
|-----------|----------|------------|
| 0.5 | 8.05 $\mu$W | 9.93 $\mu$W |
| 0.62 | 3.21 $\mu$W | 3.99 $\mu$W |
| 0.71 | 2.54 $\mu$W | 3.29 $\mu$W |
| 0.81 | 1.22 $\mu$W | 1.57 $\mu$W |

Source: the author.

The curves indicate that a small increase in SNR ensures the reduction of the PER. Furthermore, the power results presented in Table 4.4 show that the lower code rate (0.5) has a higher consumption, but there is a trade-off because it is more likely to minimize errors and retransmissions. The highest code rate (0.81) has the lowest power consumption, but it remains more susceptible to retransmissions due to errors. Table 5.2, a synthesis of the results for the BCH case, shows the respective power decay of each hardware implementation as a function of the code rate.

## 5.3.1 Link for the WBAN (Distance vs. TX/RX Power)

Since IEEE 802.15.6 mandates a star topology, the placement of several transceivers working together under a single coordinator can be modeled. For the channel condition, relevant issues including attenuation and gains influence the consumption of the system, such as the level of transmitting power and, mainly, the body itself (movements). Consequently, the system energy efficiency depends on the quality of the channel and the effects on the link. The investigation of such near-field communication effects, as well as the influence of the signal-to-noise ratio, is a particular concern for the design of this kind of network. The simulation conditions used are:

1. The variable power of the WBAN according to the number of nodes, from 3 to 10;
2. The total power for distances inside a cubic space of 3 $m^3$, based on the Monte Carlo method and random distribution;
3. Graphical comparison of curves with and without the body interference approach in short-range communication.

Fig. 5.13 represents the space between a point and a plane (to simplify the star topology). The position was randomly chosen in this example, simulating the connection between the nodes and the coordinator of the WBAN to give values of the link budget.

Figure 5.13 – Hub to a plane composed of three nodes, example. Source: the author.

The group of equations (5.6) computes the distance between the hub and the node in 3-D space, using the three-coordinate system (x, y, z):

$$d' = \sqrt{(x' - x'')^2 + (y' - y'')^2}$$

$$d = \sqrt{(z' - z'')^2 + (d')^2} \tag{5.6}$$

An estimate of the PSD according to the distance, with 3.49 GHz as minimum and 9.98 GHz as maximum inside the UWB window of 7.5 GHz, can be obtained from Equations 5.7. 3494 MHz is the minimum central frequency in the low band for UWB operation and 9984 MHz is the maximum central frequency in the high band. The figure shows a model for the variation of the received power as a function of frequency and distance. Given transmitter ($h_t$) and receiver ($h_r$) antenna heights of 50 cm and a transmitted power ($P_t$) of 5 W; the speed of light ($c$) and the respective free-space wavelengths ($\lambda$ for 3.494 GHz and 9.984 GHz); a transmitter gain ($g_t$) and a receiver gain ($g_r$) of 1.5 and 1, respectively; and a distance varying from 1 to 3 m, the electric fields ($e_0$ and $e_1$) and the power at the receiver ($P_r$) are calculated by Equations 5.7, modified from (AKKAŞLI, 2009). Some free-space simulation parameters of this model are given in Table 5.3.
$$e_{0} = \frac{\sqrt{P_{t} \cdot g_{t}}}{d}$$

$$e_{1} = 2 \cdot e_{0} \cdot \sin\left(\frac{2 \pi \cdot h_{t} \cdot h_{r}}{\lambda \cdot d}\right)$$

$$P_{r} = \left(\frac{e_{1}}{0.001}\right)^{2} \cdot \frac{g_{r} \cdot \lambda^{2}}{480 \cdot \pi} \tag{5.7}$$

Table 5.3 – Simulation Parameters to model the Power in the Receiver

| Parameter | Value |
|-----------|-------|
| Height of the Tx Antenna | 0.5 m |
| Height of the Rx Antenna | 0.5 m |
| Transmitter Power | 5 W |
| Tx/Rx Gain | 1 |
| Distances | 1 to 3 m |
| Frequencies | 3.49 and 9.98 GHz |

Source: the author.

The data points in Fig. 5.14 are generated by the Monte Carlo method in the simulations, where the colored areas represent the transmission power variations for three or ten nodes. For the case of 10 nodes (blue points), the effect of the path loss was considered, showing a higher power demand in the transmitter. Variations in the number of nodes are possible, but the purpose of the plot is to trace the behavior between the nodes and the borders, making it possible to infer the performance due to location.

As a result of this approach, which involves computing the distance and the required transmitted power, systems representing WBANs with 3 to 10 nodes were simulated in Matlab scripts, with random distributions in space, as shown in Figures 5.14 and 5.15. Figure 5.14 shows the relative power levels required from the TX for different distances between hub and node, under different conditions (number of nodes and path losses). The absolute power levels are not shown in this figure, as the power level (in dBm) would have to be adjusted to respect the FCC or regulatory masks required for indoor operation of the WBAN. Fig. 5.15 replicates the results but limits the variations to an area, which represents the differences between the arrangements of each topology.
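The distance and received-power expressions of Equations 5.6 and 5.7 can be transcribed into a minimal Python sketch. It is not the Matlab script used for the figures: the hub position, the cube side of 3 m, the random seed, and all function names are assumptions made for illustration, while the default antenna heights, gains, and transmitted power are the values quoted in the text.

```python
import math
import random

C = 299_792_458.0  # speed of light in m/s


def distance(hub, node):
    """Eq. 5.6: planar distance d' first, then the 3-D distance d."""
    d_plane = math.sqrt((hub[0] - node[0]) ** 2 + (hub[1] - node[1]) ** 2)
    return math.sqrt((hub[2] - node[2]) ** 2 + d_plane ** 2)


def received_power(d, freq_hz, p_t=5.0, g_t=1.5, g_r=1.0, h_t=0.5, h_r=0.5):
    """Eq. 5.7 transcribed term by term (no unit conversion is added)."""
    lam = C / freq_hz                       # free-space wavelength
    e0 = math.sqrt(p_t * g_t) / d
    e1 = 2.0 * e0 * math.sin(2.0 * math.pi * h_t * h_r / (lam * d))
    return (e1 / 0.001) ** 2 * g_r * lam ** 2 / (480.0 * math.pi)


def random_nodes(count, side_m, rng):
    """Uniform random node positions inside a cube (assumed geometry)."""
    return [tuple(rng.uniform(0.0, side_m) for _ in range(3))
            for _ in range(count)]


rng = random.Random(1)      # fixed seed, chosen only for repeatability
hub = (0.0, 0.0, 0.0)       # hypothetical hub position
for node in random_nodes(3, 3.0, rng):
    d = distance(hub, node)
    for freq in (3.494e9, 9.984e9):
        print(f"d={d:.2f} m  f={freq / 1e9:.3f} GHz  "
              f"Pr={received_power(d, freq):.3e}")
```

Repeating the loop over many random placements gives a Monte Carlo estimate of how the received power spreads with distance for each frequency.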
The presented values of transmission power are as expected, following the rule that the transmitted power has to be higher when the target nodes are further away and the desired power at the RX needs to be maintained. When the nodes are closer, it is easier to communicate and to operate with received power levels well above the RX sensitivity. In the aforementioned figures the free-space propagation model is assumed. From the apparent asymptotic variation of the curves, it is possible to infer that NLOS requires more power than LOS, and that a large cluster of nodes needs more power for communication from the coordinator than a cluster in a shorter range.

Figure 5.14 – Relative Power TX vs. Sink-Sources distances.

#### 5.4 **Benchmarking and Roadmapping**

The purpose of this section is to present the reference values that delimit the parameters in the implementations. A summary of the results existing in the literature makes it possible to propose a benchmark in this section, using the results for comparison and aiming at model improvements.

The most important parameters can be highlighted as metrics for setting the "energy/bit or pulse" milestones of the circuit design. How to calculate the energy consumed in the communication (the amount of pJ per bit in the TX/RX process) and how to use the comparison criteria are also aims of this proposal.

The system specifications are taken from state-of-the-art reports, where a roadmap of works on system energy consumption gives some possibilities, from the underlying physical level up to the power update provided by the MAC level as a consequence of coordination and management. Thus, drawing on the literature, a new and reliable energy efficiency model focused on WBAN applications using IR-UWB is the main subject of this proposal.
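The energy-per-bit metric mentioned above can be sketched as average power divided by data rate. The snippet is a hedged illustration: the function names are hypothetical, and the two operating points used (1,640 µW at 1 Mbps and 750 µW at 2 Mbps) are taken from Table 5.5 only to check the arithmetic.

```python
def energy_per_bit_pj(power_uw, data_rate_mbps):
    """Energy per bit in pJ: 1 uW / 1 Mbps = 1e-6 W / 1e6 bit/s = 1 pJ/bit."""
    if data_rate_mbps <= 0:
        raise ValueError("data rate must be positive")
    return power_uw / data_rate_mbps


def energy_per_bit_nj(power_uw, data_rate_mbps):
    """Same metric expressed in nJ per bit."""
    return energy_per_bit_pj(power_uw, data_rate_mbps) / 1000.0


for power_uw, rate_mbps in ((1640.0, 1.0), (750.0, 2.0)):
    print(f"{power_uw} uW at {rate_mbps} Mbps -> "
          f"{energy_per_bit_nj(power_uw, rate_mbps):.3f} nJ/bit")
```

When a bit is coded with more than one pulse, the energy per pulse is lower than the energy per bit by that factor, so the two metrics should not be compared directly.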
There are reference power values that represent the whole system, since it works based on two transceivers and the channel over which the data or signal communication proceeds. The references are placed to aid in the formulation of a global cross-layer model, expressing the sum, and also the average, of the parameters established by the PHY-MAC interrelations. The final goal is to improve the energy efficiency model.

The following tables (5.4 to 5.8) present an energy survey of some published chips designed by several research groups developing and fabricating chips for UWB systems. The data serve as reference values for the energy and show the degree of optimization achieved by RF chip designers over the years, using several commercial CMOS technologies. In the end, the aim of investigating improvements for a global cross-layer system is to obtain lower power consumption when operating under similar conditions, and the following comparison is validated against the summarizing tables.

Figure 5.15 – Fitted transmission power area according distances. Source: the author.

Table 5.4 – Performance summary of UWB chipsets.

| Works | 1 | 2 | 3 | 4 | 5 | 6 | Wentzloff |
|--------------------|------|------|------|------|-----------|-----------|--------------|
| Year | 2006 | 2003 | 2004 | 2005 | 2006 | 2007 | 2007 |
| CMOS Tech (nm) | 180 | - | - | 180 | 130 | 90 | 90 |
| PRF (GHz) | 0.4 | 1 | 1.4 | 1 | 0.08 | 0.4992 | 0.1 - 0.0167 |
| Bandwidth (GHz) | 2 | 1.25 | 1.6 | 1.5 | 3.5 | 0.5 | 0.528 - 0.55 |
| Center Freq. (GHz) | 4.1 | 6.2 | 4.1 | 4 | 3 | Multi | Multi |
| Modulation | PPM | - | BPSK | QPSK | PPM, BPSK | PPM, BPSK | PPM, DB-BPSK |
| Energy/Pulse (pJ) | 190 | 240 | 56 | 105 | 125 | 40 | 43 - 313 |

**Citations:**

- 1 (ZHENG et al., 2006)
- 2 (FONTANA; RICHLEY; BARNEY, 2003)
- 3 (UWB-FORUM, 2004)
- 4 (IIDA et al., 2005)
- 5 (SMAINI et al., 2005)
- 6 (RYCKAERT et al., 2007)

Source: (WENTZLOFF, 2007).
Table 5.5 presents some current works on UWB transceivers. One of the reasons for advances in low power is the scaling of CMOS technology, but the bit rate and the modulation are also associated with the energy decrease.

Table 5.5 – Roadmap of the Wireless Integrated Circuits and Systems Group.

| Works | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|-------------------|------|-------|--------|------|-------|-------------------|-------|--------|--------|-------|
| Year | 2009 | 2011 | 2012 | 2013 | 2013 | 2013 | 2013 | 2013 | 2014 | 2014 |
| CMOS Tech (nm) | 130 | 90 | 65 | 180 | 130 | 90 | 65 | 65 | 65 | 65 |
| Data Rate (Mbps) | 19 | 1 | 1 | 0.03 | 0.001 | $50 \sim 100$ | 2 | 2,000 | 500 | 1 |
| Modulation | - | OOK - | | | - | - | - | PPM | - | PPM | - | - | PPM | FSK |
| Min. Range (m) | < 10 | - | < 0.01 | - | - | - | - | 0.075 | 0.4 | - |
| Power (μW) | - | 1,640 | 290 | 37 | 6,600 | - | 750 | 98,000 | 13,300 | 8,000 |
| Energy/Pulse (nJ) | 0.07 | 1.64 | 0.29 | 1.23 | 0.088 | $0.012{\sim}0.02$ | 0.375 | 0.049 | 0.026 | 8 |

**Citations:**

- 1 (HELLEPUTTE; GIELEN, 2009)
- 2 (CREPALDI et al., 2011)
- 3 (GAMBINI et al., 2012)
- 4 (BROWN et al., 2013)
- 5 (MEHRA et al., 2013)
- 6 (EBRAZEH; MOHSENI, 2013)
- 7 (VIGRAHAM; KINGET, 2013)
- 8 (SILIGARIS et al., 2013)
- 9 (GENG et al., 2014)
- 10 (CHEN et al., 2014)

Source: (WENTZLOFF, 2017).

One main remark is that the PRF (pulse repetition frequency) of the measured and published chips can vary from a few kHz or tens of kHz up to the GHz range (Table 5.4). The PRF has to be controlled to match the FCC mask bounds.

In Table 5.8 the works numbered from 1 to 6 are reported by (SAPUTRA, 2012), taking data from other authors. The works (7-9) achieved a good energy performance in nJ per bit. Work 2 shows the difference between the 80 nJ/bit spent at the RX and the 320 nJ/bit spent at the transmitter.
Meanwhile, it is known that in full radio operation the reception time is generally longer than the transmission time, so more energy is spent over the total reception period. Table 5.10 presents a summary of the power from the other tables in this chapter, as well as some reference values, *i.e.*, the average working power of the circuits. For the sake of simplicity, the RX and TX energy values are mixed. However, it is easy to distinguish them, because the RX mode consumes more energy than the TX mode.

It is important to distinguish between the energy spent per pulse and the energy spent per bit. An equivalence criterion must be established, because variations in the results that seem negligible at first glance can become significant after a complete evaluation. This case was partially addressed in Section 3.3.1, where one or more pulses are coded to represent the binary data.

Table 5.6 – Performance summary of IR-UWB (Tx/Rx).

| Works | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | (HU et al., 2009) |
|---|---|---|---|---|---|---|---|---|---|
| Year | 2005 | 2006 | 2006 | 2007 | 2006 | 2007 | 2007 | 2007 | 2009 |
| Technology (nm) | 180 | 180 | 130 | 180 | 180 | 90 | 90 | 180 | 130 |
| $Pulse_{Rate}$ (MHz) | 500 | 400 | 160 | 16.7 | 200 | 50 | 16.7 | 100 | 2500/250 |
| BW (GHz) | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 6/7 |
| $E_{Tx}$ (pJ/Pulse) | 210 | 190 | 62.5 | 43 | - | - | - | - | 25 |
| $E_{Rx}$ (pJ/Pulse) | - | - | - | - | 405 | 2160 | 2500 | 144 | 190 |

**Citations:**

- 1 (IIDA et al., 2005)
- 2 (ZHENG et al., 2006)
- 3 (SMAINI et al., 2005)
- 4 (WENTZLOFF et al., 2007)
- 5 (ZHENG et al., 2006)
- 6 (HE; ZHANG, 2007)
- 7 (WENTZLOFF et al., 2007)
- 8 (PHAN; KRIZHANOVSKII; LEE, 2007)

Source: (HU et al., 2009).

Table 5.7 – Performance of the IR-UWB ($7^{th}$ derivative).
| Works | * | A | B | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|---|---|
| Year | 2012 | 2008 | 2007 | 2013 | 2016 | 2017 | 2017 | 2018 |
| Technology (nm) | 180 | 180 | 180 | 180 | 180 | 28 | 130 | 65 |
| PRF (MHz) | 100 | 100 | 30 | 3 | 0.2 | - | 100 | 4 |
| BW (GHz) | 8.5 | 7.5 | 4.1 | 3 | - | - | $1.8 \sim 3.5$ | 2.5 |
| Area $(\mu m^2)$ | 73x75 | 240x560 | 500x800 | | | | | 260x820 |
| Sensitivity (dBm) | | | | - | -54 | - | -85.8 | - |
| Power/Pulse (mW) | 3.8 | 3.6 | 16 | 0.33 | $\approx 0.04$ | - | 14.7 | 0.48 |
| Energy/Pulse (pJ) | 4.6 | 1.5 | 3 | - | - | $14 \sim 24$ | 146 | 48 |

**Citations:**

- A (NEKOOGAR, 2005)
- B (NORIMATSU et al., 2007)
- 1 (HUANG et al., 2013)
- 2 (SHI et al., 2016)
- 3 (STREEL et al., 2017)
- 4 (VAUCHE et al., 2017)
- 5 (HAAPALA; HALONEN, 2018)

Source: A & B - \*(NETO; MOREIRA; NOIJE, 2012); 1 to 5 the author.

Table 5.11 is a reference in terms of the power consumption of radio chips available in the market. Their frequencies and bands are very different from those of IR-UWB, as these chips target the mass market for ISM and WSNs. They are intended for low-power operation, with a low-speed MCU and on-chip buffers to adapt the incoming data rate of the communications.

Table 5.8 – Performance summary of FM-UWB (Tx/Rx).
| Works | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Year | 2006 | 2005 | 2008 | 2006 | 2010 | 2009 | 2010 | 2007 | 2010 |
| Technology (nm) | 180 | 180 | 90 | 130 | 180 | 90 | 180 | 90 | 130 |
| Modulation | OOK | OOK | UWB | BFSK | BFSK | OOK | BFSK | UWB | UWB |
| Rate (Mbit/s) | 1 | 0.005 | 15.6 | 0.3 | 0.125 | 0.1 | 0.25 | 0.1 | 1.3 |
| Sensitivity (dBm) | - | -100.5 | - | - | - | -72 | -86 | -99 | -55 |
| $Power_{max}$ (dBm) | -11.4 | -4.4 | -16.4 | -5 | -5.2 | - | - | - | - |
| $Power_{Tx}^{DC}$ (mW) | 3.8 | 1.6 | 4.36 | 1.12 | 1.15 | - | - | - | - |
| $Power_{Rx}^{DC}$ (mW) | - | 0.4 | - | - | - | 0.052 | 0.215 | 35.8 | 3.3 |
| Energy (nJ/bit) | 3.8 | 80-320 | 0.28 | 2.3 | 9.2 | 0.52 | 0.84 | 2.5 | 3.3 |

**Citations:**

- 1 (DALY; CHANDRAKASAN, 2006)
- 2 (OTIS; CHEE; RABAEY, 2005)
- 3 (MERCIER; DALY; CHANDRAKASAN, 2008)
- 4 (COOK et al., 2006)
- 5 (AYERS et al., 2010)
- 6 (PLETCHER; GAMBINI; RABAEY, 2009)
- 7 (AYERS; MAYARAM; FIEZ, 2010)
- 8 (LEE; CHANDRAKASAN, 2007)
- 9 (HELLEPUTTE et al., 2010)

Source: (SAPUTRA, 2012).

Table 5.9 – Performance and Comparison of References.
| Works | 1 | 2 | 3 | 4 | 5 | 6 | (ISSA et al., 2017) |
|---|---|---|---|---|---|---|---|
| Year | 2014 | 2015 | 2011 | 2014 | 2011 | 2009 | 2017 |
| Technology (nm) | 130 | 180 | 180 | 180 | 90 | 180 | 180 |
| BW (GHz) | 2 | 0.52 | 2 | 2 | 1.9 | 1.25 | 2 |
| Freq (GHz) | $3.5 \sim 4.5$ | 4 | 4 | 4 | 3 | $7.25 \sim 8.5$ | 4 |
| Rate (Mbps) | 0.1 | 250 | 10 | 1 | $125 \sim 500$ | 0.1 | 200 |
| Sensitivity (dBm) | -89.3 | -80 | -79 | -81 | -64 | -89 | -54 |
| $Power_{RX}$ (mW) | 14.4 | 36 | - | 129 | - | 5.4\* | 15.56 |
| $E_{TX}$ (pJ/b) | 196 | - | 350 | 671 | 90 | - | 7.2 |
| $E_{RX}$ (pJ/b) | 144 | - | 6200 | 3540 | 90 | 200 | 6.4 |

\* Energy detector only.

**Citations:**

- 1 (SPARROW et al., 2014)
- 2 (WU et al., 2015)
- 3 (GAO et al., 2011)
- 4 (ZHENG et al., 2014)
- 5 (HU et al., 2011)
- 6 (GEROSA et al., 2009)

Source: (ISSA et al., 2017).

Table 5.10 – A Summary of the Energy from the Previous Tables.
| Table no. | **5.4**<sup>1</sup> (2007) | **5.5**<sup>1</sup> (2017) | **5.6**<sup>1</sup> (2009) | **5.7**<sup>1</sup> (2012) | **5.8**<sup>2</sup> (2012) | **5.9**<sup>1</sup> (2017) |
|---|---|---|---|---|---|---|
| 1) | 40 | 12 | 25 | 1.5 | 280 | 6.4 |
| 2) | 43 | 26 | 43 | 3 | 520 | 7.2 |
| 3) | 56 | 49 | 62.5 | 4.6 | 840 | 90 |
| 4) | 105 | 70 | 190 | 14 | 2,300 | 144 |
| 5) | 125 | 88 | 210 | 24 | 2,500 | 196 |
| 6) | 190 | 290 | 144 (Rx) | 48 | 3,300 | 200 |
| 7) | 240 | 375 | 190 (Rx) | 146 | 3,800 | 350 |
| 8) | - | 1,230 | 405 (Rx) | - | 9,200 | 671 |
| 9) | - | 1,640 | 2,160 (Rx) | - | 80,000 | 3,540 |
| 10) | - | 8,000 | 2,500 (Rx) | - | - | 6,200 |
| $Energy_{min}$ (pJ)<sup>3</sup> | 40 | 26 | 25 | 3 | 280 | 7.2 |
| $Energy_{max}$ (pJ)<sup>3</sup> | 240 | 1,640 | 405 | 18.72 | 3,800 | 671 |
| $Energy_{avg}$ (pJ) | 114.14 | 420 | 134.63 | 48 | 1,934.29 | 236 |

<sup>1</sup> energy/pulse; <sup>2</sup> energy/bit; <sup>3</sup> considered.

Source: the author (a survey). (SUHONEN et al., 2012).

Table 5.11 – Characteristics of off-the-shelf ICs.

| Company | Microchip | Nordic | sRF | Semtech | Texas Inst. |
|---|---|---|---|---|---|
| Data Rate (Mbps) | 0.25 | 0.05 - 2 | 0.15 - 0.576 | 0.064 - 0.152 | 0.076 - 0.5 |
| Band (GHz) | 2.4 | 0.433, 0.915, 2.4 | 0.433, 0.868 | 0.433, 0.915 | 0.433, 0.915, 2.4 |
| Buffer (B) | 128 | 32 | - | - | 64 - 128 |
| $I_{Sleep}$ $(\mu A)$ | 2 | 0.9 - 2.5 | 0.7 | 0.2 | 0.2 - 1 |
| $I_{Rx}$ $(mA)$ | 18 | 12.3 - 19 | 3.8 - 7 | 6 - 14 | 9.3 - 18.8 |
| $I_{Tx}$ $(mA)$ | 22 | 11.3 - 13 | 10 - 12 | 11 - 33 | 10.4 - 21.2 |
| Energy Rx (nJ/bit) | 264 | 17 - 750 | 52 - 313 | 516 - 650 | 93 - 406 |
| Energy Tx (nJ/bit) | 216 | 18 - 840 | 36 - 99 | 276 - 281 | 99 - 226 |

Source: (SUHONEN et al., 2012).

Table 5.11 for off-the-shelf ICs lists fabless or IDM semiconductor companies: Microchip (MC), Nordic Semiconductor (NS), sRF Monolithics (RFM), Semtech (SE), and Texas Instruments (TI). The current consumption values (in mA) are specified at the lowest band and for 0 dBm transmission power. The table indicates that the data rate and the frequency band have only a small effect on current consumption. The energy per bit is computed for a supply voltage of 3.0 V. Suhonen et al. (SUHONEN et al., 2012) also conclude that the most energy-efficient chips use narrow bands at 2.4 GHz, a direct consequence of their high data rates.

Beyond the diversity of the works presented, the parameters characterize the differences between them. These works help to build a context of the main characteristics of IR-UWB, and such a benchmark will be useful in future work. Any attempt to normalize them must consider several aspects inherent to each one. For instance, the chosen modulation type implies some variance in the throughput and in the consequent power consumption, as do other variables such as the CMOS technology adopted.
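As a rough check of how the energy-per-bit rows of Table 5.11 relate to the current and data-rate rows, the following minimal sketch computes $E = I \cdot V / R$ for the Microchip column, using the 3.0 V supply stated above. The function name and the unit handling are illustrative assumptions and are not taken from (SUHONEN et al., 2012).

```python
def energy_per_bit_nj(current_ma: float, supply_v: float, rate_mbps: float) -> float:
    """Energy per bit in nJ, assuming E = I * V / R (hypothetical helper)."""
    power_w = current_ma * 1e-3 * supply_v   # P = I * V, in watts
    rate_bps = rate_mbps * 1e6               # data rate in bit/s
    return power_w / rate_bps * 1e9          # J/bit converted to nJ/bit


# Microchip column of Table 5.11: 18 mA and 22 mA at 0.25 Mbps, 3.0 V supply.
e_18ma = energy_per_bit_nj(18.0, 3.0, 0.25)
e_22ma = energy_per_bit_nj(22.0, 3.0, 0.25)
print(f"{e_18ma:.0f} nJ/bit at 18 mA, {e_22ma:.0f} nJ/bit at 22 mA")
```

The two results, 216 and 264 nJ/bit, are the two energy values listed for Microchip in Table 5.11.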
## 5.5 Enhanced Cross-Layer Proposal

As cited by (OPPERMANN; HAMALAINEN; IINATTI, 2004), UWB has many features that make it attractive. IR-UWB can exploit these characteristics as a result of its potentially low complexity and low cost, its noise-like signal spectrum, its resilience to severe multipath and jamming, and its outstanding time-domain resolution, which allows for location and tracking applications.

Therefore, in a cross-layer model it is possible to adjust the energy required by each level, from the top system level down to the bottom modules. On the one hand, the system level manages packets and deals with significant amounts of energy; on the other hand, the circuit or gate level deals with the small units of energy that ultimately make up the whole scenario. An error at the latter level impacts the former, just as an error in the application layer results in wasted performance.

There are various interdependencies between the layers in a cross-layer design, particularly between the MAC and the PHY. For a general-level approach, and considering the protocol stack of a hierarchical network (*e.g.*, a WSN), assumptions are made for further energy improvements, such as a constant bit error probability. FEC coding, such as LDPC and Turbo codes, is an option to achieve this purpose (KARVONEN, 2015).

In this enhanced cross-layer proposal, some points of the model are adjusted to improve its accuracy according to the data organized in each chapter of this work; the table of these values is shown in the following subsection.

#### **5.5.1** Energy Efficiency Model (PHY Level)

This proposal begins with the introductory theory of communication systems (CARLSON; CRILLY, 2010), where the energy and power theorems are expressed as follows:

1) Parseval's Power Theorem

Eq. 5.8 refers to a periodic signal whose squared magnitude is averaged over one period $T_0$; the result is its average power.
$$P(t) = \frac{1}{T_0} \int_{T_0} |signal(t)|^2 dt \tag{5.8}$$

2) Rayleigh's Energy Theorem

In the frequency domain, Eq. 5.9 gives the energy of the signal as the integral of its squared magnitude spectrum over all frequencies.

$$E(f) = \int_{-\infty}^{+\infty} |Signal(f)|^2 df \tag{5.9}$$

Equation 5.10 simply states the value of the instantaneous power consumption P(t):

$$P(t) = \frac{d(E(t))}{dt} \tag{5.10}$$

The inverse also holds, as given by Eq. 5.11, where the power is integrated from the initial time $(t_i)$ to the final time $(t_f)$.

$$E_{TOTAL} = \int_{t_i}^{t_f} P.dt = P_i.t_i + \dots + P_f.t_f \tag{5.11}$$

These are general principles of signal analysis, whose application can also be extended to the hardware (processing) to obtain the power demanded and, consequently, the energy spent. For convenience, hereafter $|signal(t)|^2$ is referred to as the energy E(t), or merely E, and likewise P denotes the power in time.

CMOS circuits have three main sources of power dissipation when switching at frequency $f_{clk}$: switching power, short-circuit power, and static power (CHANDRAKASAN; BRODERSEN, 1995). The charging and discharging of the capacitances during the switching process is represented by $P_{switching}$ in the average power of Eq. 5.12, which also includes the dynamic power resulting from short-circuit currents ($P_{SC}$) and the static power ($P_{static}$) during operation, given a supply voltage ($V_{DD}$) and an average short-circuit current ($I_{CC}$) per transition at the output of a CMOS gate.

$$P_{avg} = P_{switching} + P_{SC} + P_{static}$$

$$P_{avg} = \alpha.C_L.f_{clk}.V_{DD}^2 + \beta.I_{CC}.V_{DD} + I_{static}.V_{DD} \tag{5.12}$$

Eq. 5.12 expresses the dependence of the circuit power on key CMOS VLSI parameters. It is also worthwhile to consider in such a formulation the effects of temperature, aging, frequency, and the critical circuit paths.
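The three terms of Eq. 5.12 can be evaluated separately, as in the minimal sketch below. All numerical values are hypothetical, chosen only to illustrate orders of magnitude for a single gate; they are not taken from the surveyed chips.

```python
def cmos_average_power(alpha, c_load, f_clk, v_dd, beta, i_cc, i_static):
    """Return (P_switching, P_SC, P_static) in watts, following Eq. 5.12."""
    p_switching = alpha * c_load * f_clk * v_dd ** 2   # capacitive switching
    p_short_circuit = beta * i_cc * v_dd               # short-circuit term
    p_static = i_static * v_dd                         # leakage (static) term
    return p_switching, p_short_circuit, p_static


# Hypothetical operating point: 10 fF load, 100 MHz clock, 1.2 V supply.
nominal = cmos_average_power(alpha=0.1, c_load=10e-15, f_clk=100e6, v_dd=1.2,
                             beta=0.05, i_cc=20e-6, i_static=10e-9)
# Same gate with half the supply voltage (other parameters kept fixed).
scaled = cmos_average_power(alpha=0.1, c_load=10e-15, f_clk=100e6, v_dd=0.6,
                            beta=0.05, i_cc=20e-6, i_static=10e-9)

print("P_avg at 1.2 V:", sum(nominal), "W")
print("switching power ratio (0.6 V / 1.2 V):", scaled[0] / nominal[0])
```

Because the switching term depends on $V_{DD}^2$, halving the supply voltage in this sketch reduces that term to one quarter, whereas the other two terms scale only linearly with $V_{DD}$ when the currents are held fixed.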
Frequency and critical paths are especially relevant for complex synchronous digital circuits, which have literally millions of possible signal propagation paths inside the chip, each with a different switching activity (denoted by $\alpha$ in Eq. 5.12). A simplified expression is given in Eq. 5.13:

$$P_{total} = P_{dyn} + P_{dc} \tag{5.13}$$

Taking into account the physical characteristics of the digital IC, where N digital nodes coexist, each with its own capacitive load $C_{L_i}$, the model can be extended to Eq. 5.14.

At the transistor/circuit level, direct current (DC) flows through the CMOS transistor network, contributing to $P_{SC}$ and $P_{static}$ in Eq. 5.12. There are also other technology-related effects, *e.g.*, the power dissipation and the leakage associated with the cell design. Power is dissipated during certain time intervals, especially during input level transitions: for a very short time during the transition of the input signal, the NMOS and PMOS transistors conduct simultaneously, which creates a direct DC path that conducts current to ground. Leakage is a static effect due to subthreshold transistor currents, reverse-biased diodes, and gate tunneling through very thin gate insulators.

The switching power, the short-circuit estimate, and the DC power can be summed over all the nodes of the digital circuit, as given by:

$$P_{total} = \sum_{i=1}^{N} \alpha_{i}.f_{clk}.C_{L_{i}}.V_{DD}^{2} + \sum_{i=1}^{N} \beta_{i}.f_{clk}.V_{DD} + \sum_{i=1}^{N} \gamma_{i}.I_{CC}.V_{DD} \tag{5.14}$$

The logic synthesis tool can accurately estimate the static and dynamic components based on the circuit behavior and the library characterization files.
Hence, it uses Equation 5.15 to compute the power dissipation, where $\alpha_i$ is the node switching activity factor, $C_{L_i}$ is the node load capacitance, N is the number of logic nodes, and $f_{clk}$ is the circuit clock frequency. Further, $\beta_i$ is a factor that relates the input signal transition time to the cell peak short-circuit current. Alternatively, the leakage power can be taken as the DC component, as follows:

$$P_{total} = P_{dyn} + P_{leak} = \sum_{i=1}^{N} \alpha_i . f_{clk} . C_{L_i} . V_{DD}^2 + \sum_{i=1}^{N} \beta_i . f_{clk} . V_{DD} + P_{leak} \tag{5.15}$$

Additionally, as a metric to allow comparisons at the PHY layer, Ammer and Rabaey (AMMER; RABAEY, 2006) propose the Energy-per-Useful-Bit (EPUB) metric, given by Eq. 5.16. $P_{TX}$ and $P_{RX}$ are the power of the transmitter and of the receiver. $B_D$ and $B_P$ are, respectively, the average numbers of data and preamble bits in a packet. $\xi$ is determined by the MAC scheme and represents the ratio of the mean time spent in the receive mode to that spent in the transmit mode. T is the time in seconds.

$$EPUB = \left[\frac{B_D + B_P}{B_D}\right].(P_{TX} + \xi.P_{RX}).T \tag{5.16}$$

Eq. 5.16 is used by other authors as a reference for comparison (VERHELST; DEHAENE, 2009), (NODA et al., 2013). This equation is very compact, and it oversimplifies the overall view of the communication system. It needs to be adapted for a new, more comprehensive energy efficiency model. Karvonen et al. (KARVONEN; IINATTI; HÄMÄLÄINEN, 2013) also present in their research an in-depth model (PHY components related to the ICs).
Focusing on a specific point, the model used by Karvonen et al. for the basic energy calculation of the FEC operations alone (the bottom level of the physical layer, assuming a single adder and a single multiplier used repeatedly to compute the error corrections) is as follows:

$$\epsilon_{mult} = 0.4 \cdot \left(m^2 + \frac{3 \cdot (m-1)^2}{2}\right) \text{pJ}; \qquad \epsilon_{add} = 0.4 \cdot m \ \text{pJ}; \qquad \epsilon_{inv} = 8 \cdot m \ \text{pJ} \tag{5.17}$$

In Eq. 5.17, m represents the number of bits of the message and is determined by the Galois field (GF), which has a finite number of elements. $\epsilon_{add}$ and $\epsilon_{mult}$ represent the energy consumption of the addition and multiplication operations required by the algorithm, whereas $\epsilon_{inv}$ refers to the energy consumption of the inversion operation. All of them are components needed in the computation of the communication process, and they form the lowest level from which a complete analysis of the system starts.

#### 5.5.1.1 Power Analysis for WBANs

Power dissipation in sensors can be divided into three components: data acquisition, processing, and communication. Usually, the communication module consumes most of the power, as it includes both the physical and the data link layers (POTTIE; KAISER, 2000). Hence, reducing data retransmission is essential to respect the sensor power budget. Although there are several optimization techniques for the layers, as mentioned earlier, they are limited in terms of energy benefits.

The energy consumption model can be fully described by Equation 5.18 when a complete communication cycle is in place (AKYILDIZ; VURAN, 2010). The power dissipation can thus be divided into three modules: the transmitter front-end $(P_{Tx})$, the transmitter power amplifier $(P_{out})$, and the receiver circuitry $(P_{Rx})$. It is worth noting that the power drawn by the output amplifier is in fact adjusted to be proportional to the distance between the source and sink nodes.
In a wake-up WBAN radio system, the sensors spend time starting up internal circuits (such as local oscillators, finite state machines, etc.) to be able to manage the data exchange. Hence, the total time frame during which the sensor is active depends on this initialization time ($t_{startup}$) and on the data exchange time ($t_{on}$), which is directly proportional to the message size.

$$\begin{aligned} E_{sensor} = {} & P_{Tx} \cdot (t_{startup-Tx} + t_{on-Tx}) \\ & + P_{out} \cdot t_{on-Tx} + P_{Rx} \cdot (t_{startup-Rx} + t_{on-Rx}) \end{aligned} \tag{5.18}$$

This energy model can be refined according to the power consumed by each circuit in the Tx and Rx, using a logic-level or a circuit-level model. Traditionally, power dissipation in digital CMOS circuits is divided into static and dynamic components. The former is due to the leakage current of each transistor (NARENDRA; CHANDRAKASAN, 2006) and depends on the logic circuit, which is usually synthesized automatically from an HDL (hardware description language) design. The dynamic power component has two dissipation sources: 1) the charge and discharge of the active load and 2) the short-circuit current. These sources depend mainly on the circuit activity and on the operating clock frequency.

Specifically, the FEC decoders considered in this work contribute to the $P_{Rx}$ power dissipation component, and they also impact $P_{Tx}$ through their effectiveness in reducing retransmissions caused by unrecovered errors. The overall communication latency is affected by the data rate and by the number of errors. The system is designed to correct these errors; those that remain uncorrected cause additional retransmission cycles. Each FEC architecture has a different latency, depending on its mode of operation, its type, and its internal processing. This was shown in the previous sections, where the power consumption of different FEC decoders was evaluated for WBAN communication systems.
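A minimal sketch of Eq. 5.18 is given below. The power levels, start-up time, data rate, and message size are hypothetical placeholders chosen only to make the example concrete, and the helper `on_time` simply encodes the statement that $t_{on}$ is proportional to the message size.

```python
def on_time(message_bits: int, data_rate_bps: float) -> float:
    """Data exchange time, assumed proportional to the message size."""
    return message_bits / data_rate_bps


def sensor_energy(p_tx, p_out, p_rx,
                  t_startup_tx, t_on_tx, t_startup_rx, t_on_rx):
    """Energy of one communication cycle in joules, following Eq. 5.18."""
    e_tx_front_end = p_tx * (t_startup_tx + t_on_tx)   # transmitter front-end
    e_amplifier = p_out * t_on_tx                      # power amplifier
    e_receiver = p_rx * (t_startup_rx + t_on_rx)       # receiver circuitry
    return e_tx_front_end + e_amplifier + e_receiver


# Hypothetical values: 1000-bit message at 1 Mbps, 100 us start-up time.
t_on = on_time(1000, 1e6)
e_cycle = sensor_energy(p_tx=10e-3, p_out=1e-3, p_rx=20e-3,
                        t_startup_tx=100e-6, t_on_tx=t_on,
                        t_startup_rx=100e-6, t_on_rx=t_on)
print(f"E_sensor = {e_cycle:.2e} J")
```

With these placeholder inputs the start-up intervals account for roughly a tenth of the front-end and receiver terms, which illustrates why the start-up time matters for short messages.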
According to Akyildiz and Vuran (AKYILDIZ; VURAN, 2010), it is possible to model the energy consumption ($E_c$) in a system by the following equations:

$$E_c = E_{tx}(k, d) + E_{rx}(k) \tag{5.19}$$

$$E_{tx}(k,d) = E_{tx-elec}.k + e_{amp}.k.d^n \tag{5.20}$$

$$E_{rx}(k) = E_{rx-elec}.k \tag{5.21}$$

where $E_{tx}(k,d)$ is the energy consumption of the transmitter (Tx) and $E_{rx}(k)$ is the energy consumption of the receiver (Rx). k is the number of bits in a packet, d is the distance from Tx to Rx, and n is the path loss exponent of the channel. $E_{tx-elec}$, $E_{rx-elec}$, and $e_{amp}$ represent, respectively, the energy consumption per bit in the Tx, in the Rx, and in the power amplifier, the latter contribution also being a function of the distance.

A detailed energy model is also presented in (AKYILDIZ; VURAN, 2010), designed for a multilevel quadrature amplitude modulation (MQAM) scheme, where the energy consumption is expressed in terms of the packet length (L), the number of bits per symbol (b), the channel bandwidth (B), the receiver noise figure $(N_f)$, the power spectral density $(\sigma^2)$, the probability of bit error $(P_b)$, the power gain factor $(G_d)$, the circuit power consumption $(P_c)$, the frequency synthesizer power consumption $(P_{syn})$, the frequency synthesizer settling time $(T_{tr})$, the transceiver on time $(T_{on})$, and the modulation parameter (M).

$$E_c = E_{c1} + E_{c2} \tag{5.22}$$

$$E_{c1} = (1+\alpha) \cdot \frac{4}{3} \cdot N_f \cdot \sigma^2 \cdot \frac{2^b - 1}{b} \cdot \ln\left(\frac{4 \cdot (1 - 2^{-\frac{b}{2}})}{b \cdot P_b}\right) \cdot G_d \tag{5.23}$$

$$E_{c2} = \frac{P_c.T_{on} + 2.P_{syn}.T_{tr}}{L_{packet}} \tag{5.24}$$

$$\alpha = \frac{\xi}{\eta} - 1; \qquad \xi = 3.\frac{\sqrt{M} - 1}{\sqrt{M} + 1}; \qquad M = 2^{\left(\frac{L}{B.T_{on}}\right)} \tag{5.25}$$

Equations 5.22 to 5.25 are an example of a model with other parameters, where $\alpha$, $\eta$, and $\xi$ are components of the main $E_c$ equation, based on the power efficiency and on the modulation parameters.
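The simpler first-order model of Eqs. 5.19 to 5.21 can be sketched as follows. The per-bit energies, the amplifier coefficient, and the path loss exponent used in the example are hypothetical values, not figures from (AKYILDIZ; VURAN, 2010) or from the tables of this chapter.

```python
def tx_energy(k_bits, distance_m, e_tx_elec, e_amp, n):
    """Transmitter energy in joules, following Eq. 5.20."""
    return e_tx_elec * k_bits + e_amp * k_bits * distance_m ** n


def rx_energy(k_bits, e_rx_elec):
    """Receiver energy in joules, following Eq. 5.21."""
    return e_rx_elec * k_bits


def consumed_energy(k_bits, distance_m, e_tx_elec, e_rx_elec, e_amp, n):
    """Total energy E_c in joules, following Eq. 5.19."""
    return (tx_energy(k_bits, distance_m, e_tx_elec, e_amp, n)
            + rx_energy(k_bits, e_rx_elec))


# Hypothetical parameters: 50 nJ/bit electronics, 100 pJ/bit/m^2 amplifier, n = 2.
params = dict(e_tx_elec=50e-9, e_rx_elec=50e-9, e_amp=100e-12, n=2)
e_10m = consumed_energy(k_bits=2000, distance_m=10.0, **params)
e_20m = consumed_energy(k_bits=2000, distance_m=20.0, **params)
print(f"E_c at 10 m: {e_10m:.2e} J, at 20 m: {e_20m:.2e} J")
```

In this sketch only the amplifier term grows with distance, so doubling d with n = 2 quadruples that term while the electronics terms stay constant.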
In Eqs. 5.22 to 5.25, b keeps the meaning given above and equals $\log_2(M)$.

Additionally, an energy efficiency model including a cross-layer evaluation is presented by (AKYILDIZ; VURAN, 2009). Examples of military applications cited by that work are Smart Dust, sniper detection, and surveillance network systems. In a cross-layer approach, the energy efficiency of a bottom-up system model depends on the specification from the physical layer (PHY) to the medium access control sublayer (MAC), and so on, depending on the case. Some authors have presented new proposals on this topic, *e.g.*, (KARVONEN, 2015). For this work, the interest lies in the formulation developed there and in the instructive way in which the results are obtained.

A basic premise is that the energy per pulse or the energy per bit is needed to obtain the total consumption for data transfer in the communication procedures. It is necessary to distinguish the two in terms of equivalence to make sense of the system. Simply put, if one pulse corresponds to one transmitted bit, then the energy of both is equivalent. Otherwise, several pulses may be needed to build a semantic unit of information, be it a bit, a byte, or what is also known as a chip (a piece of data information) (ANTREICH; NOSSEK, 2011).

Another work that describes an energy efficiency model is presented by Karvonen in "A cross-layer energy efficiency optimization for the IEEE 802.15.6 standard" (KARVONEN, 2015). The communication in a WBAN is evaluated using the Bose-Chaudhuri-Hocquenghem (BCH) code for forward error correction and on-off keying modulation, showing results of energy efficiency for different payload lengths and codings. The signal-to-noise ratio (SNR) in the additive white Gaussian noise (AWGN) channel is another important aspect considered, since it affects the energy efficiency.
The energy of the communication across the medium, represented by the RF link, can be calculated from the following equations:

$$\begin{aligned} E_{link}^{i} = {} & (N_{tx,i} - 1).(T_{packet}^{i}.(P_{tx,RF}^{i} + P_{tx,circ}) + \epsilon_{enc}^{i} + T_{ACKW}.P_{rx}) \\ & + T_{packet}^{i}.(P_{tx,RF}^{i} + P_{tx,circ}) + \epsilon_{enc}^{i} + T_{ACKW}.P_{rx} \\ & + N_{tx,i}.(T_{packet}^{i}.P_{rx} + \epsilon_{dec}^{i}) + T_{ACK}.(P_{tx,RF}^{i} + P_{tx,circ}) \end{aligned} \tag{5.26}$$

$$\begin{aligned} E_{link}^{i} = {} & ((N_{tx,i} - 1) + 1) \cdot (T_{packet}^{i} \cdot (P_{tx,RF}^{i} + P_{tx,circ}) + \epsilon_{enc}^{i} + T_{ACKW} \cdot P_{rx}) \\ & + N_{tx,i} \cdot (T_{packet}^{i} \cdot P_{rx} + \epsilon_{dec}^{i}) + T_{ACK} \cdot (P_{tx,RF}^{i} + P_{tx,circ}) \end{aligned} \tag{5.27}$$

Figure 5.16 – Variation of the link energy in the WBAN, depending on the distance and nodes.

According to Karvonen *et al.*, the variables of Eq. 5.26 are defined as follows: $\epsilon_{enc/dec}$ is the energy consumption of encoding/decoding, which depends on the codeword length, the error correction capability, and the energy consumption of the computational operations; $T^i_{packet}$ is the transmitted frame duration for the code rate, as a function of the synchronization header, the physical layer header, and the PSDU time duration; $T_{ACKW}$ is the wait time for an acknowledgment message, including its time duration $T_{ACK}$; and $N_{tx,i}$ is the average number of transmissions required for a successful packet reception. The remaining terms are the RF power consumption of the transmitter ($P^i_{tx,RF}$) and the power consumption of the transmitter and receiver circuitry ($P_{tx,circ}$ and $P_{rx}$).

Table 5.12 summarizes the reference values taken from the literature and from the results of the previous chapter for the FEC decoder and encoder. The parameters that are important to the energy model developed by the author are highlighted in Table 5.12. The average power of the receiver and of the transmitter are shown in the first two columns of this table.
The energy per bit transmitted/received is shown in the next two columns, based on hardware designs by several other authors. In our energy model we propose to use the value of 175 pJ/bit at both the TX and the RX. The energy consumption for the TX-RX link (first line of Table 5.13, $E_{link}$) was estimated by our model equations to be around 180 $\mu$J, which is smaller than the 288 $\mu$J estimated by Karvonen (2015) for the scenario considered by that author. Some aspects are taken into account, such as the total number of bytes of the PSDU in this example, which impacts the frame duration. As noted, the energy spent by the decoder is crucial in the overall energy budget of the system.

Table 5.12 – Interesting Power and Energy values.

| Ch. | Power_RX (mW) | Power_TX (mW) | Energy_RX | Energy_TX | References |
|---|---|---|---|---|---|
| | 2.75 | $5.9 \sim 12.3$ | - | 9.7 nJ/bit | (KUO et al., 2016) |
| Ch. 2 | $4.8 \sim 6.5$ | $4.6 \sim 14.5$ | - | 10.7 nJ/bit | (WONG et al., 2013) |
| | 20 | $1.7 \sim 2.5$ | - | 1 nJ/bit | (IEEE, 2012) |
| | - | $9.63 \sim 53.64$ | 10.7 nJ/bit | - | (BARRAS, 2010) |
| | - | 1 | 290 pJ/\* | 25 pJ/\* | (GAMBINI et al., 2012) |
| Ch. 3 | - | - | - | 4.7 pJ | (TUAN-ANH et al., 2007) |
| | - | - | - | 357 nJ | the author |
| | - | - | 175 pJ \*\* | 175 pJ \*\* | |
| Ch. 4 | $0.001 \sim 0.01$ | - | 5.65 pJ/pulse | - | the author |
| | $0.03 \sim 0.062$ | - | - | - | |
| | - | - | 3.8 nJ/bit | 1.5 pJ/pulse | (NETO; MOREIRA; NOIJE, 2012) |
| Ch. 5 | - | - | - | 4.6 pJ | (SUHONEN et al., 2012) |
| | - | - | - | 17 nJ | |

\* bit equal to pulse; \*\* 2 ns pulse.

Source: The author and references.

Table 5.13 – Simulation parameters.
| Parameter | Notation | Proposed Value | Karvonen (2015) |
|---|---|---|---|
| Energy Consumption for Tx-Rx Link | $E_{link}$ (J) | 1.8E-4 | 2.88E-4 |
| Expected Number of Transmissions | $N_{tx}(i)$ | 5 | 5 |
| RF Power Consumption | $P_{tx,RF}(i)$ (W) | 3.7E-5 | 3.7E-5 |
| Transmitter Circuit Power Consumption | $P_{tx,circ}$ (W) | 0.01 | 0.002 |
| Receiver Power Consumption | $P_{rx}$ (W) | 0.02 | 0.02 |
| Transmitter UWB frame duration | $T_{packet}(i)$ (s) | 1E-3 | 1.97E-4 |
| Duration of the Acknowledge Packet | $T_{ACK}(i)$ (s) | 1E-4 | 1.5E-4 |
| Synch. Header duration | $T_{SHR}(i)$ (s) | 2E-5 | 4.03E-5 |
| PHY header durations | $T_{PHR}(i)$ (s) | 4E-5 | 8.21E-5 |
| Short interframe spacing | pSIFS | 3E-5 | 7.5E-5 |
| Energy of the Decoder | $E_{dec}(i)$ (J) | 3E-6 | 5E-5\* |
| Energy of the Encoder | $E_{enc}(i)$ (J) | 1E-6 | 1.9E-7\* |
| Number of information bytes | $\lambda(i)$ | 255 | - |

Source: The author and (KARVONEN, 2015).

Figure 5.16 is a plot generated from Eq. 5.27, showing that the energy required for communication increases with the number of nodes and with the distance from the source to the sink. Most of the parameters used are specified in Table 5.13. The model assumes a proportionality between the number of nodes and the power emitted to carry out the communications. The plot is a synthesis of the energy behavior of the model, already considering the values proposed by the author for the parameters of Table 5.13. The values used by Karvonen (2015) are shown in the respective column of the same table. These equations represent a cross-layer model for energy, in the context of a WBAN, from the bottom to the top levels.
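To make Eq. 5.27 concrete, the sketch below evaluates it with the proposed values of Table 5.13. The table does not list $T_{ACKW}$, so the value used here (the acknowledgment duration plus pSIFS) is an assumption made only for this illustration, as are the class and function names; the expanded form of Eq. 5.26 is included to check that both expressions agree.

```python
from dataclasses import dataclass


@dataclass
class LinkParams:
    n_tx: float = 5          # expected number of transmissions
    t_packet: float = 1e-3   # frame duration (s)
    p_tx_rf: float = 3.7e-5  # RF power consumption (W)
    p_tx_circ: float = 0.01  # transmitter circuit power (W)
    p_rx: float = 0.02       # receiver power (W)
    t_ack: float = 1e-4      # acknowledgment duration (s)
    t_ackw: float = 1.3e-4   # ASSUMPTION: T_ACK + pSIFS, not given in Table 5.13
    e_enc: float = 1e-6      # encoder energy (J)
    e_dec: float = 3e-6      # decoder energy (J)


def link_energy(p: LinkParams) -> float:
    """Link energy in joules, compact form of Eq. 5.27."""
    p_tx = p.p_tx_rf + p.p_tx_circ
    per_attempt = p.t_packet * p_tx + p.e_enc + p.t_ackw * p.p_rx
    return (p.n_tx * per_attempt
            + p.n_tx * (p.t_packet * p.p_rx + p.e_dec)
            + p.t_ack * p_tx)


def link_energy_expanded(p: LinkParams) -> float:
    """Same quantity written as in Eq. 5.26 (retransmissions + last attempt)."""
    p_tx = p.p_tx_rf + p.p_tx_circ
    per_attempt = p.t_packet * p_tx + p.e_enc + p.t_ackw * p.p_rx
    return ((p.n_tx - 1) * per_attempt + per_attempt
            + p.n_tx * (p.t_packet * p.p_rx + p.e_dec)
            + p.t_ack * p_tx)


e_link = link_energy(LinkParams())
print(f"E_link = {e_link:.3e} J")
```

With these inputs the sketch evaluates to about 1.84E-4 J, which is of the same order as the 1.8E-4 J listed in Table 5.13; the exact figure depends on the assumed $T_{ACKW}$.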
The FEC decoder is one contributor to the $P_{RX}$ component, and it also impacts the $P_{TX}$ through its effectiveness to reduce retransmissions caused by unrecovered errors. The new model is adjusted according to the perspectives of this proposal. The idea is to form a group of equations to express the relationship between operations and the amount of energy consumed, although this relation can vary depending on the CMOS technology and other factors. At this moment, the study is divided into pulses comparison (gaussian, square, and spheroidal) and the amount of data needed for communication in a simple WBAN model, which in their turn also have a percentage of energetic impact in the model. #### 5.5.1.2 Considerations about the Power Analysis - A New Approach When in a network it is considered a star topology with several nodes that communicate with a central coordinator, simplifying only to the mode of processing based on the circuits, involving the core of the digital hardware, it becomes useful to describe a model that unifies the power consumption of all the set as follows. In equation 5.28, $E_T^{Proc}$ is the total processing energy of the digital circuits, including the CODEC of the FEC, $E_{TX}^{Proc}$ is regarding to the minimum energy to remain the Hub in an active state (or the transmitter energy spent in digital processing); $E_{RX}^{i,Proc}$ is the energy in static mode for a group of nodes of the WBAN (or the receiver energy spent in digital processing). The equation 5.29 represents the total energy required for the processing, considering the setup time, where the $(P.T)_i^{setup,RX}$ and $(P.T)_{setup,TX}$ represent, respectively, the power multiplying the setup time in the receiver and the transmitter. Thus, $E_T^{Proc}$ is the Eq. 5.28 modified. $$E_T^{Proc} = E_{TX}^{Proc} + \sum_{i=1}^{m} E_{RX}^{i,Proc}$$ (5.28) $$E_T'^{Proc} = E_{TX}^{Proc} + (P, T)_{setup, TX} + \sum_{i=1}^{m} (E_{RX}^{i, Proc} + (P.T)_i^{setup, RX})$$ (5.29) To model Eq. 
5.30 the RF equipment modules are considered, as well as the channel effect, the analog circuit of the Front-End such as Power Amplifier and the antenna. The necessary energy for communication. $$E_T^{Com} = E_{RX}^{Com} + \sum_{j=1}^n E_{TX}^{j,Com}$$ (5.30) $$E_T^{Com} = E_{RX}^{Com} + \sum_{j=1}^{n} E_{TX}^{j,Com} + (P.T)_{setup}^{Com}$$ (5.31) Equation 5.31 represents the total energy of the communication for some network, including the energy spent in the setup stage. It is the component that differs this equation from the previous one (Eq. 5.30). The required energy to put the communication system ready is expressed by the term " $(P.T)_{setup}^{Com}$ ". Additionally, the energy required to provide communication should take into account the sensitivity level of the receiver $(S_T^{RX})$ , link energy losses (e.g., path losses, "PL"), SINR and, eventually, signal fluctuations. That results in the (losses due to the communication link, $E_{PL}$ ), and the $E_T^{TX}$ is the minimum threshold for communication. For this reason, Eq. 5.32 proposes to aggregate these factors. $$E_T^{TX} = S_T^{RX} + E_{PL} (5.32)$$ Further, to complete the model Eq. 5.33 has also proposed an approach to incorporate the analog circuit $(E_{RX}^{hw})$ of each element of the system. $E_{RX}^{Proc}$ and $E_{TX}^{Proc}$ , respectively, are the processing energy for the receiver and the transmitter using another formulation from another point of view. " $\delta$ " is complementary to " $\phi$ ", in such a way that $\delta << \phi$ due the power consumption of the decoder is reduced when compared with the other parts of the equipment (much less when the encoder is considered). $$E_{TX}^{Proc} = \delta \cdot E_{TX}^{enc} + \phi \cdot E_{TX}^{hw}$$ $$E_{RX}^{Proc} = \delta \cdot E_{RX}^{dec} + \phi \cdot E_{RX}^{hw}$$ $$(\delta = 1 - \phi)$$ (5.33) Fig. 
5.17 is the result of a simulation of Eqs. 5.28, 5.29, 5.30, and 5.31, representing the model according to the parameter values highlighted throughout this work. Table 5.14 lists the range of data used in the simulation. The range for the transceiver is based on data available in the literature; *e.g.*, the static power dissipated by the front-end is estimated at 480 nW to 170 $\mu$W as a component of the circuit power (HAAPALA; HALONEN, 2018). The oscillations as the number of nodes increases (x-axis) are due to the random combination of the TX values of each node (some nodes are actively transmitting while others are in the hibernation stage). An asymptotic plot shows the increase of the total energy curves, which follow a linear trend.

Table 5.14 – Simulation parameters for the model.

| Parameter | Notation | Simulation Value |
|---|---|---|
| Receiver energy spent on the communication | $E_{RX}^{Com}$ (J) | 1E-4 |
| Transmitter energy spent on the communication | $E_{TX}^{j,Com}$ (J) | 1E-3 |
| Setup time | $T$ (s) | 1E-3 |
| Transmitter Power (max.) | $P_{Tx}$ (W) | 53.64E-3 |
| Receiver Power (max.) | $P_{Rx}$ (W) | 9.93E-3 |
| Energy to set up the communication in the Tx | $E_{Tx}^{setup}$ (J) | $E_{TX}^{j,Com}$/500 |
| Energy to set up the Tx circuit | $E_{Tx}^{setup-circ}$ (J) | $E_{Tx}^{circ}$/500 |
| Total energy of the processing circuit | $E_T^{Proc}$ (J) | 54.75E-5 ~ 39.29E-3 |
| Transmitter energy spent in digital processing | $E_{TX}^{Proc}$ (J) | 54.64E-6 |
| Receiver energy spent in digital processing | $E_{RX}^{i,Proc}$ (J) | 12.93E-6 |
| Encoder Energy | $E_{enc}$ (J) | 1E-6 |
| Decoder Energy | $E_{dec}$ (J) | 3E-6 |
| Number of Nodes | $N_{nodes}$ | 1 ~ 300 |
| Total energy in the communication | $E_T^{Com}$ (J) | 0.001 ~ 0.031 |

Source: The author.

## 5.6 Summary

For the development of the Cross-Layer Energy Model presented in this Chapter, the connection between the physical (PHY) and MAC layers is exceptionally relevant, so much so that the study is mainly concerned with the impact of one on the other. The protocol connecting these two layers is therefore the determining factor of the efficiency and effectiveness of the communications. In this context, the link budget and the overview of the results collected in recent years fill the existing gap to obtain the final result. For instance, 175 pJ per bit is a reasonable value for the design, considering the energy values found in the literature. Regarding energy consumption, the off-the-shelf commercial chips presented in this Chapter have much higher energy levels, which vary from 17 to 840 nJ per bit. The power of the transceiver front-ends varies from 1.7 mW to 53.64 mW. Notably, the transmitter front-end consumes more power than the receiver front-end.
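As a rough illustration of the network-level model of this Chapter (Eqs. 5.28 to 5.31), the sketch below evaluates the four totals in Python with the values of Table 5.14. It is not the simulation behind Fig. 5.17. The function names are hypothetical, and several simplifications are assumptions of this sketch: all nodes are identical and active (no random hibernation), the processing setup terms are taken as maximum power times setup time, and the communication setup term is taken as $E_{TX}^{j,Com}/500$.

```python
"""Illustrative sketch of the energy totals in Eqs. 5.28 to 5.31."""
from typing import Sequence

# Values taken from Table 5.14.
E_RX_COM = 1e-4       # receiver energy spent on the communication (J)
E_TX_COM = 1e-3       # transmitter energy spent on the communication, per node (J)
T_SETUP = 1e-3        # setup time (s)
P_TX_MAX = 53.64e-3   # maximum transmitter power (W)
P_RX_MAX = 9.93e-3    # maximum receiver power (W)
E_TX_PROC = 54.64e-6  # transmitter energy spent in digital processing (J)
E_RX_PROC = 12.93e-6  # receiver energy spent in digital processing, per node (J)


def processing_energy(e_tx_proc: float, e_rx_proc: Sequence[float]) -> float:
    """Eq. 5.28: transmitter term plus the sum over the m nodes."""
    return e_tx_proc + sum(e_rx_proc)


def processing_energy_with_setup(
    e_tx_proc: float,
    e_rx_proc: Sequence[float],
    setup_tx: float,
    setup_rx: Sequence[float],
) -> float:
    """Eq. 5.29: Eq. 5.28 plus the (P.T) setup terms."""
    if len(e_rx_proc) != len(setup_rx):
        raise ValueError("one setup term is needed per node")
    return e_tx_proc + setup_tx + sum(e + s for e, s in zip(e_rx_proc, setup_rx))


def communication_energy(
    e_rx_com: float, e_tx_com: Sequence[float], setup_com: float = 0.0
) -> float:
    """Eq. 5.30 when setup_com is 0, Eq. 5.31 otherwise."""
    return e_rx_com + sum(e_tx_com) + setup_com


if __name__ == "__main__":
    m = 10  # hypothetical number of nodes, all assumed active
    e_rx = [E_RX_PROC] * m
    # Assumption: setup energy = maximum power times setup time.
    setup_rx = [P_RX_MAX * T_SETUP] * m
    setup_tx = P_TX_MAX * T_SETUP
    # Assumption: communication setup energy = E_TX_COM / 500.
    setup_com = E_TX_COM / 500

    print("Eq. 5.28:", processing_energy(E_TX_PROC, e_rx))
    print("Eq. 5.29:", processing_energy_with_setup(E_TX_PROC, e_rx, setup_tx, setup_rx))
    print("Eq. 5.30:", communication_energy(E_RX_COM, [E_TX_COM] * m))
    print("Eq. 5.31:", communication_energy(E_RX_COM, [E_TX_COM] * m, setup_com))
```

Passing per-node sequences rather than a single scalar keeps the door open for drawing a random subset of active nodes, which is what produces the oscillations reported for the simulated curves.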
The overall consumption in the receiver is higher, since the baseband functions and the time required for synchronization and for setting up the reception functions add a significant expenditure of energy. These RX functions shift the balance of energy towards the receiver when all aspects of communication are considered. At the end of this Chapter, an improved model was proposed, with different model parameters drawn from real IC design data. This proposal is directly linked to the recommendations of previous researchers and adds new components to the existing research topics. The results presented in this Chapter constitute the convergence of the information presented throughout the thesis. Finally, a new model of the network is shown, which calls for further research on this subject.

#### 6 CONCLUSIONS AND FUTURE WORK

This thesis addresses application-oriented digital IC design in CMOS for WBANs, aided by the specifications of telecommunication systems. Moreover, at the system level, an interrelated multilayer energy model, the IR-UWB cross-layer energy model, is also presented. The work focuses on specific components of this system, such as the waveform, the FEC decoders, and the link energy needed to achieve effective communication with the IR-UWB radio. The energy consumption in the communication processes depends on the medium, the transceiver hardware, and the algorithms used to process the data, all of which were presented throughout this work.

The results of the previous Chapters aim to answer the questions presented in the motivation section of Chapter 1. The main findings and results of this Thesis can be summarized as follows:

1. The determination of the best pulse waveforms for IR-UWB in energy-efficient communication.
2. The use of Pulse Position Modulation for the IR was established as an energy-favored modulation to determine the best IR-UWB system performance.
3.
The forward error correction (FEC) block was investigated through VHDL synthesis to estimate the power for different code rates.
4. An estimate of the Hub-Node communication was made, related to the link budget.
5. Based on the existing benchmark, the values of energy per bit/pulse of the baseband component were reviewed.
6. A system of equations proposing the Cross-Layer Energy Model of IR-UWB for Short-Range Communication was presented, as a method of convergence for the information presented in the thesis.

Therefore, this thesis investigates many aspects and parameters that can improve the energy model, based on implementations and concepts from the PHY and MAC levels, in order to obtain a model compatible with IR-UWB in a WBAN application. One of the objectives was to verify its effectiveness and efficiency, as well as to incorporate contributions from specific microelectronics techniques into the model. Finally, the perspective is towards a new energy-efficient model, in which the fundamental aspects of the pulse waveform and the types of modulation, together with the benchmark of the power budget, are steered towards the concepts necessary for the development of the digital design.

#### 6.1 Conclusions

This thesis investigated an energy efficiency model for WBANs operating with IR-UWB. The results achieved allow some conclusions regarding the modulation, pulse techniques, processing, and coding for the communication systems, as follows.

#### 6.1.1 Pulse Shapes and Modulation Techniques for IR-UWB in WBANs

Three UWB pulse shapes (Gaussian, PSWF, and Square) and their impact on the energy consumption were analyzed and compared. It was possible to conclude that the PSWF shape has a power/energy effectiveness that warrants its use in IR-UWB systems, considering its spectral occupancy. The pulse waveform directly impacts the energy consumption and the $E_{eff}$ of the transmission.
The modulation or coding technique in which one bit is equivalent to one pulse, that is, PPM with "M" equal to "1", is the worst case in terms of energy efficiency. Moreover, PPM with "M" equal to "3" or greater is the best choice for an energy-efficient IR-UWB transmitter. PPM orders higher than "3" pose a problem of pulse jitter and impose stringent requirements on pulse resolution, which tend to offset the benefits of a higher $E_{eff}$ in the Tx/Rx channel. The presented analysis focuses on the energy used in the inherent processing; a full model needs to consider the energy of the RF link (the link budget), which raises the power level. For an enhanced link, it is necessary to use the most effective pulse (PSWF or PSWF-like) to model the simplest and most energy-efficient ($E_{eff}$) modulation for the WBAN communication. Improvements in energy efficiency depend on the data payload and the required data rates, on the spectral efficiency, and on the quality of the channel, which is embedded in the model as the signal-to-noise ratio. This evaluation can contribute to future work on energy efficiency models for IR-UWB based WBANs.

#### 6.1.2 Comparison of BCH and QC-LDPC Decoders

Power efficiency is the critical factor that drives the development of integrated circuits, especially of tiny sensors that communicate wirelessly and have limited energy sources that are not easily recharged or replaced. Therefore, integrating FEC codes into transceivers is essential to improve communication reliability and to reduce the number of retransmissions due to unrecoverable data corruption. This work discussed and compared different FEC codes applied to the WBAN communication environment. In this context, the hardware implementation QoR of the BCH codes defined in IEEE 802.15.6 and of channel capacity-achieving QC-LDPC codes was analyzed for a commercial 65 nm CMOS technology.
The former provides a lightweight design that is efficient in both energy and area, especially for higher code rates, using only 203 gates for rate 0.81. The latter, however, provides a robust approach with efficient error correction, at a higher cost in both area and power. Despite the advantages of LDPC codes, they are not yet suitable for WBAN applications, especially when the application targets low data rates. The inherent robustness of LDPC is only advisable in very noisy environments and over long distances between Tx and Rx, which is not the case for most short-range WBAN links.

#### 6.1.3 Cross-layer Energy Model

Considering the entire scope of this work, the constraints involved in reaching the high-level energy model, and the steps necessary to obtain the results, there is a direct interdependence between the physical layer (PHY) of the system and the MAC layer. Consequently, better performance of the components of this layer also improves the overall quality of the communications, which translates into energy efficiency, since the cost of subsequent retransmissions is mitigated or, in the best case, avoided (representing total success in the communication process).

Briefly, the cross-layer energy model of IR-UWB for short-range communication systems depends on a set of multiple variables. The parameters must be fully adapted to the conditions of the environment and of the application, so that the model is effective and optimized. The following propositions of the model, discussed in this thesis, are highlighted for energy improvements in the system:

1. The use of the $7^{th}$ derivative equation to produce the PSWF wave.
2. PPM with "M" equal to "3" or greater.
3. BCH coding and decoding for a system with bearable SNR conditions; otherwise, the use of a more robust FEC, *e.g.*, the improved LDPC CODEC.
4.
The use of WUR mode in the communication system, with Type II HARQ, and a specific algorithm for WBAN.
5. The specialized literature gives the reference values for Transmitter Power and Sensitivity, and for equivalent classes, characterizing the state of the art. From them, it is possible to set the thresholds, operating margins, and physical constraints for the optimized model.
6. It is noted throughout this work that the MAC depends on the PHY layer. Communication spends more energy than the processing stage, yet the latter significantly affects the energy gain of the overall system.
7. A more efficient system, faithfully represented by the model, can be obtained by coordinating the premises highlighted above, such as hardware in a scaled CMOS technology node (45 nm or less) with the lowest energy consumption, a FEC CODEC adaptable to the sensors, the appropriate PSWF, and the proper adjustment of the code rate.

Regarding the improvement in energy performance due to the duty-cycling approach and wake-up radios, it is observed that optimizing the algorithm yields better energy savings. That is, there is an opportunity for research on MAC procedures to improve productivity and energy efficiency by combining these two techniques with others, such as Machine Learning or automatic reconfiguration that adapts to the medium and the communication channel.

#### 6.2 Future Works

Several options can be listed as a continuation in future works, at the system level or in the microelectronics and data communication subgroups. Some suggestions for development in these fields are as follows:

- Confirm that the PPM approach outperforms other modulation techniques (DBPSK, DQPSK, DPSK, and FM-UWB) in terms of energy consumption.
- Evaluate the computational complexity related to the energy spent in each FEC CODEC.
This involves assigning to each algorithm the level of complexity that influences the power associated with its hardware or software implementation.
- Implement other FEC types, such as Convolutional codes or some kinds of Turbo Codes.
- Compare a full range of variations in the system and how each one affects the energy model, establishing the trade-offs in the energy level needed to accomplish the communication task.
- Work with all modules of the IR-UWB radio and of the system, determining precisely the consumption of each module due to its hardware, and carrying out the IC design.
- Check the performance of the IR-UWB (as a wake-up radio) and the operating conditions according to the environment by evaluating the paths available in the channel.
- Improve the energy efficiency model for the WBAN, either by using low-power microelectronic techniques or by tailoring the model to a very specific application.
- Define the amount of energy the WBAN can save in order to run during long-term operations, for example, when IR-UWB is used for WBANs in a smart city or IoT context.
- Investigate novel LDPC implementations in an IC design and fabricate them, adjusting the design to the low-performance and restricted-area requirements of an implantable sensor.
- Reduce the area and power of the LDPC by new techniques such as FinFET at a decananometer scale.

#### **REFERENCES**

- AKKAŞLI, C. **Methods for Path loss Prediction MSI Report 09067**. SE-351 95 Sweden, 2009.
- AKYILDIZ, I. F.; VURAN, M. C. Error Control in Wireless Sensor Networks: A Cross Layer Analysis. [S.l.: s.n.], 2009.
- AKYILDIZ, I. F.; VURAN, M. C. Wireless Sensor Networks. [S.l.]: John Wiley & Sons Ltd, 2010. ISBN 978-0-470-03601-3.
- ALVARADO, U.; BISTUÉ, G.; ADÍN, I. Low Power RF Circuit Design in Standard CMOS Technology. [S.l.]: Springer-Verlag Berlin Heidelberg, 2011. ISBN 978-3-642-22986-2.
- AMMER, J.; RABAEY, J. The energy-per-useful-bit metric for evaluating and optimizing sensor network physical layers.
**Proceedings of the IEEE**, Sensor and Ad Hoc Communications and Networks, 2006. SECON '06. 2006 3rd Annual IEEE Communications Society on, v. 2, p. 695–700, Sept 2006. - ANTREICH, F. Array Processing and Signal Design for Timing Synchronization. Thesis (PhD) Technische Universität München Lehrstuhl für Netzwerktheorie und Signalverarbeitung, 2011. - ANTREICH, F.; NOSSEK, J. A. Optimum chip pulse shape design for timing synchronization. **Proceedings of the IEEE**, Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on, p. 3524–3527, May 2011. - AYERS, J.; MAYARAM, K.; FIEZ, T. S. An ultralow-power receiver for wireless sensor networks. **IEEE Journal of Solid-State Circuits**, v. 45, n. 9, p. 1759–1769, Sept 2010. - AYERS, J. et al. A 2.4GHz wireless transceiver with 0.95nJ/b link energy for multi-hop battery-freewireless sensor networks. In: 2010 SYMPOSIUM ON VLSI CIRCUITS. **Proceedings...** [S.l.], 2010. p. 29–30. - BARRAS, D. D. A Low-power Impulse Radio Ultra-wideband CMOS Radio-frequency Transceiver. Thesis (PhD) ETH ZURICH, 2010. - BI, S.; HO, C. K.; ZHANG, R. Wireless powered communication: Opportunities and challenges. **Energy Harvesting Communications, IEEE Communications Magazine**, April 2015. 3GPP Forum Report TS GC5. - BIROLI, A. D. G.; MARTINA, M.; MASERA, G. An LDPC Decoder Architecture for Wireless Sensor Network Applications. **Sensors**, v. 12, n. 12, p. 1529–1543, feb 2012. - BROWN, J. K. et al. An ultra-low-power 9.8 GHz crystal-less UWB transceiver with digital baseband integrated in 0.18 $\mu$ m BiCMOS. In: 2013 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE DIGEST OF TECHNICAL PAPERS. **Proceedings...** [S.l.], 2013. p. 442–443. - CARLSON, A.; CRILLY, P. Communication System. Tata McGraw-Hill Education, 2010. ISBN 9780071321174. 
Available from Internet: <https://books.google.com.br/books?id=_B0DaELen98C>. - CHANDRAKASAN, A.
P.; BRODERSEN, R. W. Low Power Digital CMOS Design. Norwell, MA, USA: Kluwer Academic Publishers, 1995. ISBN 079239576X. - CHANDRAKASAN, A. P.; VERMA, N.; DALY, D. C. Ultralow-power electronics for biomedical applications. **The Annual Review of Biomedical Engineering is online at bio-eng**, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA, n. 10, 2008. - CHEN, F. et al. 9.3 A 1mW 1Mb/s 7.75-to-8.25GHz chirp-UWB transceiver with low peak-power transmission and fast synchronization capability. In: 2014 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE DIGEST OF TECHNICAL PAPERS (ISSCC). **Proceedings...** [S.1.], 2014. p. 162–163. - CHEN, M. et al. Body area networks: A survey. **Mobile Networks and Applications**, Springer US, v. 16, n. 2, p. 171–193, 2011. - CHIEN, R. Cyclic decoding procedures for bose- chaudhuri-hocquenghem codes. **IEEE Transactions on Information Theory**, v. 10, n. 4, p. 357–363, Oct 1964. - CHÉTELAT, O. et al. New biosensors and wearables for cardiorespiratory telemonitoring. In: 2016 IEEE-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS (BHI). **Proceedings...** [S.l.], 2016. p. 481–484. - COOK, B. W. et al. Low-Power 2.4-GHz Transceiver With Passive RX Front-End and 400-mV Supply. **IEEE Journal of Solid-State Circuits**, v. 41, n. 12, p. 2757–2766, Dec 2006. - CREPALDI, M. et al. An Ultra-Wideband Impulse-Radio Transceiver Chipset Using Synchronized-OOK Modulation. **IEEE Journal of Solid-State Circuits**, v. 46, n. 10, p. 2284–2299, Oct 2011. - CREPALDI, M. et al. A 0.07 mm<sup>2</sup> Asynchronous Logic CMOS Pulsed Receiver Based on Radio Events Self-Synchronization. **IEEE Transactions on Circuits and Systems**, v. 61, n. 3, p. 750–763, March 2014. - DALY, D. C.; CHANDRAKASAN, A. P. Energy efficient OOK transceiver for wireless sensor networks. In: IEEE RADIO FREQUENCY INTEGRATED CIRCUITS (RFIC) SYMPOSIUM, 2006. 
**Proceedings...** [S.l.], 2006. p. 4 pp.—. - DE ÁVILA, L. A. et al. Energy efficiency evaluation of the pulse shapes and modulation techniques for IR-UWB in WBANs. In: 2016 IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS). **Proceedings...** [S.l.], 2016. p. 173–176. - DILL, R. **Optimization of Multi-Channel BCH Error Decoding for Common Cases**. Thesis (Master Thesis) Arizona State University, Phoenix, AZ, USA, 2015. - EBRAZEH, A.; MOHSENI, P. An all-digital IR-UWB transmitter with a waveform-synthesis pulse generator in 90nm CMOS for high-density brain monitoring. In: 2013 IEEE RADIO FREQUENCY INTEGRATED CIRCUITS SYMPOSIUM (RFIC). **Proceedings...** [S.l.], 2013. p. 13–16. - ECMA, I. ECMA-368, High Rate Ultra Wideband PHY and MAC Standard. 2008. - ECMA, I. ECMA-369, MAC-PHY Interface for ECMA-368. 2008. - EMAMI, S. **UWB Communication Systems: Conventional and 60 GHz Principles, Design and Standards**. [S.l.: s.n.], 2013. ISBN 978-1-4614-6752-6. - ERKIP, E. et al. Energy efficient coding and transmission. **Vehicular Technology Conference VTC 2001, IEEE VTS 53rd**, Spring, v. 2, 2001. - ETSI, E. T. S. I. 650, Route des Lucioles- 06921, Sophia-Antipolis Cedex FRANCE: [s.n.], 2015. Available from Internet: <a href="http://www.etsi.org/">http://www.etsi.org/</a>>. - FCC. Federal Communications Commission (FCC), Revision of Part 15 of the Commissions Rules Regarding Ultra Wideband Transmission Systems, First Report and Order. [S.l.: s.n.], 2002. - FERNANDES, J. R.; WENTZLOFF, D. Recent advances in ir-uwb transceivers: An overview. In: PROCEEDINGS OF 2010 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS. **Proceedings...** [S.l.], 2010. p. 3284–3287. - FONTANA, R. J.; RICHLEY, E.; BARNEY, J. Commercialization of an ultra wideband precision asset location system. In: IEEE CONFERENCE ON ULTRA WIDEBAND SYSTEMS AND TECHNOLOGIES, 2003. **Proceedings...** [S.1.], 2003. p. 369–373. - GALLAGER, R. B. Low-density parity-check codes. 
**IRE Transactions on Information Theory**, v. 8, n. 1, p. 21–28, Jan 1962. - GAMBINI, S. et al. A Fully Integrated, 290 pJ/bit UWB Dual-Mode Transceiver for cm-Range Wireless Interconnects. **Solid-State Circuits, IEEE Journal of**, v. 47, n. 3, p. 586–598, March 2012. - GAO, Y. et al. Low-power ultrawideband wireless telemetry transceiver for medical sensor applications. **IEEE Transactions on Biomedical Engineering**, v. 58, n. 3, p. 768–772, March 2011. - GENG, S. et al. 9.2 A 13.3mW 500Mb/s IR-UWB transceiver with link-margin enhancement technique for meter-range communications. In: 2014 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE DIGEST OF TECHNICAL PAPERS (ISSCC). **Proceedings...** [S.1.], 2014. p. 160–161. - GEROSA, A. et al. An energy-detector for noncoherent impulse-radio uwb receivers. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 56, n. 5, p. 1030–1040, May 2009. - GHASEMPOUR, M. et al. Ultra-low power transmitter. **Proceedings of the IEEE**, International Symposium on Circuits and Systems (ISCAS), 2012 IEEE, p. 1807–1810, May 2012. - GHAVAMI, M.; MICHAEL, L. B.; KOHNO, R. **Ultra Wideband Signals and Systems in Communication Engineering**. [S.l.]: John Wiley & Sons, Ltd, 2004. ISBN 0-470-86751-5. - GIMENO, C.; FLANDRE, D.; BOL, D. Analysis and Specification of an IR-UWB Transceiver for High-Speed Chip-to-Chip Communication in a Server Chassis. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 65, n. 6, p. 2015–2023, June 2018. - GRACIE, K.; HAMON, M. Turbo and turbo-like codes: Principles and applications in telecommunications. **Proceedings of the IEEE**, v. 95, n. 6, p. 1228–1254, June 2007. - HAAPALA, T.; HALONEN, K. A Fully Integrated Digitally Programmable Pulse Shaping 6.0-8.5 GHz UWB IR Transmitter Front-End for Energy Harvesting Applications. In: 2018 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS). **Proceedings...** [S.l.], 2018. p. 1–5. - HAN, S.-C. 
A Flexible Decoder and Performance Evaluation Array-Structured LDPC Codes. Thesis (PhD Thesis) Carnegie Mellon University, 2007. - HE, J.; ZHANG, Y. P. A CMOS Ultra-Wideband Impulse Radio Transceiver for Interchip Wireless Communications. In: 2007 IEEE INTERNATIONAL CONFERENCE ON ULTRA-WIDEBAND. **Proceedings...** [S.1.], 2007. p. 626–631. - HELLEPUTTE, N. V.; GIELEN, G. A 70 pJ/Pulse Analog Front-End in 130 nm CMOS for UWB Impulse Radio Receivers. **IEEE Journal of Solid-State Circuits**, v. 44, n. 7, p. 1862–1871, July 2009. - HELLEPUTTE, N. V. et al. A Reconfigurable, 130 nm CMOS 108 pJ/pulse, Fully Integrated IR-UWB Receiver for Communication and Precise Ranging. **IEEE Journal of Solid-State Circuits**, v. 45, n. 1, p. 69–83, Jan 2010. - HERINGER, L. C. **Desempenho e Complexidade de Sistemas DS-UWB em Canais Multipercursos Densos**. master Universidade Estadual de Londrina Departamento de Engenharia Elétrica, 2007. - HU, C. et al. A 90 nm-CMOS, 500 Mbps, 3–5 GHz Fully-Integrated IR-UWB Transceiver With Multipath Equalization Using Pulse Injection-Locking for Receiver Phase Synchronization. **IEEE Journal of Solid-State Circuits**, v. 46, n. 5, p. 1076–1088, May 2011. - HU, J. et al. Energy efficient, reconfigurable, distributed pulse generation and detection in uwb impulse radios. In: ICUWB 2009. IEEE INTERNATIONAL CONFERENCE ON ULTRA-WIDEBAND. **Proceedings...** [S.l.], 2009. p. 773–777. - HUANG, K. et al. An Ultra-Low-Power 9.8 GHz Crystal-Less UWB Transceiver With Digital Baseband Integrated in 0.18 $\mu$ m BiCMOS. **IEEE Journal of Solid-State Circuits**, v. 48, n. 12, p. 3178–3189, Dec 2013. - IEEE. Std 802.15.6 IEEE Standard for Local and metropolitan area networks Part 15.6: Wireless Body Area Networks. 2012. - IIDA, S. et al. A 3.1 to 5 GHz CMOS DSSS UWB transceiver for WPANs. In: ISSCC. 2005 IEEE INTERNATIONAL DIGEST OF TECHNICAL PAPERS. SOLID-STATE CIRCUITS CONFERENCE, 2005. **Proceedings...** [S.l.], 2005. p. 214–594 Vol. 1. - ISO. 
Reference Model of Open Systems Interconnection; Part 1, Basic Reference Model (incorporating connectionless-mode transmission). 2 Park Street, London W1A 2BS, UK, 1988. BS 6568 : Part 1 : 1988 (≡ ISO 7498–1984 including Amendment 1). - ISSA, D. B. et al. Reconfigurable uwb transceiver for biomedical sensor application. **BioNanoScience**, v. 7, n. 1, p. 11–25, Mar 2017. Available from Internet: <a href="https://doi.org/10.1007/s12668-016-0384-9">https://doi.org/10.1007/s12668-016-0384-9</a>>. - JAMRO, E. **The Design of a VHDL Based Synthesis Tool for BCH CODECS**. Thesis (Master Thesis) university of Huddersfiel, 1997. - JOHANNESSON, R.; ZIGANGIROV, K. S. Fundamentals of convolutional coding. In: \_\_\_\_\_. Fundamentals of Convolutional Coding. IEEE, 2015. Available from Internet: <a href="https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7199859">https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7199859</a>>. - KARVONEN, H. Energy Efficiency Improvements for Wireless Sensor Networks by Using Cross-Layer Analysis. Thesis (PhD Thesis) Centre for Wireless Communications. University of Oulu, Oulu, Finland., 2015. - KARVONEN, H.; IINATTI, J.; HÄMÄLÄINEN, M. Energy efficiency optimization for IR-UWB WBAN based on the IEEE 802.15.6 Standard. **Proceedings of the ACM**, BodyNets 2013, 8th International Conference on Body Area Networks, Oulu, Finland., p. 575–580, 2013. - KARVONEN, H.; IINATTI, J.; HÄMÄLÄINEN, M. A cross-layer energy efficiency optimization model for WBAN using IR-UWB transceivers. **Telecommunication Systems**, Springer US, v. 58, n. 2, p. 165–177, 2015. Available from Internet: <a href="http://dx.doi.org/10.1007/s11235-014-9900-9">http://dx.doi.org/10.1007/s11235-014-9900-9</a>>. - KUNST, R. An error injector for evaluation of performance of Channel Coding on IEEE **802.16 networks**. master UFRGS, 2009. Master Thesis. - KUNST, R. et al. Improving Network Resources Allocation in Smart Cities Video Surveillance. 
**Computer Networks**, v. 134, p. 228–244, 2018. Available from Internet: <http://www.sciencedirect.com/science/article/pii/S1389128618300513>.

- KUO, F. W. et al. A Bluetooth low-energy (BLE) transceiver with TX/RX switchable on-chip matching network, 2.75mW high-IF discrete-time receiver, and 3.6mW all-digital transmitter. In: 2016 IEEE SYMPOSIUM ON VLSI CIRCUITS (VLSI-CIRCUITS). **Proceedings...** [S.l.], 2016. p. 1–2.
- LATRÉ, B. et al. A survey on wireless body area networks. **Wireless Networks**, Springer-Verlag New York, Inc., Secaucus, NJ, USA, v. 17, n. 1, p. 1–18, Jan. 2011. Available from Internet: <http://dx.doi.org/10.1007/s11276-010-0252-4>.
- LECOINTRE, A.; DRAGOMIRESCU, D.; PLANA, R. Design and hardware implementation of a reconfigurable mostly digital IR-UWB radio. **Romanian Journal of Information Science and Technology**, v. 11, n. 4, p. 295–318, 2008. University of Toulouse, LAAS-CNRS, France.
- LEE, F. S.; CHANDRAKASAN, A. P. A 2.5nJ/b 0.65V 3-to-5GHz Subbanded UWB Receiver in 90nm CMOS. In: 2007 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE. DIGEST OF TECHNICAL PAPERS. **Proceedings...** [S.l.], 2007. p. 116–590.
- LI, Z. et al. Efficient encoding of quasi-cyclic low-density parity-check codes. **GLOBECOM IEEE Glob. Telecommun. Conf.**, v. 3, n. 1, p. 1205–1210, 2005.
- LIN, S.; COSTELLO, D. J. Error control coding. 2. ed. Upper Saddle River: Pearson, 2004. 1272 p. ISBN 978-0130426727.
- MACKAY, D. Good error-correcting codes based on very sparse matrices. **IEEE Transactions on Information Theory**, v. 45, n. 2, p. 399–431, Mar. 1999. ISSN 00189448.
- MASSEY, J. Shift-register synthesis and BCH decoding. **IEEE Transactions on Information Theory**, v. 15, n. 1, p. 122–127, Jan. 1969.
- MEHRA, A. et al. A 0.32nJ/bit noncoherent UWB impulse radio transceiver with baseband synchronization and a fully digital transmitter. In: 2013 IEEE RADIO FREQUENCY INTEGRATED CIRCUITS SYMPOSIUM (RFIC). **Proceedings...** [S.l.], 2013. p. 17–20.
- MERCIER, P. P.; DALY, D. C.; CHANDRAKASAN, A. P. A 19pJ/pulse UWB transmitter with dual capacitively-coupled digital power amplifiers. In: 2008 IEEE RADIO FREQUENCY INTEGRATED CIRCUITS SYMPOSIUM. **Proceedings...** [S.l.], 2008. p. 47–50.
- MOREIRA, J. Castiñeira; FARRELL, P. G. Essentials of Error-Control Coding. Chichester, UK: John Wiley & Sons, Ltd, 2006.
- MOVASSAGHI, S. et al. Wireless body area networks: A survey. **Communications Surveys Tutorials, IEEE**, v. 16, n. 3, p. 1658–1686, third quarter 2014.
- NARENDRA, S.; CHANDRAKASAN, A. Leakage in Nanometer CMOS Technologies. [S.l.]: Springer US, 2006. (Integrated Circuits and Systems). ISBN 9780387281339.
- NEKOOGAR, F. **Ultra-wideband Communications: Fundamentals and Applications**. 1st ed. Upper Saddle River, NJ, USA: Prentice Hall Press, 2005. ISBN 0131463268.
- NETO, J. F.; MOREIRA, L. C.; NOIJE, W. A. M. V. Inductorless very small 4.6pJ/pulse 7th derivative pulse generator for IR-UWB. In: 2012 IEEE 3RD LATIN AMERICAN SYMPOSIUM ON CIRCUITS AND SYSTEMS (LASCAS). **Proceedings...** [S.l.], 2012. p. 1–4.
- NEVES, L. C. et al. Design of a PSWF impulse response filter for UWB systems. **Proceedings of the IEEE**, IEEE International Symposium on Circuits and Systems (ISCAS), 2012, p. 1935–1938, May 2012.
- NIEMELA, V.; HÄMÄLÄINEN, M.; IINATTI, J. On IEEE 802.15.6 UWB symbol length for energy detector receivers' performance with OOK and PPM. In: 2013 7TH INTERNATIONAL SYMPOSIUM ON MEDICAL INFORMATION AND COMMUNICATION TECHNOLOGY (ISMICT). **Proceedings...** [S.l.], 2013. p. 33–37.
- NODA, C. et al. On packet size and error correction optimisations in low-power wireless networks. **Proceedings of the IEEE**, Sensor, Mesh and Ad Hoc Communications and Networks (SECON), 2013 10th Annual IEEE Communications Society Conference on, p. 212–220, June 2013.
- NORIMATSU, T. et al. A UWB-IR Transmitter With Digitally Controlled Pulse Generator. **IEEE Journal of Solid-State Circuits**, v. 42, n. 6, p. 1300–1309, June 2007.
- OH, D. Low Complexity VLSI Architectures for LDPC Decoders. 158 p. Thesis (PhD), University of Minnesota, 2008.
- OPPERMANN, I.; HAMALAINEN, M.; IINATTI, J. **UWB Theory and Applications**. Oulu, Finland: John Wiley & Sons Ltd, 2004.
- OTIS, B.; CHEE, Y. H.; RABAEY, J. A 400 µW-RX, 1.6 mW-TX super-regenerative transceiver for wireless sensor networks. In: ISSCC. 2005 IEEE INTERNATIONAL DIGEST OF TECHNICAL PAPERS. SOLID-STATE CIRCUITS CONFERENCE, 2005. **Proceedings...** [S.l.], 2005. p. 396–606 Vol. 1.
- OTT, A.; EISNER, C.; EIBERT, T. A reconfigurable impulse radio transmitter. **Ultra-Wideband (ICUWB), 2012 IEEE International Conference on**, p. 26–30, Sept. 2012.
- PARK, Y. S. Energy-Efficient Decoders of Near-Capacity Channel Codes. Thesis (PhD), Electrical Engineering, The University of Michigan, 2014.
- PHAN, T.; KRIZHANOVSKII, V.; LEE, S. Low-power CMOS energy detection transceiver for UWB impulse radio system. In: 2007 IEEE CUSTOM INTEGRATED CIRCUITS CONFERENCE. **Proceedings...** [S.l.], 2007. p. 675–678.
- PLETCHER, N. M.; GAMBINI, S.; RABAEY, J. A 52 µW Wake-Up Receiver With −72 dBm Sensitivity Using an Uncertain-IF Architecture. **IEEE Journal of Solid-State Circuits**, v. 44, n. 1, p. 269–280, Jan. 2009.
- POTTIE, G. J.; KAISER, W. J. Wireless integrated network sensors. **Communications of the ACM**, ACM, New York, NY, USA, v. 43, n. 5, p. 51–58, May 2000.
- RABAEY, J. M.; CHANDRAKASAN, A.; NIKOLIC, B. **Digital Integrated Circuits**. 3rd. ed. Upper Saddle River, NJ, USA: Prentice Hall Press, 2008. ISBN 0132219107, 9780132219105.
- RAPPAPORT, T. S. Wireless Communications Principles and Practice. 2nd. ed. Upper Saddle River, NJ: Prentice Hall PTR, 2001.
- RICHARDSON, T.; URBANKE, R. Efficient encoding of low-density parity-check codes. **Information Theory, IEEE Transactions on**, v. 47, n. 2, p. 638–656, Feb. 2001.
- ROCHOL, J. Comunicação de Dados. [S.l.]: Bookman, 2012. ISBN 978-85-407-0053-6.
- RYAN, W. E.; LIN, S. Channel Codes: Classical and Modern. Cambridge: Cambridge University Press, 2009. 710 p. ISBN 9780511803253.
- RYCKAERT, J. et al. A 0.65-to-1.4nJ/burst 3-to-10 GHz UWB digital TX in 90nm CMOS for IEEE 802.15.4a. In: 2007 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE. DIGEST OF TECHNICAL PAPERS. **Proceedings...** [S.l.], 2007. p. 120–591.
- SAPUTRA, N. An FM-UWB Transceiver for Autonomous Wireless Systems. Thesis (PhD), Electrical Engineering, Technische Universiteit Delft, 2012.
- SECTION, I. T. U. R. 2015. Available from Internet: <http://www.itu.int/>.
- SHI, Y. et al. A 10 mm³ inductive coupling radio for syringe-implantable smart sensor nodes. **IEEE Journal of Solid-State Circuits**, v. 51, n. 11, p. 2570–2583, Nov. 2016.
- SILIGARIS, A. et al. A low power 60-GHz 2.2-Gbps UWB transceiver with integrated antennas for short range communications. In: 2013 IEEE RADIO FREQUENCY INTEGRATED CIRCUITS SYMPOSIUM (RFIC). **Proceedings...** [S.l.], 2013. p. 297–300.
- SLEPIAN, D.; POLLAK, H. O. Prolate spheroidal wave functions, Fourier analysis and uncertainty. **The Bell System Technical Journal**, v. 40, n. 1, p. 43–63, Jan. 1961.
- SMAINI, L. et al. Single-chip CMOS pulse generator for UWB systems. In: PROCEEDINGS OF THE 31ST EUROPEAN SOLID-STATE CIRCUITS CONFERENCE, 2005. ESSCIRC 2005. **Proceedings...** [S.l.], 2005. p. 271–274.
- SPARROW, O. R. et al. High rate UWB CMOS transceiver chipset for WBAN and biomedical applications. **Analog Integrated Circuits and Signal Processing**, v. 81, n. 1, p. 215–227, Oct. 2014. Available from Internet: <https://doi.org/10.1007/s10470-014-0369-y>.
- STANGHERLIN, K. H. Energy and Speed Exploration in Digital CMOS Circuits in the Near-threshold Regime for Very-Wide Voltage-Frequency Scaling. Thesis (Master), UFRGS, 2013.
- STOICA, L. et al. An ultra wideband TAG circuit transceiver architecture. **Proceedings...**, Ultra Wideband Systems, 2004. Joint with Conference on Ultrawideband Systems and Technologies. Joint UWBST IWUWBS. 2004 International Workshop on, p. 258–262, May 2004.
- STREEL, G. de et al. SleepTalker: A ULV 802.15.4a IR-UWB Transmitter SoC in 28-nm FDSOI Achieving 14 pJ/b at 27 Mb/s With Channel Selection Based on Adaptive FBB and Digitally Programmable Pulse Shaping. **IEEE Journal of Solid-State Circuits**, v. 52, n. 4, p. 1163–1177, April 2017.
- SUHONEN, J. et al. Low-Power Wireless Sensor Networks Protocols, Services and Applications. [S.l.]: Springer, 2012. (Springer Briefs in Electrical and Computer Engineering).
- TANNER, R. A recursive approach to low complexity codes. **IEEE Transactions on Information Theory**, v. 27, n. 5, p. 533–547, Sep. 1981.
- TUAN-ANH, P. et al. 4.7pJ/pulse 7th derivative Gaussian pulse generator for impulse radio UWB. **Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on**, p. 3043–3046, May 2007.
- ULLAH, S. et al. A comprehensive survey of wireless body area networks. **Journal of Medical Systems**, v. 36, n. 3, p. 1065–1094, 2012.
- UWB-FORUM, U. DS-UWB Physical Layer Submission to 802.15 Task Group 3a. 2004.
- VAUCHE, R. et al. A 100 MHz PRF IR-UWB CMOS Transceiver With Pulse Shaping Capabilities and Peak Voltage Detector. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 64, n. 6, p. 1612–1625, June 2017.
- VERHELST, M.; DEHAENE, W. Energy Scalable Radio Design for Pulsed UWB Communication and Ranging. [S.l.]: Springer-Verlag, 2009.
- VIGRAHAM, B.; KINGET, P. A self-duty-cycled and synchronized UWB receiver SoC consuming 375pJ/b for −76.5dBm sensitivity at 2Mb/s. In: 2013 IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE DIGEST OF TECHNICAL PAPERS. **Proceedings...** [S.l.], 2013. p. 444–445.
- WANG, Z.; CUI, Z. Low-complexity high-speed decoder design for quasi-cyclic LDPC codes. **IEEE Transactions on Very Large Scale Integration Systems**, v. 15, n. 1, p. 104–114, 2007.
- WANG, Z.; CUI, Z.; SHA, J. VLSI Design for Low-Density Parity-Check Code Decoding. **IEEE Circuits and Systems Magazine**, v. 11, n. 1, p. 52–69, 2011.
- WENTZLOFF, D. D. Pulse-Based Ultra-Wideband Transmitters for Digital Communication. Thesis (PhD), Massachusetts Institute of Technology, 2007.
- WENTZLOFF, D. D. **Ultra Low Power Radio Survey**. Thesis (PhD), University of Michigan, 2017. Available from Internet: <http://www.eecs.umich.edu/wics/low_power_radio_survey.html>.
- WENTZLOFF, D. D. et al. Energy Efficient Pulsed-UWB CMOS Circuits and Systems. In: 2007 IEEE INTERNATIONAL CONFERENCE ON ULTRA-WIDEBAND. **Proceedings...** [S.l.], 2007. p. 282–287.
- WONG, A. C. W. et al. A 1V 5mA Multimode IEEE 802.15.6/Bluetooth Low-Energy WBAN Transceiver for Biotelemetry Applications. **IEEE Journal of Solid-State Circuits**, v. 48, n. 1, p. 186–198, Jan. 2013.
- WU, Z.-H. et al. A low-complexity UWB RF receiver based on a novel low noise amplifier. **Microelectronics Journal**, v. 46, n. 8, p. 685–689, 2015. Available from Internet: <http://www.sciencedirect.com/science/article/pii/S0026269215001160>.
- YITBAREK, Y. H. Evaluation and Implementation of Error Control Coding Schemes in Industrial Wireless Sensor Networks. Thesis (Master), Chalmers University of Technology, Göteborg, Sweden, 2014.
- YOO, H.; LEE, Y.; PARK, I. C. Low-Power Parallel Chien Search Architecture Using a Two-Step Approach. **IEEE Transactions on Circuits and Systems II: Express Briefs**, v. 63, n. 3, p. 269–273, March 2016.
- ZHENG, Y. et al. A CMOS Carrier-less UWB Transceiver for WPAN Applications. In: 2006 IEEE INTERNATIONAL SOLID STATE CIRCUITS CONFERENCE DIGEST OF TECHNICAL PAPERS. **Proceedings...** [S.l.], 2006. p. 378–387.
- ZHENG, Y. et al. A 3.54 nJ/bit-RX, 0.671 nJ/bit-TX Burst Mode Super-Regenerative UWB Transceiver in 0.18-μm CMOS. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 61, n. 8, p. 2473–2481, Aug. 2014.
- ZWIRELLO, L.; WIESBECK, W.; ZWICK, T. On maximum signal amplitude, signal-to-noise ratio and filtering in impulse-UWB transmission. **Proceedings...**, Antennas and Propagation (EuCAP), 2014 8th European Conference on, p. 1624–1628, April 2014.

# APPENDIX A - PULSE WAVE SHAPE FOR THE $7^{TH}$ DERIVATIVE OF A GAUSSIAN PULSE

The following analysis was developed and included here because it is not found in the literature. It shows that the $7^{th}$ derivative presents useful characteristics in the UWB spectrum. In a way, it is comparable to the $20^{th}$ derivative in meeting the FCC mask restrictions.
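The coefficients of the derivation in this appendix can be cross-checked mechanically. The sketch below is only an illustration and not part of the original analysis: the function names are hypothetical, and it sets $\sigma = 1$ so that each derivative is written as $g^{(n)}(t) = P_n(t)\,g(t)$ with $P_{n+1}(t) = P_n'(t) - t\,P_n(t)$. For a general $\sigma$, the coefficient of $t^k$ in $P_n$ is simply divided by $\sigma^{n+k}$.

```python
def next_derivative(poly):
    """Return P_{n+1} = P_n' - t * P_n for sigma = 1.

    poly[k] is the coefficient of t**k in P_n, where g^(n)(t) = P_n(t) * g(t).
    """
    deriv = [k * poly[k] for k in range(1, len(poly))]  # P_n'
    shifted = [0] + poly                                # t * P_n
    size = max(len(deriv), len(shifted))
    deriv += [0] * (size - len(deriv))
    shifted += [0] * (size - len(shifted))
    return [d - s for d, s in zip(deriv, shifted)]


def gaussian_derivative_poly(n):
    """Pre-exponential polynomial of the n-th Gaussian derivative (sigma = 1)."""
    poly = [1]  # P_0 = 1
    for _ in range(n):
        poly = next_derivative(poly)
    return poly


if __name__ == "__main__":
    for order in (5, 6, 7):
        print(order, gaussian_derivative_poly(order))
```

Under these assumptions, the 5th-order polynomial carries the coefficients −15, 10, −1 and the 7th-order polynomial carries 105, −105, 21, −1 on the odd powers of $t$, in agreement with the pre-exponential terms of this appendix.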
Starting from the $5^{th}$ derivative, the $7^{th}$ derivative is obtained in the time domain as follows; the last equation factors the pre-exponential terms:

$$g^{V}(t) = \left(\frac{-15\,t + \frac{10\,t^{3}}{\sigma^{2}} - \frac{t^{5}}{\sigma^{4}}}{\sqrt{2\pi}\,\sigma^{7}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (1)

$$g^{VI}(t) = \left(-\frac{15}{\sqrt{2\pi}\,\sigma^{7}} + \frac{30\,t^{2}}{\sqrt{2\pi}\,\sigma^{9}} - \frac{5\,t^{4}}{\sqrt{2\pi}\,\sigma^{11}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}} + \left(\frac{15\,t^{2}}{\sqrt{2\pi}\,\sigma^{9}} - \frac{10\,t^{4}}{\sqrt{2\pi}\,\sigma^{11}} + \frac{t^{6}}{\sqrt{2\pi}\,\sigma^{13}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (2)

$$g^{VI}(t) = \left(-\frac{15}{\sqrt{2\pi}\,\sigma^{7}} + \frac{45\,t^{2}}{\sqrt{2\pi}\,\sigma^{9}} - \frac{15\,t^{4}}{\sqrt{2\pi}\,\sigma^{11}} + \frac{t^{6}}{\sqrt{2\pi}\,\sigma^{13}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (3)

$$g^{VI}(t) = \left(\frac{-15 + \frac{45\,t^{2}}{\sigma^{2}} - \frac{15\,t^{4}}{\sigma^{4}} + \frac{t^{6}}{\sigma^{6}}}{\sqrt{2\pi}\,\sigma^{7}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (4)

$$g^{VII}(t) = \left(\frac{90\,t}{\sqrt{2\pi}\,\sigma^{9}} - \frac{60\,t^{3}}{\sqrt{2\pi}\,\sigma^{11}} + \frac{6\,t^{5}}{\sqrt{2\pi}\,\sigma^{13}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}} + \left(\frac{15\,t}{\sqrt{2\pi}\,\sigma^{9}} - \frac{45\,t^{3}}{\sqrt{2\pi}\,\sigma^{11}} + \frac{15\,t^{5}}{\sqrt{2\pi}\,\sigma^{13}} - \frac{t^{7}}{\sqrt{2\pi}\,\sigma^{15}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (5)

$$g^{VII}(t) = \left(\frac{105\,t}{\sqrt{2\pi}\,\sigma^{9}} - \frac{105\,t^{3}}{\sqrt{2\pi}\,\sigma^{11}} + \frac{21\,t^{5}}{\sqrt{2\pi}\,\sigma^{13}} - \frac{t^{7}}{\sqrt{2\pi}\,\sigma^{15}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (6)

$$g^{VII}(t) = \left(\frac{105\,t - \frac{105\,t^{3}}{\sigma^{2}} + \frac{21\,t^{5}}{\sigma^{4}} - \frac{t^{7}}{\sigma^{6}}}{\sqrt{2\pi}\,\sigma^{9}}\right) e^{-\frac{t^{2}}{2\sigma^{2}}}$$ (7)

# APPENDIX B - GALOIS FIELD AND BCH BASIS

| Power | Polynomial | Vector | Power | Polynomial | Vector |
|---------------|------------|--------|---------------|------------|--------|
| 0 | 0 | (0 0 0 0 0 0) | $\alpha^{32}$ | $1 + \alpha^3$ | (1 0 0 1 0 0) |
| 1 | 1 | (1 0 0 0 0 0) | $\alpha^{33}$ | $\alpha + \alpha^4$ | (0 1 0 0 1 0) |
| $\alpha$ | $\alpha$ | (0 1 0 0 0 0) | $\alpha^{34}$ | $\alpha^2 + \alpha^5$ | (0 0 1 0 0 1) |
| $\alpha^2$ | $\alpha^2$ | (0 0 1 0 0 0) | $\alpha^{35}$ | $1 + \alpha + \alpha^3$ | (1 1 0 1 0 0) |
| $\alpha^3$ | $\alpha^3$ | (0 0 0 1 0 0) | $\alpha^{36}$ | $\alpha + \alpha^2 + \alpha^4$ | (0 1 1 0 1 0) |
| $\alpha^4$ | $\alpha^4$ | (0 0 0 0 1 0) | $\alpha^{37}$ | $\alpha^2 + \alpha^3 + \alpha^5$ | (0 0 1 1 0 1) |
| $\alpha^5$ | $\alpha^5$ | (0 0 0 0 0 1) | $\alpha^{38}$ | $1 + \alpha + \alpha^3 + \alpha^4$ | (1 1 0 1 1 0) |
| $\alpha^6$ | $1 + \alpha$ | (1 1 0 0 0 0) | $\alpha^{39}$ | $\alpha + \alpha^2 + \alpha^4 + \alpha^5$ | (0 1 1 0 1 1) |
| $\alpha^7$ | $\alpha + \alpha^2$ | (0 1 1 0 0 0) | $\alpha^{40}$ | $1 + \alpha + \alpha^2 + \alpha^3 + \alpha^5$ | (1 1 1 1 0 1) |
| $\alpha^8$ | $\alpha^2 + \alpha^3$ | (0 0 1 1 0 0) | $\alpha^{41}$ | $1 + \alpha^2 + \alpha^3 + \alpha^4$ | (1 0 1 1 1 0) |
| $\alpha^9$ | $\alpha^3 + \alpha^4$ | (0 0 0 1 1 0) | $\alpha^{42}$ | $\alpha + \alpha^3 + \alpha^4 + \alpha^5$ | (0 1 0 1 1 1) |
| $\alpha^{10}$ | $\alpha^4 + \alpha^5$ | (0 0 0 0 1 1) | $\alpha^{43}$ | $1 + \alpha + \alpha^2 + \alpha^4 + \alpha^5$ | (1 1 1 0 1 1) |
| $\alpha^{11}$ | $1 + \alpha + \alpha^5$ | (1 1 0 0 0 1) | $\alpha^{44}$ | $1 + \alpha^2 + \alpha^3 + \alpha^5$ | (1 0 1 1 0 1) |
| $\alpha^{12}$ | $1 + \alpha^2$ | (1 0 1 0 0 0) | $\alpha^{45}$ | $1 + \alpha^3 + \alpha^4$ | (1 0 0 1 1 0) |
| $\alpha^{13}$ | $\alpha + \alpha^3$ | (0 1 0 1 0 0) | $\alpha^{46}$ | $\alpha + \alpha^4 + \alpha^5$ | (0 1 0 0 1 1) |
| $\alpha^{14}$ | $\alpha^2 + \alpha^4$ | (0 0 1 0 1 0) | $\alpha^{47}$ | $1 + \alpha + \alpha^2 + \alpha^5$ | (1 1 1 0 0 1) |
| $\alpha^{15}$ | $\alpha^3 + \alpha^5$ | (0 0 0 1 0 1) | $\alpha^{48}$ | $1 + \alpha^2 + \alpha^3$ | (1 0 1 1 0 0) |
| $\alpha^{16}$ | $1 + \alpha + \alpha^4$ | (1 1 0 0 1 0) | $\alpha^{49}$ | $\alpha + \alpha^3 + \alpha^4$ | (0 1 0 1 1 0) |
| $\alpha^{17}$ | $\alpha + \alpha^2 + \alpha^5$ | (0 1 1 0 0 1) | $\alpha^{50}$ | $\alpha^2 + \alpha^4 + \alpha^5$ | (0 0 1 0 1 1) |
| $\alpha^{18}$ | $1 + \alpha + \alpha^2 + \alpha^3$ | (1 1 1 1 0 0) | $\alpha^{51}$ | $1 + \alpha + \alpha^3 + \alpha^5$ | (1 1 0 1 0 1) |
| $\alpha^{19}$ | $\alpha + \alpha^2 + \alpha^3 + \alpha^4$ | (0 1 1 1 1 0) | $\alpha^{52}$ | $1 + \alpha^2 + \alpha^4$ | (1 0 1 0 1 0) |
| $\alpha^{20}$ | $\alpha^2 + \alpha^3 + \alpha^4 + \alpha^5$ | (0 0 1 1 1 1) | $\alpha^{53}$ | $\alpha + \alpha^3 + \alpha^5$ | (0 1 0 1 0 1) |
| $\alpha^{21}$ | $1 + \alpha + \alpha^3 + \alpha^4 + \alpha^5$ | (1 1 0 1 1 1) | $\alpha^{54}$ | $1 + \alpha + \alpha^2 + \alpha^4$ | (1 1 1 0 1 0) |
| $\alpha^{22}$ | $1 + \alpha^2 + \alpha^4 + \alpha^5$ | (1 0 1 0 1 1) | $\alpha^{55}$ | $\alpha + \alpha^2 + \alpha^3 + \alpha^5$ | (0 1 1 1 0 1) |
| $\alpha^{23}$ | $1 + \alpha^3 + \alpha^5$ | (1 0 0 1 0 1) | $\alpha^{56}$ | $1 + \alpha + \alpha^2 + \alpha^3 + \alpha^4$ | (1 1 1 1 1 0) |
| $\alpha^{24}$ | $1 + \alpha^4$ | (1 0 0 0 1 0) | $\alpha^{57}$ | $\alpha + \alpha^2 + \alpha^3 + \alpha^4 + \alpha^5$ | (0 1 1 1 1 1) |
| $\alpha^{25}$ | $\alpha + \alpha^5$ | (0 1 0 0 0 1) | $\alpha^{58}$ | $1 + \alpha + \alpha^2 + \alpha^3 + \alpha^4 + \alpha^5$ | (1 1 1 1 1 1) |
| $\alpha^{26}$ | $1 + \alpha + \alpha^2$ | (1 1 1 0 0 0) | $\alpha^{59}$ | $1 + \alpha^2 + \alpha^3 + \alpha^4 + \alpha^5$ | (1 0 1 1 1 1) |
| $\alpha^{27}$ | $\alpha + \alpha^2 + \alpha^3$ | (0 1 1 1 0 0) | $\alpha^{60}$ | $1 + \alpha^3 + \alpha^4 + \alpha^5$ | (1 0 0 1 1 1) |
| $\alpha^{28}$ | $\alpha^2 + \alpha^3 + \alpha^4$ | (0 0 1 1 1 0) | $\alpha^{61}$ | $1 + \alpha^4 + \alpha^5$ | (1 0 0 0 1 1) |
| $\alpha^{29}$ | $\alpha^3 + \alpha^4 + \alpha^5$ | (0 0 0 1 1 1) | $\alpha^{62}$ | $1 + \alpha^5$ | (1 0 0 0 0 1) |
| $\alpha^{30}$ | $1 + \alpha + \alpha^4 + \alpha^5$ | (1 1 0 0 1 1) | $\alpha^{63}$ | 1 | - |
| $\alpha^{31}$ | $1 + \alpha^2 + \alpha^5$ | (1 0 1 0 0 1) | $\alpha^{64}$ | - | - |

#### APPENDIX C - DATA PACKETS ASSEMBLING

This Appendix shows how the data packets are formed, following the IEEE 802.15.6 Standard, and how they were used in some simulations and throughout this work. The flow to build the packets takes into account the length of their sub-frames. The UWB PPDU therefore includes as its components the SHR (126 bytes), the PHR (40 bits) and the PSDU (variable number of bytes). The PLCP preamble has 126 bytes from 8 times of Kasami sequences (each with 63 bytes). The PLCP header (PHR) is divided into 24 bits of physical frame, 4 bits of HCS and 12 bits of BCH parity. In turn, the PSDU is composed of the MPDU plus the BCH channel-code parity bits. The MPDU is formed by 7 bytes of MAC header, 0 to 255 bytes of MAC frame body (payload) and 2 bytes of FCS. Figure 1 shows the general distribution of the frames according to this communication protocol.

Figure 1 – MAC Frame Format for WBAN. Source: IEEE Std 802.15.6 (IEEE, 2012).

It is beyond the scope of this appendix to reproduce and discuss all the variants of superframe formats that exist across the modes and the connection procedures. As an illustration, Fig.
2 provides a representation of the time intervals, and of the energy spread over them, in such a frame organization.

Source: IEEE Std 802.15.6 (IEEE, 2012).

From (IEEE, 2012), the access phases shown in Figure 2 correspond to the following acronyms and functionalities:

- CAP: contention access phase
- EAP: exclusive access phase (EAP1 and EAP2)
- MAP: managed access phase
- RAP: random access phase (RAP1 and RAP2)

"The hub shall place the EAP1, RAP1, MAP, EAP2, RAP2, another MAP, and CAP. The hub may set to zero the length of any of these access phases, but shall not have RAP1 end before the guaranteed earliest time as communicated in Connection Assignment frames sent to nodes that are still connected with it. To provide a non-zero length CAP, the hub shall transmit a preceding B2 frame. The hub shall not transmit a B2 frame if the CAP that follows has a zero length, unless it needs to announce B2-aided time-sharing information and/or provide group acknowledgment. A node may obtain, and initiate frame transactions, in contended allocations in EAP1, RAP1, EAP2, RAP2, and CAP in any active superframe using CSMA/CA or slotted Aloha based random access." (IEEE, 2012).

Contention access is based either on carrier sense multiple access with collision avoidance (CSMA/CA) or on slotted Aloha access. The standard does not allow both at the same time. The CAP is a random access phase, "a time span set aside by a hub and announced via a preceding non-beacon frame for contention access to the medium by the nodes in the body area network (BAN) of the hub." Based on this, nodes can use the CAP to initiate one or more frame transactions, and the coordination of this process, through its algorithms, has an impact on the total energy consumed.

Table A.1 contains a summary of packet formation and of the bit budget for the solutions used in this work, based on the WBAN standard. This type of packet was used to assemble benchmarks for power and energy estimations in the modules designed or simply modeled.
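The MPDU sizes used in the benchmarks follow directly from the sub-frame lengths stated in this appendix (7 bytes of MAC header, 0 to 255 bytes of frame body, 2 bytes of FCS). The sketch below reproduces that arithmetic; the function names are hypothetical, and the default frame count of 56 is the factor adopted in this work for the simulated messages.

```python
MAC_HEADER_BYTES = 7   # MAC header length stated in this appendix
FCS_BYTES = 2          # frame check sequence length
MAX_BODY_BYTES = 255   # largest MAC frame body (payload)


def mpdu_bits(body_bytes):
    """Number of bits in one MPDU carrying body_bytes of payload."""
    if not 0 <= body_bytes <= MAX_BODY_BYTES:
        raise ValueError("frame body must have 0 to 255 bytes")
    return 8 * (MAC_HEADER_BYTES + body_bytes + FCS_BYTES)


def benchmark_bits(body_bytes, frames=56):
    """Total MPDU bits for a benchmark made of `frames` equal packets."""
    return frames * mpdu_bits(body_bytes)


if __name__ == "__main__":
    print(mpdu_bits(0), mpdu_bits(MAX_BODY_BYTES))
    print(benchmark_bits(0), benchmark_bits(MAX_BODY_BYTES))
```

The sketch counts MPDU bits only; the BCH parity, bit stuffing and preamble contributions are not modeled here.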
Table 1 (A.1) – Estimated number of bits in the packets.

| Packet | No. bytes | No. bits |
|---------------|-----------|----------|
| $MPDU_{MIN}$ | 9 | 72 |
| $MPDU_{MAX}$ | 264 | 2112 |
| $N_{CW}^1$ | - | 2 |
| $N_{CW}^2$ | - | 42 |
| bit stuffing | - | 30 |
| $N_{PSDU}^1$ | - | 102 |
| $N_{PSDU}^2$ | - | 2142 |
| PLCP preamble | - | 63 + 63 |
| $N_{PAD}$ | 0 | 0 |

Source: the author.

In order to have worst-case activity at the inputs, and a 50% proportion of zeros and ones in the message, this work simulates random text messages with a minimum of 4 kbit. Even considering the MAC header, statistically close to 50% of the bits are "0" and 50% are "1". As 72 bits is the minimum packet size, a factor of **56** was chosen to generate 4,032 bits, thus meeting the previous condition. These 56 frames are also used for the other payload lengths. This leads to a maximum of 118,272 bits, for 56 frames of 2,112 bits each, when 255 message bytes are transmitted.

#### APPENDIX D - OFDM IMPLEMENTED IN AN INTEGRATED CIRCUIT

The implementation of an Orthogonal Frequency Division Multiplexing (OFDM) system is presented as a complement to Section 2.1.2. It contains the modulation and demodulation blocks of an integrated circuit. For this design work, EDA tools from *Cadence (RTL Compiler and SoC Encounter)* are used for the logical and physical synthesis of CMOS circuits, so that the GDSII (Graphic Design System II) file is generated from the RTL level as the final result. The procedures and steps described in Section 2.3 were employed for this design, including clock gating for power reduction. This implementation is a way to investigate another UWB transceiver communication context and its corresponding energy use. It aims at a future work in which OFDM modulation is chosen, where the input data is transmitted over several parallel and orthogonal channels.
As the name indicates, this modulation multiplexes the data over orthogonal frequencies, with a digital modulation suited to each channel and each channel distinguished by a subcarrier at a different frequency. The purpose is to avoid interference between the subcarriers by preserving their orthogonality. Fig. 4 shows the sequence of steps of this part of the work. Logic synthesis and the subsequent physical synthesis follow these general steps: Logic Equivalence Checking (LEC), the formal verification of the netlist; circuit floorplanning; power planning, the distribution of the power grid over the floorplan; placement of the cells in the area; Clock Tree Synthesis (CTS); and routing with its optimization steps. The digital design was developed using standard cells of a 180 nm CMOS process. The VHDL was obtained from an open-source site, and the design was based on a previous FPGA implementation. The GDSII file was the final product, starting from the register transfer level (RTL) description.

This OFDM modulator/demodulator design can be used in UWB, where some adaptations are necessary to match the circuit to the complete design. On the other hand, an IR is much simpler and can be used as an intermediate stage in the design evolution. Whatever the choice for a future design, such a component will be divided into TX, AWGN channel, and RX blocks. OFDM has been used for high throughput, and its internal FFT methods bring efficiency advantages. The circuit was implemented using a 64-point radix-4 FFT with decimation and a coordinate rotation digital computer (CORDIC) for the butterfly method, generating a 4-QAM constellation for each channel. For a total of 16,629 standard cells in the ASIC, the estimated power consumption is 232 mW at a 100 MHz clock frequency. The total IC area of this design is around 1.5 mm².
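The signal flow of such a modulator and demodulator can be illustrated in software. The sketch below is not the implemented circuit: it replaces the radix-4 FFT and the CORDIC butterflies with a direct (slow) DFT, and the 4-QAM bit mapping and all function names are assumptions made only for this example. It maps two bits onto each of 64 subcarriers, builds one OFDM symbol with an inverse DFT, and recovers the bits with a forward DFT over an ideal, noiseless channel.

```python
import cmath

N_SUBCARRIERS = 64  # matches the 64-point transform mentioned in the text


def qam4_map(b0, b1):
    """Map two bits to a 4-QAM point (mapping assumed for this sketch)."""
    return complex(1 - 2 * b1, 1 - 2 * b0)


def qam4_demap(symbol):
    """Recover the two bits from the signs of the imaginary and real parts."""
    return (1 if symbol.imag < 0 else 0, 1 if symbol.real < 0 else 0)


def idft(freq):
    """Direct inverse DFT, used here instead of a radix-4 IFFT."""
    n = len(freq)
    return [sum(freq[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def dft(time):
    """Direct forward DFT, used here instead of a radix-4 FFT."""
    n = len(time)
    return [sum(time[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def ofdm_modulate(bits):
    """Turn 128 bits into one time-domain OFDM symbol of 64 samples."""
    if len(bits) != 2 * N_SUBCARRIERS:
        raise ValueError("expected 2 bits per subcarrier")
    symbols = [qam4_map(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]
    return idft(symbols)


def ofdm_demodulate(samples):
    """Recover the bits of one OFDM symbol (ideal channel, no noise)."""
    bits = []
    for symbol in dft(samples):
        bits.extend(qam4_demap(symbol))
    return bits
```

Because the subcarriers are orthogonal, the forward transform separates them exactly and the bit pattern is recovered unchanged when no noise is added.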
This design exemplifies that a very large reduction of power is necessary to employ this OFDM modulator in an IoT context or in the IR-WBAN context of this thesis. The modulator needs to be adjusted to work at lower frequencies for much lower data rates. A large margin of power reduction needs to be explored in future designs. Figure 3 shows the MB-OFDM distribution in five channels over the fourteen bands already mentioned, in the same UWB range (from 3.1 GHz to 10.6 GHz).

Figure 3 – MB-OFDM Bands distribution (amplitude versus frequency bands, in MHz). Source: (FCC, 2002).

Plots of some stages of the development of the OFDM modulator and demodulator as an IC:

Figure 4 – Initial Setup; Floorplanning; and Powerplanning. Source: the author.

Figure 5 – Routing; Verification Stage; and Final IC. Source: the author.

#### APPENDIX E - Modulations: OOK, PPM, FSK, PSK, and FM-UWB

The modulation block of the transmitter is discussed in this Appendix. Several modulation schemes can be implemented in it, each one with a different degree of energy efficiency. The main modulation schemes treated here are PPM, OOK, PAM, FSK, PSK, and FM-UWB. Data rate, bit error rate, and power consumption are some of the features to consider when the approach is oriented to a specific application design. In the case of WBANs, within the wider scope of a hierarchical Wireless Sensor Network (WSN) (e.g., an assembly of biomedical implants, actuators and sensors), each node has modulation or demodulation blocks to process the information.

In Figure 6, the transparent Gaussian pulse represents the zero value for the bit stream shown in the NRZ line. Table 2 has examples of pulse positions in time for the bits. Such a pulse shape is one option to implement in the system. As shown in Chapter 3, this choice of pulse is not the most efficient one in terms of energy, and it is used here for illustration. The modulation scheme has to be chosen so that synchronization is also energy efficient and easier to achieve.
Figure 6 – Modulation Types for a Binary Example. Source: the author.

The PPM modulation is further detailed in Figure 7, for M-ary (M = 1 is unary, M = 2 bits is binary, etc.) coding by the pulse position. The pulses represent the binary signal that is supposed to be fed into an IR-UWB transmitter. This short example shows the reduction in the number of UWB pulses as the degree M increases. Here M denotes the exponent in base two: e.g., for $2^3 = 8$, the M-ary degree is "3". The energy spent in this modulation and coding depends directly on this degree M. Again, the transparent pulse represents an optional pulse for a sequence of zeros.

Figure 7 – Modulation using PPM coding with M-ary, M from 1 to 4.

Table 2 – Positioning to convert in pulses over time.

| Binary data input: 011001001100 | **Corresponding relative position in time** |
|----------------------|------------------------------------------------|
| PPM M-ary = 1 | 0-1-1-0-0-1-0-0-1-1-0-0 |
| PPM M-ary = 2 | 01-10-01-00-11-00 |
| PPM M-ary = 3 | 011-001-001-100 |
| PPM M-ary = 4 | 0110-0100-1100 |

Source: the author.

As mentioned before, this study focuses on the modulation and demodulation techniques applicable to a WBAN. Energy efficiency is favored by a higher M. However, the higher the M, the higher the degree of position discrimination required from the hardware at the TX and the RX during operation. "To save power consumption, the electronic systems are expected to be heavily digitized" (CREPALDI et al., 2014). That idea leads to the concept of Software Defined Radio, which has some important characteristics as an ideal proposition, but is very far from practical implementation in a real communication system if it means directly digitizing the modulated RF signal at the antenna.
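The grouping of Table 2 can be reproduced with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and taking the pulse slot index as the binary value of each group is an assumption made for this example rather than a statement about the transmitter designed in this work.

```python
def ppm_groups(bits, m):
    """Split a bit string into consecutive groups of m bits (one pulse each)."""
    if m < 1 or len(bits) % m != 0:
        raise ValueError("bit string length must be a multiple of m")
    return [bits[i:i + m] for i in range(0, len(bits), m)]


def ppm_slots(bits, m):
    """Slot index of each pulse, assuming slot = binary value of the group."""
    return [int(group, 2) for group in ppm_groups(bits, m)]


if __name__ == "__main__":
    data = "011001001100"  # binary data input of Table 2
    for m in (1, 2, 3, 4):
        groups = ppm_groups(data, m)
        print(m, "-".join(groups), "pulses:", len(groups))
```

For the 12-bit input of Table 2, the number of groups, and therefore of pulses to transmit, falls from 12 at M = 1 to 3 at M = 4, while each pulse must be placed in one of $2^M$ possible positions.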
It is important to look at the consumption of energy inside the integrated circuit and across the layers of a system, which means that the energy analysis, in general, must not be constrained to the PHY layer or to any single layer. The main spectrum occupied by the pulses in the sequences of Fig. 7 lies in the Ultra-wideband (UWB) frequencies, which allow the modules of a modem operating from 3.1 up to 10 GHz to be designed in digital CMOS. UWB devices have limited penetration inside an aqueous environment such as the human body. Another limitation is a crowded place with many UWB WBANs and a lot of interference, where some aspects of cognition algorithms could be exploited as a solution.

# **APPENDIX F - FinFET Transistor Evolution**

The overview of FinFET technology presented in this Appendix was written in 2013, as an introduction to the new possibilities that this evolution in FET devices can bring. FinFETs have allowed CMOS miniaturization in commercial processes to proceed further from 2011 until today. This new technology advanced most strongly in the design of very dense digital systems. FinFETs made it possible to cope with the static power dissipation issues of CMOS technologies below 22 nm, to reduce geometries, and also to reduce the energy consumption of all digital ICs. The use of FinFETs in RF systems will depend on the increasing digitalization of most modules of the RF front-end, such as mixers, amplifiers and filters, which used to be implemented fully in the analog domain. That digitalization is certainly a current trend that will continue.

# An Overview about FinFET Technology

Leandro Ávila de Ávila
Programa de Pós-Graduação em Microeletrônica
PGMicro/UFRGS - Porto Alegre, Brazil
Email: leandro.avila@ufrgs.br

Abstract: This paper presents an overview of one of the most important technological limits in electronics, focusing on the much-discussed ultra-tiny devices, the FinFETs.
Additionally, a brief study of dopant implantation into FinFETs at deterministic positions, as a way to control the threshold voltage, is proposed, among other ideas.

#### I. INTRODUCTION

The technological advance toward the use of nanodevices is a current challenge that requires progressively more improvements and, as a consequence of scaling down, reaches the physical limits of semiconductors and new materials. However, microelectronic methods and computational resources are helping to overcome these constraints. At the core of the hardware that meets users' expectations for bandwidth and connectivity will be new process technologies, which allow for more power-efficient CMOS transistors and increased integration, enabling a higher level of functionality [16]. In this sense, selected works are presented here to demonstrate how this happens. Some research themes related to leading-edge FinFET devices compose this article, as well as their current perspectives.

One of the main motivations is to modify the characteristics of certain elements to obtain new properties. Several authors give examples: Masahiro [1] writes that, with the gate length approaching 10 nm by the year 2020, the channel region will contain a small number of dopant atoms, perhaps only one (apud [2]). Nanoscale field-effect transistors (FinFET devices) are researched and used in the search for better alternatives for MOS channel control, avoiding leakage power [3]. In other words, as a device becomes smaller and smaller, it becomes more likely to sustain a constant current consumption. For that reason, some works propose ways to improve even the fabrication technology [4], [5].

The remainder of this paper is organized as follows. Section 2 gives a short overview of some recent research milestones. Section 3 presents the expectations for the area and proposals for future work.
Section 4 concludes with some considerations and points of view on FinFET technology.

#### II. RELATED WORKS

At the nanoscale there is a large current consumption even when the transistors are in the off state, since there are drift currents through the depletion region. To solve this, especially at dimensions from 22 to 10 nm or less, the FinFET gate wraps around the current channel to increase the overlapping area between the gate and the channel. In a FinFET the current channel is just the fins perpendicular to the gate, which is quite different from planar CMOS technology, as seen in the works that follow.

Fig. 1. FinFET Cross-Sectional Structures [7].

#### A. FinFETs and SOI-FinFETs

FinFETs are also known as 3D transistors, because their difference from planar technology put the manufacturing and design conditions under another point of view, related to the need to grow structures above the surface (3D). They also help to solve the current leakage problems that stem from nanoscale dimensions (high static power). Chenming Hu [6], co-inventor of the FinFET, described how harmful this problem is, with its impact on the die's total power consumption:

"For 250-nm transistors, the power-supply voltage was 2.5 volts; for 180 nm, it was 1.8 V; for 130 nm, it was 1.3 V. The pattern was very regular until 90 nm, but it reached a limit. The industry has used 1.2 V. Even at 45 nm, the industry still used 0.9 V instead of 0.45 V".

Fig. 2. Tri-Gate transistors with multiple fins (22 nm Tri-Gate Transistor) [8].

Figure 1 shows a solution from the early 2000s in semiconductor research [7]. The authors were right, but natural improvements were added during the last 13 years, leading to the later results reported by Intel in Figure 2 [8]. Triple-gate transistors are a variation of the FinFET (III-FinFET) whose design replaces the flat channel through which electrons flow with a 3-D ridge, or fin.
The new "tri-gate" transistors are multi-fin structures (three fins) in which the fins pierce the gate. This kind of transistor reached the 22 nm scale in 2011, forming the basis of the Intel microprocessor code-named *Ivy Bridge*. Additionally, the technology performs better the more uniform the fin sizes it can produce. The transistor's width and driving capability can only be changed in discrete steps, by modifying the integer number of fins, which differs from the traditional planar CMOSFET, where the current flows in the whole active region under the gate [3].

It should be mentioned that FinFETs can be manufactured on Silicon-On-Insulator (SOI) or bulk substrates. SOI-FinFETs rely on SOI technology, which evolved from the need for better performance than bulk technology offers [5]. The main performance parameters remain the same. A simple model of the threshold voltage (Vth or $V_T$) is given by equation (1):

$$V_T = V_{FB} + 2\psi_B + \frac{q N_{sub} t_b}{C_{ox}}$$ (1)

$$V_{T,SCE} = V_T \left(1 - \frac{X_h}{L_c}\right)$$ (2)

Equation (2) gives the short-channel-effect model for one fin (side channel) [9], where [10]:

- $V_{FB}$ is the flat-band voltage.
- $\psi_B$ is the Fermi potential.
- $q$ is the elementary charge.
- $N_{sub}$ is the body doping.
- $t_b$ is the channel depletion width, $t_b = 0.5 W_{fin}$.
- $C_{ox}$ is the gate capacitance (per unit area, for the last term of equation (1) to be a voltage).
- $X_h$ is the charge-sharing length.
- $L_c$ is the length of the channel.

A new unified FinFET compact model is proposed in [6] for devices with complex fin cross-sections, addressing the structures that exist today. Figure 3 compares the type of structure used by Intel with an ideal rectangular FinFET. It results from simulation work done in 2012 by Gold Standard Simulations Ltd [11], based on Transmission Electron Microscope (TEM) images from Chipworks [12], such as the image of the fins in Figure 4.
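The threshold-voltage model of equations (1) and (2) can be evaluated numerically. The following Python sketch is only an illustration under stated assumptions: the names `FinParams`, `threshold_voltage`, and `threshold_voltage_sce` are hypothetical, $C_{ox}$ is assumed to be a capacitance per unit area, all quantities are in SI units, and the numeric values are invented to exercise the formulas rather than taken from [9] or [10].

```python
from dataclasses import dataclass

# Elementary charge q in coulombs (exact SI value).
Q_E = 1.602176634e-19


@dataclass(frozen=True)
class FinParams:
    """Inputs to equations (1) and (2), all in SI units (hypothetical container)."""
    v_fb: float   # flat-band voltage V_FB [V]
    psi_b: float  # Fermi potential psi_B [V]
    n_sub: float  # body doping N_sub [m^-3]
    w_fin: float  # fin width W_fin [m]
    c_ox: float   # gate capacitance C_ox, assumed per unit area [F/m^2]
    x_h: float    # charge-sharing length X_h [m]
    l_c: float    # channel length L_c [m]


def threshold_voltage(p: FinParams) -> float:
    """Equation (1): V_T = V_FB + 2*psi_B + q*N_sub*t_b / C_ox."""
    t_b = 0.5 * p.w_fin  # channel depletion width, t_b = 0.5 * W_fin
    return p.v_fb + 2.0 * p.psi_b + Q_E * p.n_sub * t_b / p.c_ox


def threshold_voltage_sce(p: FinParams) -> float:
    """Equation (2): V_T,SCE = V_T * (1 - X_h / L_c) for one fin."""
    if p.l_c <= 0.0:
        raise ValueError("channel length L_c must be positive")
    return threshold_voltage(p) * (1.0 - p.x_h / p.l_c)


# Hypothetical values chosen only to exercise the formulas.
example = FinParams(v_fb=0.0, psi_b=0.4, n_sub=1e24, w_fin=10e-9,
                    c_ox=0.03, x_h=3e-9, l_c=30e-9)

if __name__ == "__main__":
    print(f"V_T     = {threshold_voltage(example):.4f} V")
    print(f"V_T,SCE = {threshold_voltage_sce(example):.4f} V")
```

With these illustrative inputs the ratio $X_h/L_c$ is 0.1, so equation (2) lowers $V_T$ by 10%; increasing $L_c$ while keeping $X_h$ fixed shrinks the correction, which is the short-channel dependence the model is meant to capture.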
An analysis of the current density distribution across the fin under different gate biases follows: at high gate voltage the current moves towards the interface, crowding at the top of the fin due to the focusing gate fringing field there, with quantum mechanical confinement concentrating the charge in a small circular region [11].

Fig. 3. Threshold voltage dependence on gate length [11].

Fig. 4. TEM lattice image of NMOS fin structure [12].

One of their conclusions concerns the gate-length dependence of the threshold voltage for the trapezoidal Intel transistor and an equivalent rectangular-fin transistor (same fin height, with fin width equal to the average width of the trapezoidal fin): the rectangular fin shows better short-channel behavior.

#### B. Defects in FinFETs

Some works present the models used in fault simulation of FinFET logic behavior, for instance [3]. These authors study some defects unique to FinFET logic circuits and whether existing fault models are applicable to detecting faults in FinFET logic gates. The analysis is also useful as a basis for approaches targeting wire breaks, doping, and delays. In that work, the literature review classifies FinFETs into Shorted-Gate and Independent-Gate modes: in the first the gates are connected to each other, and in the second the transistor has four isolated contacts. The second gate affects threshold and power margins when compared with the shorted-gate mode. Another statement cited from [3] is that a lightly doped channel reduces random dopant fluctuations, which eliminates the associated threshold voltage variations, and increases carrier mobility. The authors detail the faulty behaviors of the simulated defects on the fins to demonstrate that new FinFET fault models are necessary, including cases where a single defect affects multiple gates.
Normally, faults are differentiated by the signal level that represents a specific value: logic zero, transient, or logic one (stuck-at value). An example is simulating the case where one fin of a set is open (in a multi-fin structure). The gate oxide short deserves attention because it has a high probability of occurring; either a lithography defect or a catastrophic breakdown of the gate oxide layer may cause it. SPICE simulations were used to represent this kind of defect, with a two-dimensional array of MOSFET transistors modeling each fin.

A set of fins is quite different from a single channel. In a planar CMOS FET, any defect in the channel region affects the entire transistor. In contrast, a structure with a number of fins has some tolerance to a stuck-open fault or a delay fault. In a multi-gate configuration, a single defect may affect multiple correlated gates.

Some parameters in fault detection can be verified through simulation with existing tools. For instance, in [5] the author compares short- and long-channel FinFET devices with the same doping profiles; the data from the simulators differ in the moderate and strong inversion regions, where one of them showed a higher drain current.

Fig. 5. Regular and random channel doping distribution [1].

#### C. Positioning Control for Ion Implantation

In [1], the effect of dopant position on the threshold voltage (Vth) was investigated for two types of arrangement, ordered arrays and random doping. The channel widths of the fabricated transistors were 100, 250, and 500 nm. A single-ion implantation (SII) method was applied so that a specific number of atoms could be positioned in the target. Controlling the number and the position of the dopant atoms is essential for suppressing the fluctuations in Vth of the devices, as seen in Figure 5.
The variation of Vth ($\Delta$Vth) and the deviation of Vth ($\sigma$Vth) for all channel widths were therefore important measurements that were obtained and compared. The Vth fluctuations are related to the ordering of the dopant positions, and ordering improves device performance through better controllability. Enhancements in electrical characteristics were obtained by controlling the position and depended on the ions used (phosphorus or arsenic). As the authors concluded, implanting heavier ions could help achieve higher placement accuracy in the SII method, and deterministic doping facilitates the development of doped devices.

Ion implantation is known to improve the performance of electronic devices; one example is the suppression of the punchthrough effect by modifying the structure of the MOSFET [13]. In the case presented, this methodology could be applied to the fins to obtain specific characteristics. Another work [14] notes that lithography using proton beams (H<sup>+</sup>) is a technique of the microelectronics industry that can also be used at the research level. However, this lithography process applied at the nanoscale is one of the most expensive in the industry, due to the use of photo-aligners with Extreme Ultraviolet. Reducing costs requires investment in research.

#### III. PERSPECTIVES

The challenges and the roadmap of expectations are presented in various scenarios, such as [2] and [15]; this section covers a selected group of promising ideas, some already existing and others relatively new. Table I presents an industrial perspective on semiconductor R&D for the coming years.

IMEC [16] announced the world's first CMOS-compatible III-V FinFET device processed on 300 mm wafers, which presents the technology as a viable next-generation alternative to the current Si-based FinFET technology for high-volume production. "The replacement of poly-silicon gate by high-k metal-gate in 45nm CMOS technology in 2007 represented a major inflection in new material integration for the transistor. The ability to combine scaled non-silicon and silicon devices might be the next dramatic transistor face-lift, breaking almost 50 years of all-silicon reign over digital CMOS.", said Aaron Thean, director of logic R&D at IMEC. IMEC replaces silicon fins with indium gallium arsenide (InGaAs) and indium phosphide (InP); the atomic lattice mismatch was over 8%. The new technique is based on aspect-ratio trapping of crystal defects, trench structure, and epitaxial process innovations.

Table II shows the challenge planned by IMEC in 2011; it contains the latest dimensions achieved, which together with other parameters form an interesting evolutionary scenario. For example, the $W_{Fin}$ target was 10-15 nm, and the subthreshold slope (mV/dec.) at Vds = 1.0 V falls from 75 or more (20 nm) to 60 for a physical gate length of 90 nm. Another effect, very similar to punchthrough, is Drain-Induced Barrier Lowering (DIBL), which is to be reduced to less than 25 mV/V.

In that line of work, Figure 6 shows a multigate FET. It represents a target for future work, in which each fin will be modeled, possibly with one more fin (V FinFET) and according to the new list of features.

TABLE I. INTEL TECHNOLOGY ROADMAP [8]

| Process Name | P1266 | P1268 | P1270 | P1272 | P1274 |
|------------------|-------|-------|-------|-------|-------|
| Lithography | 45 nm | 32 nm | 22 nm | 14 nm | 10 nm |
| First Production | 2007 | 2009 | 2011 | 2013 | 2015 |

TABLE II. $L_{Gate}$ AND $W_{Fin}$ ROADMAP (NM) [16]

| Node | 130 | 90 | 65 | 45 | 32 |
|-----------|-----|-----|----|----|----|
| $L_G$ | 70 | 56 | 45 | 36 | 32 |
| W | 160 | 120 | 80 | 60 | 40 |
| $W_{Fin}$ | - | - | - | 23 | 18 |

Some innovations can be summarized as:

- 1) 14 nm Intel (not confirmed in 2013) is now expected in the first quarter of 2014.
- 2) An analysis of diverse geometry optimizations using simulations of the fins and their behavior [6].
- 3) Use of new High-K Metal Gate (HKMG) [5].
- 4) Cell-aware ATPG (automatic test pattern generation) technology, a promising solution to FinFET test concerns [20].
- 5) A mathematical and physical model for a FinFET toward the next steps in scaling down (14 nm and 10 nm), thereby enabling the respective simulations.
- 6) Study of the current concentration along the vertical level of the fin's ridge, for example its dynamic behavior: how does it vary with the applied frequency?

Another important application that could be investigated by a specific study of FinFETs is related to radiation exposure, where the nanoscale contributes to the analysis of the behavior of electronic components and of singular effects. One of the most important methods used to modify the optical and electrical properties of semiconductor nanowires (ZnO) is precisely ion implantation or irradiation [14]. Thermal treatment could complement it, with a focus on defects as well as on the loss of material through evaporation or sputtering.

It is also important to mention that FinFETs are not the only solution under research for the problems of scaling down. One example is Fully-Depleted Silicon-On-Insulator (FD-SOI), being deployed by the SOI Industry Consortium [17] and [18]. FD-SOI is quite similar to a regular bulk CMOS transistor, but with thinner silicon geometries and fewer manufacturing process steps. The 14 nm node is already in development [19].

#### IV. CONCLUSION

This work presented the route followed by the evolution of electronics until now, and also pointed to an idea for a future technological breakthrough. The FinFET overview gives a sample of the current state of the art, in which engineering of the fin channels could bring new methods to control nanoscale effects.
Hence, ion positioning by implantation with control mechanisms could be a good option for fabricating future devices, adding new characteristics through aligned doping in such semiconductor components. The balance between controllability and the effects of dimensioning is one of the most important requirements to be met at each technological node. A simple check of the available data answers the following question: "Why not join all fins into a single channel?" To avoid current leakage, of course, and to enhance controllability between on and off operation. But some questions remain open: how could characteristics that seem like defects contribute to improving FinFET parameters?

Fig. 6. FinFET multigate.

#### REFERENCES

- [1] Masahiro Hori, Keigo Taira, Akira Komatsubara, Kuninori Kumagai, Yukinori Ono et al. Reduction of Threshold Voltage Fluctuation in Field-Effect Transistors by Controlling Individual Dopant Position. Applied Physics Letters 101, 013503. 2012.
- [2] International Technology Roadmap for Semiconductors (ITRS). www.itrs.net. 2013.
- [3] Liu, Yuxi and Xu, Qiang. On Modeling Faults in FinFET Logic Circuits. IEEE International Test Conference. 2012.
- [4] Lee, Jong-Ho. School of EECS and National Education Center for Semiconductor Technology. 2nd US-Korea NanoForum, LA. 2005.
- [5] Ferreira, Luiz Fernando. Double-Gate Nanotransistors in Silicon-on-Insulator Simulation of sub-20 nm FinFETs. PGMicro. Universidade Federal do Rio Grande do Sul. 2012.
- [6] Hu, Chenming et al. Unified FinFET Compact Model: Modelling Trapezoidal Triple-Gate FinFETs. IEEE. 2013.
- [7] Hisamoto, D. et al. FinFET-A Self-Aligned Double-Gate MOSFET Scalable to 20nm. IEEE Transactions on Electron Devices, Vol. 47, No. 12. December 2000.
- [8] Intel. Intel 22nm 3-D Tri-Gate Transistor Technology; and Intel's Revolutionary 22 nm Transistor Technology. www.intel.com. 2013.
- [9] Lee, Jong-Ho. Semiconductor Materials and Device Laboratory. SEMATECH Symposium Korea. 2011.
- [10] Lee, Jong-Ho et al. Threshold-Voltage Modeling of Body-Tied FinFETs (Bulk FinFETs). IEEE Transactions on Electron Devices, Vol. 54, No. 3. March 2007.
- [11] Gold Standard Simulations Ltd. Simulation analysis of the Intel 22nm FinFET. www.goldstandardsimulations.com. 2012.
- [12] Chipworks Inc. Intel's 22-nm Tri-gate Transistors Exposed. www.chipworks.com. 2012.
- [13] Pesenti, Giovani C. Development and Optimization of CMOS Technology with Polycrystalline Silicon Gate. PGMicro. Universidade Federal do Rio Grande do Sul. 2008.
- [14] Cauduro, Andre L. F. Synthesis, Photoluminescence and Electrical Characterization of ZnO nanostructures. PGMicro. Universidade Federal do Rio Grande do Sul. 2012.
- [15] Kawa, Jamil. FinFET Design, Manufacturability, and Reliability. R&D Group, Synopsys Inc. 2013.
- [16] Laboratory for Advanced Research in Microelectronics. Imec Demonstrates World's First III-V FinFET Devices Monolithically Integrated on 300mm Silicon Wafers. www2.imec.be. 2013.
- [17] SOI Industry Consortium. www.soiconsortium.org. 2013.
- [18] Taiwan Semiconductor Manufacturing Company Limited. www.tsmc.com. 2013.
- [19] ST Microelectronics. Home-Innovation and Technology-FD-SOI. www.st.com. 2013.
- [20] Hapke, F. et al. Cell-aware Production test results from a 32-nm notebook processor. IEEE International Test Conference (ITC). 2012.