# UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL INSTITUTO DE INFORMÁTICA PROGRAMA DE PÓS-GRADUAÇÃO EM MICROELETRÔNICA

# UNIVERSITÉ DE BORDEAUX ÉCOLE DOCTORALE DES SCIENCES PHYSIQUES ET DE L'INGÉNIEUR

LEONARDO HEITICH BRENDLER

# Memory circuit hardening to Multiple-Cell Upsets

Thesis in joint supervision (cotutelle) presented in partial fulfillment of the requirements for the degree of Doctor of Microelectronics

Advisor - UFRGS: Prof. Dr. Ricardo Augusto da Luz Reis Advisor - UBx: Dr. François Rivet

Porto Alegre, January 2024

Brendler, Leonardo Heitich

Memory circuit hardening to Multiple-Cell Upsets / Leonardo Heitich Brendler. – 2024.

144 f.: il.

Advisor - UFRGS: Ricardo Augusto da Luz Reis; Advisor - UBx: François Rivet.

Thesis (Ph.D.) – Universidade Federal do Rio Grande do Sul. Programa de Pós-Graduação em Microeletrônica. Porto Alegre, BR–RS, 2024.

- Université de Bordeaux. École Doctorale des Sciences Physiques et de l'Ingénieur. Bordeaux, FR-Gironde, 2024.

1. Detection cell. 2. Multiple-Cell Upsets. 3. Radiation Hardening. 4. Single-Event Upsets. 5. Soft Errors. 6. SRAM. I. Reis, Ricardo Augusto da Luz. II. Rivet, François. III. Título.

UNIVERSIDADE FEDERAL DO RIO GRANDE DO SUL

- Rector: Prof. Carlos André Bulhões Mendes
- Vice-Rector: Prof. Patricia Pranke
- Dean of Graduate Studies: Prof. Júlio Otávio Jardim Barcellos
- Director of the Institute of Informatics: Prof. Carla Maria Dal Sasso Freitas
- PGMICRO Coordinator: Prof. Cláudio Radtke
- Head Librarian of the Institute of Informatics: Alexsander Borges Ribeiro

"Let the future tell the truth, and evaluate each one according to his work and accomplishments. The present is theirs; the future, for which I have really worked, is mine." — NIKOLA TESLA

### ACKNOWLEDGEMENTS

It is always important and necessary to thank all those who were part of my personal, academic, and professional growth.

I would like to thank my parents, Alvaro and Cleide, and my sister, Juliana, for their continued support and encouragement, making my choice to pursue the academic field possible.

Special thanks to my wife, Christine, for her companionship, affection, and unconditional support throughout the journey. Thank you for believing in me and leaving everything aside to accompany me through the adventures and challenges that have passed and those that are yet to come.

I thank my two advisors, Professor Ricardo Reis, for the opportunity and encouragement to enter the research field, always providing the necessary support to resolve the challenges, and Professor François Rivet for the immense opportunity to do part of my PhD in France at an internationally renowned university and laboratory.

Thanks to my co-supervisors, Professor Hervé Lapuyade and Professor Yann Deval, for exchanging ideas and tips throughout the PhD program. I also thank my Master's co-supervisors, Professor Cristina Meinhardt and Professor Alexandra Zimpeck, who always continued to help with everything I needed.

Thanks to my two great friends during my stay in Bordeaux, Mateus and Andrés, for their continued companionship and partnership and their invaluable help in carrying out various tasks throughout my PhD.

Finally, I thank all my friends, friends from the old days, friends from the laboratory at UFRGS and IMS, and friends from Bordeaux. Thank you very much.

# ABSTRACT

A new era of space exploration is coming, with an exponential increase in the number of satellites and a drastic cost reduction. Memory circuits are a fundamental part of space applications, and techniques to deal with radiation effects in these circuits are constantly studied, which does not eliminate the need to develop new methods. With the advancements in technology scaling, the number of Multiple-Cell Upsets (MCUs) in a memory plan increases, making conventional techniques insufficient to maintain circuit robustness. In this context, this work details a new way to deal with MCUs in Static Random-Access Memories (SRAMs) for space applications. The method involves spatially interleaving a memory plan with a network of radiation detectors (detection cells). At the bottom of this plan, a logic circuit is implemented to create an alarm signal when a radiation particle impacts the memory plan and changes a detector's state. The analyses presented in this work can be divided into three stages. First, as a proof of concept, a prototype circuit composed of the detection cells was manufactured in a 350 nm Complementary Metal-Oxide-Semiconductor (CMOS) process technology and tested with two methodologies: electrically induced Single Event Upset (SEU)/MCU testing and Single Event Effects (SEEs) laser testing. Silicon measurement results confirm the correct operation of the circuit, which detects single and multiple events inserted in different positions of the evaluated detection plans. In a second stage, a 32 kb interleaved data/detection SRAM was designed in 28 nm Fully Depleted Silicon On Insulator (FD-SOI) technology and tested using post-layout simulations. Results confirm the correct operation of the data and detection cells of the memory, which also detects single and multiple events inserted in different positions of the memory array.
Due to its customizable nature, the proposed method allows the number of added detection cells to be varied, making it possible to explore the trade-off between robustness and hardware (circuit) overhead. In the last stage, a tool that automatically generates the layout of the core of a radiation-hardened SRAM was developed, facilitating the application of the new method and providing a range of sizes and protection configurations. Considering the ratio between the number of data and detection cells used in the SRAM designed in this work (50%), the detection method can provide a probability of detecting MCUs in a memory plan close to 100%. The new challenges arising from the increase in the MCU rate in modern nodes favor the new method validated in this thesis because, as the number of events in a memory plan increases, the probability of detecting an event also increases.
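
The trend stated above, that detection probability grows with the number of upset cells, can be illustrated with a minimal sketch. It is not the model used in this thesis: it assumes, purely for illustration, that each struck cell is independently a detection cell with probability equal to the detection-cell fraction, and it models the alarm logic as a plain OR over the detector states. The names `detection_probability` and `alarm` are hypothetical.

```python
from typing import List


def detection_probability(detection_fraction: float, upset_cells: int) -> float:
    """Probability that at least one struck cell is a detection cell.

    Illustrative assumption (not from the thesis): every struck cell is,
    independently, a detection cell with probability `detection_fraction`.
    """
    if not 0.0 <= detection_fraction <= 1.0:
        raise ValueError("detection_fraction must lie in [0, 1]")
    if upset_cells < 0:
        raise ValueError("upset_cells must be non-negative")
    # Complement of "every struck cell is a data cell".
    return 1.0 - (1.0 - detection_fraction) ** upset_cells


def alarm(detector_flipped: List[bool]) -> bool:
    """Hypothetical alarm logic: raised if any detection cell changed state."""
    return any(detector_flipped)


if __name__ == "__main__":
    # Half of the cells are detection cells (the 50% configuration).
    for k in (1, 2, 4, 8):
        print(k, detection_probability(0.5, k))
```

Under this simplified assumption, a single upset is detected half of the time, while larger multiplicities push the probability towards 1. Real MCU clusters are spatially correlated, so the actual figures depend on the interleaving pattern.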

**Keywords:** Detection cell. Multiple-Cell Upsets. Radiation Hardening. Single-Event Upsets. Soft Errors. SRAM.

### Circuito de memória robusto a Multiple-Cell Upsets

### **RESUMO**

Uma nova era de exploração espacial está chegando com um aumento exponencial no número de satélites e uma drástica redução nos custos de lançamento de foguetes. Os circuitos de memória são parte fundamental das aplicações espaciais, e técnicas para lidar com os efeitos da radiação nesses circuitos são constantemente estudadas, não eliminando a necessidade do desenvolvimento de novos métodos. Com o escalonamento das dimensões mínimas dos transistores, o número de Multiple-Cell Upsets (MCUs) em um plano de memória aumenta, tornando as técnicas convencionais insuficientes para manter a robustez do circuito. Nesse contexto, este trabalho detalha uma nova forma de lidar com MCUs em Memórias Estáticas de Acesso Aleatório (SRAMs) para aplicações espaciais. O método envolve intercalar espacialmente um plano de memória com uma rede de detectores de radiação (células de detecção). Na parte inferior deste plano, um circuito lógico é implementado para criar um sinal de alarme quando uma partícula induzida por radiação impacta o plano de memória alterando o estado do detector. As análises presentes neste trabalho podem ser divididas em três etapas. Primeiramente, e como prova de conceito, um protótipo de circuito composto pelos detectores de radiação foi fabricado na tecnologia de processo Semicondutor de Óxido Metálico Complementar (CMOS) de 350 nm e testado considerando duas metodologias: teste de Single Event Upset (SEU)/MCU induzido eletricamente e teste de Single Event Effect (SEE) a laser. Os resultados das medições no silício confirmam o correto funcionamento do circuito, detectando eventos únicos e múltiplos inseridos em diferentes posições dos planos de detecção avaliados. Ainda, em uma segunda etapa, uma SRAM com células de detecção/dados intercaladas de 32 kb foi projetada na tecnologia de Silício sobre Isolante Totalmente Reduzido (FD-SOI) de 28 nm e testada usando simulações pós-layout. 
Os resultados confirmam o correto funcionamento das células de dados e detecção da memória, também detectando eventos únicos e múltiplos inseridos em diferentes posições da matriz de memória. Devido à sua natureza customizável, o método proposto permite variar o número de células de detecção adicionadas permitindo a exploração do compromisso entre robustez e sobrecusto de hardware (circuito). Na última etapa, foi desenvolvida uma ferramenta para gerar automaticamente o layout do núcleo de uma SRAM robusta à radiação, facilitando a aplicação do novo método e fornecendo uma variedade de tamanhos e configurações de proteção. Considerando a razão entre o número de células de memória e células de detecção utilizadas na SRAM projetada neste trabalho (50%), o método de detecção pode fornecer uma probabilidade de detecção de MCUs em um plano de memória que pode chegar próximo a 100%. Os novos desafios decorrentes do aumento da taxa de MCU em nodos modernos beneficiam o novo método validado nesta tese de doutorado, pois, com o aumento do número de eventos em um plano de memória, a probabilidade de detecção de um evento também aumenta.

**Palavras-chave:** Célula de Detecção, Multiple-Cell Upsets, Radiation Hardening, Single-Event Upsets, Erros Leves, SRAM.

## Durcissement d'un circuit mémoire à Multiple-Cell Upsets

# RÉSUMÉ

Une nouvelle ère de l'exploration spatiale se profile avec une augmentation exponentielle du nombre de satellites et une réduction drastique des coûts de lancement des fusées. Les circuits mémoire constituent une partie fondamentale des applications spatiales, et des techniques pour faire face aux effets des radiations sur ces circuits font l'objet d'études constantes, ce qui n'élimine pas la nécessité de développer de nouvelles méthodes. Avec les progrès dans la réduction de la technologie, le nombre de Multiple-Cell Upsets (MCUs) dans un plan mémoire augmente, rendant les techniques conventionnelles insuffisantes pour maintenir la robustesse du circuit. Dans ce contexte, ce travail détaille une nouvelle manière de traiter les MCUs dans les Mémoires Statiques à Accès Aléatoire (SRAMs) pour les applications spatiales. La méthode consiste en une entrelacée spatiale d'un plan mémoire avec un réseau de détecteurs de radiation (cellules de détection). Au bas de ce plan, un circuit logique est mis en œuvre pour créer un signal d'alarme lorsqu'une particule induite par le rayonnement impacte le plan mémoire et modifie l'état du détecteur. Les analyses présentées dans ce travail peuvent être divisées en trois étapes. Tout d'abord, à titre de preuve de concept, un circuit prototype composé des détecteurs de rayonnement a été fabriqué dans la technologie de processus CMOS (Complementary Metal-Oxide-Semiconductor) 350 nm et testé selon deux méthodologies : les tests Single Event Upset (SEU)/MCU induits électriquement et les tests au laser pour les Single Event Effect (SEE). Les résultats des mesures sur silicium confirment le bon fonctionnement du circuit, détectant des événements uniques et multiples insérés à différentes positions des plans de détection évalués. 
Dans un deuxième temps, une SRAM de données/détection de 32 kb entrelacée a été conçue dans la technologie de 28 nm FD-SOI (Fully Depleted Silicon On Insulator) et testée à l'aide de simulations après la mise en page. Les résultats confirment le bon fonctionnement des cellules de données et de détection de la mémoire, détectant également des événements uniques et multiples insérés à différentes positions du réseau mémoire. En raison de sa nature personnalisable, la méthode proposée permet de varier le nombre de cellules de détection ajoutées en visant l'équilibre entre la robustesse et les surcoûts. Dans la dernière étape, un outil a été développé pour générer automatiquement la mise en page du cœur d'une SRAM résistante aux radiations, facilitant ainsi l'application de cette nouvelle approche et offrant une gamme de tailles et de configurations de protection. En considérant le rapport entre le nombre de cellules de données et de détection utilisées dans la SRAM conçue dans ce travail (50%), la méthode de détection peut fournir une probabilité de détection des MCU dans un plan de mémoire qui peut approcher les 100%. Les nouveaux défis découlant de l'augmentation du taux de MCU dans les nœuds modernes bénéficient de cette nouvelle méthode validée dans ce travail, car avec l'augmentation du nombre d'événements dans un plan de mémoire, la probabilité de détecter un événement augmente également.

**Mots clés:** Cellule de Détection, Multiple-Cell Upsets, Durcissement par rayonnement, Single-Event Upsets, Erreurs logicielles, SRAM.

# LIST OF ABBREVIATIONS AND ACRONYMS

| AMS    | Austria Micro Systems                   |
|--------|-----------------------------------------|
| AND    | AND Logic Gate                          |
| ASIC   | Application-Specific Integrated Circuit |
| BBICS  | Bulk Built-in Current Sensor            |
| BL     | Bit Line                                |
| BLB    | Bit Line Bar                            |
| CMOS   | Complementary Metal-Oxide-Semiconductor |
| CNN    | Convolutional Neural Network            |
| CPU    | Central Processing Unit                 |
| CR     | Cell Ratio                              |
| CS     | Chip Select                             |
| DAEC   | Double-Adjacent-Error-Correcting        |
| DBU    | Double-Bit Upset                        |
| DD     | Displacement Damage                     |
| DLs    | Detection Lines                         |
| DNN    | Deep Neural Network                     |
| DRC    | Design Rule Check                       |
| DRAM   | Dynamic Random Access Memory            |
| ECC    | Error Correction Code                   |
| EDAC   | Error Detection And Correction          |
| ESA    | European Space Agency                   |
| FD-SOI | Fully Depleted Silicon On Insulator     |

| FIFO                     | First In, First Out                           |
|--------------------------|-----------------------------------------------|
| FinFET                   | Fin-Shaped Field-Effect Transistor            |
| FIT                      | Failure in Time                               |
| FPGA                     | Field-Programmable Gate Array                 |
| FWHM                     | Full Width at Half Maximum                    |
| GEO                      | Geostationary Orbit                           |
| GND                      | Ground                                        |
| IC                       | Integrated Circuit                            |
| INV                      | Inverter                                      |
| IRPS                     | International Reliability Physics Symposium   |
| LET                      | Linear Energy Transfer                        |
| LET<sub>th</sub>         | Threshold Linear Energy Transfer              |
| LEO                      | Low Earth Orbit                               |
| LVS                      | Layout Versus Schematic                       |
| MBU                      | Multiple-Bit Upset                            |
| MCU                      | Multiple-Cell Upset                           |
| MEO                      | Medium Earth Orbit                            |
| MOS                      | Metal Oxide Semiconductor                     |
| MUX                      | Multiplexer                                   |
| NAND                     | Not AND                                       |
| NASA                     | National Aeronautics and Space Administration |
| NMOS                     | N-Channel MOSFET                              |
| NOR                      | Not OR                                        |
| OE                       | Output Enable                                 |

| OR      | OR Logic Gate                                       |
|---------|-----------------------------------------------------|
| PC      | Pre-Charge Enable Signal                            |
| PCB     | Printed Circuit Board                               |
| PDK     | Process Design Kit                                  |
| PMOS    | P-Channel MOSFET                                    |
| PR      | Pull-Up Ratio                                       |
| PTL     | Pass Transistor Logic                               |
| QBU     | Quadruple-Bit Upset                                 |
| RBB     | Reverse-Body-Biasing                                |
| RHBD    | Radiation-Hardening-By-Design                       |
| RLs     | Refresh Lines                                       |
| RVT     | Regular Threshold Voltage                           |
| SAA     | South Atlantic Anomaly                              |
| SCE     | Short Channel Effect                                |
| SER     | Soft Error Rate                                     |
| SEB     | Single Event Burnout                                |
| SEC-DED | Single-Error-Correcting-Double-Error-Detecting      |
| SEEs    | Single Event Effects                                |
| SEGR    | Single Event Gate Rupture                           |
| SEL     | Single Event Latch-up                               |
| SET     | Single Event Transient                              |
| SEU     | Single Event Upset                                  |
| SHE     | Single Hard Error                                   |
| SEME    | Single Event Multiple Effect                        |
| SMU     | Single-Word Multiple-bit Upset                      |
| SNM     | Static Noise Margin                                 |
| SoC     | System-on-a-Chip                                    |
| SOI     | Silicon on Insulator                                |
| SPICE   | Simulation Program with Integrated Circuit Emphasis |
| SRAM    | Static Random Access Memory                         |
| TAEC    | Triple-Adjacent-Error-Correcting                    |
| TBU     | Triple-Bit Upset                                    |
| TCAD    | Technology Computer-Aided Design                    |
| TID     | Total Ionizing Dose                                 |
| TMR     | Triple Modular Redundancy                           |
| UV      | Ultraviolet                                         |
| VDD     | Supply Voltage                                      |
| VLSI    | Very Large-Scale Integration                        |
| WE      | Write Enable                                        |
| WL      | Word Line                                           |

# LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| Figure 1.1 | The transition from a SEU to an MCU due to the transistors shrinking |
| Figure 1.2 | MCU percentage of SER for Planar, FD-SOI and FinFET technologies |
| Figure 2.1 | Semiconductor memories classification |
| Figure 2.2 | Classic SRAM memory architecture |
| Figure 2.3 | 6T-SRAM Bit Cell electric schematic |
| Figure 2.4 | An example of a simplified model of SRAM cell during the read operation |
| Figure 2.5 | An example of a simplified model of SRAM cell during the write operation |
| Figure 2.6 | The relationship between SRAM bit interleaving and MBU |
| Figure 2.7 | Pre-charge circuit electric schematic |
| Figure 2.8 | 7T Latch-Type Sense Amplifier electric schematic |
| Figure 2.9 | Write Driver circuit electric schematic |
| Figure 3.1 | Van Allen radiation belt |
| Figure 3.2 | Cross-section of the Sun interior |
| Figure 3.3 | Solar Wind |
| Figure 3.4 | Solar Flare |
| Figure 3.5 | Nuclear cascade reactions of particles towards the Earth's surface |
| Figure 3.6 | Deformation in the inner Van Allen belt of the Earth due to SAA |
| Figure 3.7 | Current scenario of the South Atlantic Anomaly |
| Figure 3.8 | Three-Universe model proposed by Pradhan et al. (1996) |
| Figure 3.9 | Degradation of a pulse by electric masking. Depending on the (a) Generated pulse's width, it can be (b) Attenuated or (c) Filtered when propagating through the circuit |
| Figure 3.10 | Logical masking example in combinational circuit |
| Figure 3.11 | Latching Window masking |
| Figure 3.12 | Classification of major Single Event Effects |
| Figure 3.13 | Single Event Effects - an ionizing particle passing through a sensitive volume (SV) in an active (semiconductor) device |
| Figure 3.14 | The SEU response in SRAM Bit cells is determined by the interaction between the feedback process and the recovery process |
| Figure 3.15 | SRAM struck drain voltage transients for ion strikes with LET: well below, just below, and just above the SEU threshold |
| Figure 3.16 | Charge Collection Mechanisms due to an Ion strike in a P-N junction |
| Figure 3.17 | Transient Current Waveform induced by a radiation strike |
| Figure 3.18 | Results of the collected charge from the heavy-ion impact at the drain terminal on the 32 nm Bulk CMOS device |
| Figure 3.19 | Nodal separation setup for NMOS charge sharing |
| Figure 3.20 | Charge collection with distance of 0.18 µm between adjacent devices |
| Figure 3.21 | SET Pulse Quenching Effect in an inverter chain |
| Figure 3.22 | Packing density per chip across technology nodes |
| Figure 3.23 | Normalized alpha particle induced SER as a function of technology for single-port SRAMs |
| Figure 3.24 | System-SER, bit-SER, and percentage MCU of the total SER as a function of technology node |
| Figure 3.25 | Radiation induced SER for each ECC mode at VDD = 500 mV for 1-, 2-, and 4-way interleaving |
| Figure 3.26 | Predicted trends of MCU ratio and maximum multiplicity in SRAM cells with technology scaling |
| Figure 5.1 | The design-at-the-circuit-level solution: (a) Memory plan spatially interleaved; (b) Alarm signal; (c) Cluster of physically adjacent data cells |
| Figure 5.2 | [...] detection cell architecture |
| Figure 5.3 | Detection plan operation modes: (a) Regular operation (no event); (b) Cell impacted by a radiation event |
| Figure 6.1 | Detection plan layouts: (a) Detection Cell, (b) 4×4 Detection Plan, and (c) 8×8 Detection Plan |
| Figure 6.2 | Layout of the on-chip fault injection test circuits composed of pulse generators, 3×8 decoders, and PMOS/NMOS current insertion structures |
| Figure 6.3 | The full test chip layout highlighting the core area with the designed circuit and the total area of the chip considering the pads |
| Figure 6.4 | Schematic of the on-chip fault injection test circuits composed of two 3×8 decoders and the PMOS/NMOS test structures |
| Figure 6.5 | (a) Test Chip (highlighted in grey) with associated PCB, and (b) Die Photo with zoom in the test circuits and the detection plans |
| Figure 6.6 | Selected cells for electrically-induced SEU/MCU fault injection in the 4×4 plan, considering different scenarios |
| Figure 6.7 | Selected cells for electrically-induced SEU/MCU fault injection in the 8×8 plan, considering different scenarios |
| Figure 6.8 | Laser test setup: (a) Schematic of the test setup considering the equipment used, and (b) Photo of the test setup highlighting the frontside approach |
| Figure 6.9 | Experimental circuit behavior after an electrically-induced SEU impact on the 4×4 detection plan |
| Figure 6.10 | Measurements of the MCU effects in the 8×8 plan and the comparison with the DBU impact on the 4×4 plan |
| Figure 6.11 | General view of the different states of the circuit before, during, and after a laser shot |
| Figure 6.12 | Experimental circuit behavior after a laser beam impact on the 8th row of the 8×8 detection plan |
| Figure 6.13 | Scan performed in the 8×8 plan for different laser energy values |
| Figure 6.14 | Experimental and Weibull fitted cross-section of the 8×8 plan |
| Figure 7.1 | 6T-SRAM Bit Cell: (a) Electric Schematic and (b) Layout |
| Figure 7.2 | Detection Cell: (a) Electric Schematic and (b) Layout |
| Figure 7.3 | Pre-Charge circuit: (a) Electric Schematic and (b) Layout |
| Figure 7.4 | 7T Latch-Type Sense Amplifier: (a) Electric Schematic and (b) Layout |
| Figure 7.5 | Write Driver: (a) Electric Schematic and (b) Layout |
| Figure 7.6 | A part of the 8×256 Row Decoder layout |
| Figure 7.7 | (a) Memory organization: 32 interleaved 8-bit words and (b) Selection of the 8-bit Word 0 through the 5×32 Column Decoder |
| Figure 7.8 | A part of the 5×32 Column Decoder layout |
| Figure 7.9 | A part of the Detection Logic layout |
| Figure 7.10 | (a) 256×256 Interleaved Data/Detection SRAM block diagram and the (b) SRAM core layout |
| Figure 7.11 | The 32 kb SRAM validation through consecutive write/read operations |
| Figure 7.12 | March-C Test in 128 bits of the SRAM: 16 words of Row 0 |
| Figure 7.13 | The behavior of the detection method considering post-layout simulations of a particle impact in: (a) The bottom row (Row 127) and (b) The top row (Row 0) |
| Figure 8.1 | Automatic SRAM layout generation tool main window, highlighting all the available features |
| Figure 8.2 | An example of running the tool: (1) defining the input data, (2) generating the SRAM layout, (3) providing layout information |
| Figure 8.3 | An example of a 4×4 data/detection SRAM (a) Schematic and (b) Layout generation, highlighting the vector expressions in multiple-bit wire names functionality |
| Figure 8.4 | The impact of the radiation hardening levels provided by the tool in the total area of the SRAM core |

# LIST OF TABLES

| Table | Caption |
|-------|---------|
| Table 3.1 | Probability of Radiation-Induced Upsets in Different Environments |
| Table 6.1 | Pins description of the test chip |
| Table 6.2 | 4×4 Plan Measurements - Complete SEU Analysis |
| Table 7.1 | Comparison between the state-of-the-art and this thesis |
| Table 8.1 | SRAM Generated Layout Data |

# CONTENTS

| Section | Title |
|---------|-------|
| 1 | INTRODUCTION |
| 1.1 | Thesis Objectives and Contributions |
| 1.2 | Thesis Structure |
| 2 | MEMORY CIRCUITS |
| 2.1 | Static Random-Access Memory (SRAM) |
| 2.1.1 | 6T-SRAM Bit Cell |
| 2.1.2 | SRAM Read Operation |
| 2.1.3 | SRAM Write Operation |
| 2.1.4 | Bit Interleaving |
| 2.2 | SRAM Peripheral Circuits |
| 2.2.1 | Pre-Charge Circuit |
| 2.2.2 | Sense Amplifier |
| 2.2.3 | Write Driver |
| 2.2.4 | Row Decoder |
| 2.2.5 | Column Decoder |
| 3 | RADIATION EFFECTS ON ELECTRONIC CIRCUITS |
| 3.1 | Radiation Belts |
| 3.2 | Sun |
| 3.3 | Cosmic Rays |
| 3.4 | South Atlantic Anomaly (SAA) |
| 3.5 | Fault Tolerance Basic Concepts and Terms |
| 3.6 | Characterization of the radiation effects on electronic devices |
| 3.7 | Single Event Effects (SEE) |
| 3.7.1 | Destructive Events |
| 3.7.2 | Single Event Upset (SEU) |
| 3.7.3 | SEU Effects in SRAMs |
| 3.7.4 | Multiple-Cell Upsets |
| 3.8 | Physical Mechanisms of Charge Deposition and Collection |
| 3.9 | Emerging Effects at Advanced Technologies |
| 3.10 | Problem Definition |
| 3.11 | Overview |
| 4 | STATE-OF-THE-ART |
| 4.1 | Redundancy-based Techniques |
| 4.2 | Radiation Hardening by Design Techniques |
| 4.3 | Error Detection and Correction Techniques |
| 4.4 | Radiation Monitors and Software-based Techniques |
| 4.5 | Related Work |
| 5 | DETECTION METHOD |
| 5.1 | Architecture |
| 5.2 | Operation |
| 5.3 | Overheads |
| 5.3.1 | Area Penalty |
| 5.3.2 | Power Consumption Penalty |
| 6 | PROOF-OF-CONCEPT |
| 6.1 | Circuit Design |
| 6.1.1 | On-Chip Test Circuits |
| 6.1.2 | Test Board Features |
| 6.2 | Electrically-Induced SEU/MCU Test Methodology |

| 6.3 Laser Test Methodology                                            | 93   |
|-----------------------------------------------------------------------|------|
| 6.3.1 Single Cell.                                                    | 96   |
| 6.3.2 Scan Test                                                       | 97   |
| 6.4 Electrically-Induced SEU/MCU Measurements                         | 97   |
| 6.4.1 4X4 Detection Plan                                              | 97   |
| 6.4.2 8X8 Detection Plan                                              | 99   |
| 6.5 Laser Results                                                     | .100 |
| 6.5.1 Single Cell                                                     | .100 |
| 6.5.2 Scan                                                            | .103 |
| 6.6 Applicability of the Method Proposed in Advanced Technology Nodes | .104 |
| 7 INTERLEAVED DATA/DETECTION SRAM                                     | .106 |
| 7.1 The Design                                                        | .106 |
| 7.1.1 Data Cell                                                       | .107 |
| 7.1.2 Detection Cell                                                  | .107 |
| 7.1.3 Main SRAM Peripherals                                           | .108 |
| 7.1.4 8X256 Row Decoder                                               | .110 |
| 7.1.5 5X32 Column Decoder                                             | .111 |
| 7.1.6 Detection Logic Circuit                                         | .113 |
| 7.1.7 256X256 Interleaved Data/Detection SRAM                         | .113 |
| 7.2 Test Methodology                                                  | .114 |
| 7.2.1 SRAM Data Cells Test                                            | .115 |
| 7.2.2 SRAM Detection Cells Test                                       | .115 |
| 7.3 Interleaved Data/Detection SRAM Post-Layout Simulations           | .116 |
| 7.4 Comparison with State-of-the-Art                                  | .119 |
| 8 RADIATION-HARDENED SRAM LAYOUT GENERATION TOOL                      | .122 |
| 8.1 Tool Features and Implementation                                  | .122 |
| 8.2 Tool Execution and Results                                        | .125 |
| 9 CONCLUSIONS                                                         | .129 |
| 9.1 Applicability of the Method at the System Level                   | .130 |
| 9.2 Competitive Advantage of the Proposed Method                      | .131 |
| 9.3 Future Works and Perspectives                                     | .131 |
| 9.3.1 Rad-Hard Test                                                   | .132 |
| 9.3.2 A combination of widely used techniques and the proposed method | .132 |
| 9.3.3 AI-driven SRAM                                                  | .133 |
| REFERENCES                                                            | .134 |
| ANNEX A — LIST OF PUBLICATIONS                                        | .143 |

# **1 INTRODUCTION**

The evolution of the transistor manufacturing process, known as technology scaling, involves reducing the dimensions of transistors and offers numerous advantages. Firstly, it allows for increased integration capacity of integrated circuits. Secondly, it enables higher operating frequencies, pushing the boundaries of circuit performance. Lastly, it facilitates improved performance and lower power consumption. However, alongside these benefits, certain challenges arise due to the shrinking of transistors. These challenges include an increase in the variability of the manufacturing process (ORSHANSKY; NASSIF; BONING, 2008; MEINHARDT, 2014; ZIMPECK et al., 2018), the emergence of Short Channel Effect (SCE), the occurrence of undesirable leakage current (TAUR et al., 1997), and most notably, the heightened susceptibility to radiation effects. Integrated circuits play an essential role in all our daily tasks. The demand for circuits capable of delivering high performance across a multitude of tasks continues to grow. It is one of the reasons why integrated circuits are increasingly dense and complex.

Radiation-induced soft errors pose a significant reliability concern for nanotechnologies, impacting both space-based and terrestrial applications (BAUMANN, 2002a; HEIDEL et al., 2009). To understand the implications of these effects on integrated circuit design, it is crucial to comprehend their origins. The Earth's atmosphere serves as a semi-permeable layer, permitting the passage of light and heat while acting as a natural filter to reduce the intensity of radiation reaching the Earth and blocking ultraviolet rays. Radiation encountered in space or in the atmosphere can be classified into two broad categories: ionizing and non-ionizing radiation. Examples of ionizing radiation include cosmic rays, x-rays, and radiation emitted by radioactive materials. These types of radiation carry enough energy to remove electrons from the atoms of the materials they interact with. On the other hand, non-ionizing radiation, such as ultraviolet light, radio waves, and microwaves, lacks the capability to ionize materials. The main particles that can cause undesired effects in electronic circuits are electrons, protons, neutrons, alpha particles and heavy ions, as well as electromagnetic radiation, such as x-rays and gamma rays (STASSINOPOULOS; RAYMOND, 1988).

Space radiation consists of subatomic particles, such as the heavy ions present in the space environment or the alpha particles emitted by radioactive isotopes. These particles travel in space at very high speeds, and the fastest ones can travel at speeds close to the speed of light, which allows them to easily traverse a material and cause various effects on it. The Earth receives a continuous influx of radiation from three primary sources that have the potential to impact electronic circuits: the Sun, Cosmic Rays, and Trapped Radiation.

Electronic circuits operating in space face substantial radiation exposure and the possibility of encountering heavy particles originating from the sun or beyond our galaxy. This radiation exposure carries a high probability of inducing changes and disturbances in the circuit, thereby compromising its proper functioning. The effects associated with radiation incidence on electronic components have been extensively researched by the international scientific community, primarily in the context of space and military applications. The integrated circuits that experience the interaction of ionizing particles basically suffer from two types of degradation: those of singular character, occurring due to the incidence of a single particle, and those of a cumulative character, which, in turn, occur due to the accumulation of doses of ionizing radiation over the lifetime of the circuit (BOUDENOT, 2007; STASSINOPOULOS; RAYMOND, 1988).

Degradations resulting from the impact of a single particle are referred to as Single Event Effects (SEEs). These effects can be further classified into two subgroups: Destructive Events, also known as hard errors, which lead to permanent failures in the circuit, and Non-Destructive Events, commonly known as soft errors, where errors occur in the system without causing permanent damage. One of the most studied soft errors in the literature is the Single Event Upset (SEU). An SEU is a change of state caused by a single ionizing particle striking a sensitive node in a microelectronic device, such as a semiconductor memory.

The design of radiation-tolerant circuits, especially memory circuits, is recognized as critical for space applications (VELAZCO; FOUILLAT; REIS, 2007; FAMÁ; ESTELA, 2019). The primary objective is to reduce the susceptibility of these circuits to radiation-induced effects, such as the SEU. In novel nanotechnologies, the dimensions of the transistors and, consequently, the cells' size are significantly reduced, causing more than one memory cell (in a memory plan) to be affected by a single particle impact (HEIDEL et al., 2009; IBE et al., 2010; CLEMENTE et al., 2021), characterizing the transition from an SEU to a Multiple-Cell Upset (MCU), as can be seen in Figure 1.1. The MCU percentage of the total SEU rate increases with transistor downscaling independently of the evaluated technology. Figure 1.2 presents the simulated MCU ratios (in percentage) for the total SEU rate as a function of the technological node and for the planar bulk Complementary Metal-Oxide-Semiconductor (CMOS), Fully Depleted Silicon On Insulator (FD-SOI), and Fin-Shaped Field-Effect Transistor (FinFET) bulk technologies. Due to their extremely limited sensitive volume dimensions, FD-SOI and FinFET technologies are less sensitive to MCU than the planar bulk CMOS technology (HUBERT; ARTOLA; REGIS, 2015).

Figure 1.1: The transition from an SEU to an MCU due to the transistors shrinking.

Source: From the author.

Figure 1.2: MCU percentage of SER for Planar, FD-SOI and FinFET technologies.



Source: Hubert, Artola and Regis (2015).

Numerous techniques at different levels are presented in the literature to deal with radiation effects. According to the state-of-the-art, the main mitigation design techniques are:

- Spatial redundancy-based techniques at the system level;
- Error Detection And Correction (EDAC) techniques at the architectural level;
- Radiation-Hardening-By-Design (RHBD) of the memory cells at the circuit level.

The Triple Modular Redundancy (TMR) technique is one of the most popular hardware redundancy techniques. Despite being widely explored in various implementation strategies (KASTENSMIDT; CARRO; REIS, 2006; CANNON et al., 2020; TAN et al., 2021), redundancy-based solutions increase the system's complexity. Adopting these solutions causes an extra cost in silicon surface and power consumption, not always meeting the hardware requirements (DEVAL; LAPUYADE; RIVET, 2019). The widely used basic Hamming code (ZHANG et al., 2018) allows detecting up to two errors or correcting one in a single data word. EDAC algorithms need to implement redundant bits whose number increases with the number of errors to detect and correct (DEVAL; LAPUYADE; RIVET, 2019; VLAGKOULIS et al., 2022; KUENTZER; KRSTIC, 2020). The extra cost in terms of hardware and power consumption increases significantly with the increase in the MCU rate, a consequence of the continuous downscaling of transistors (DEVAL; LAPUYADE; RIVET, 2019). The adoption of RHBD memory cells (HAN et al., 2021; RAGHURAM; GUPTA; KAUSHAL, 2020; HARAN et al., 2020; JIANG et al., 2019; LI et al., 2021) does not allow the detection and thus the correction of radiation-induced corruptions of data. RHBD cells usually have more transistors than traditional designs, increasing the design cost in terms of silicon area. Another way to deal with the radiation effects is to increase the sensitivity of Static Random Access Memory (SRAM) bit cells, creating SRAM-based radiation monitors to predict the impact of radiation on a system at space or ground level (WANG et al., 2021). This method can provide essential data for the development of radiation-tolerant circuits. However, it does not increase the robustness of the SRAM used as a monitor.
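The limitation of EDAC under multiple upsets can be illustrated with a minimal sketch of the Hamming(7,4) code. The code size, the bit ordering, and the function names are assumptions chosen here for illustration and are not taken from the cited works. A single flipped bit is located and corrected, whereas two flipped bits in the same word produce a syndrome that points at a third, healthy bit, so the decoder silently returns wrong data.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword.

    Bit positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    (parity bits sit at the power-of-two positions).
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome).

    A non-zero syndrome is interpreted as the 1-based position of a
    single flipped bit, which is then inverted before extracting data.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


def flip(codeword, positions):
    """Return a copy of the codeword with the given 1-based positions inverted."""
    c = list(codeword)
    for p in positions:
        c[p - 1] ^= 1
    return c


data = [1, 0, 1, 1]
word = hamming74_encode(data)

# One upset: located at position 5 and corrected.
single = hamming74_decode(flip(word, [5]))

# Two adjacent upsets (positions 5 and 6): the syndrome points at
# position 3, so a healthy bit is flipped and the data is corrupted.
double = hamming74_decode(flip(word, [5, 6]))

print(single, double)
```

Handling the double error correctly would require more redundant bits, which is the cost growth described above.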

All the aforementioned techniques share, above all, area and power consumption penalties. These penalties are already expected in the radiation-tolerant circuit design, and are also present in the method proposed in this thesis. However, in existing methods, the designed circuit can usually detect a limited number of events, or the modified memory cell can support a certain amount of deposited charge before presenting a failure (HEIDEL et al., 2009; BAUMANN, 2002b). Considering the significant increase in the MCU rate both for new technological nodes and for the environment in which the circuit will operate, the existing techniques may not provide a satisfactory level of robustness to radiation effects depending on the application needs.

In this context, this thesis presents a new method to detect radiation-induced upsets in memory circuits, focusing on the MCUs. The method involves spatially interleaving a memory array with a network of radiation detectors (defined as detection cells). Towards the base of this layout, a logic circuit is integrated to generate an alarm signal when a particle impacts the memory plan and alters the state of a detector. Although it presents area and power consumption overheads similar to those of the techniques already in use, the advantage of the new method is that its fault detection capability/probability increases with the number of events. The originality of the method is that it goes the opposite way of the techniques already present in the literature, benefiting from the increase of multiple events and achieving a high detection rate even in the harshest environments. This fact is of great interest in developing applications for the space industry.

As previously presented, the MCU rate has been increasing according to the advancement of technological scaling, and the existing techniques to deal with these effects present a detection limitation according to the increase in the number of events. Therefore, the hypothesis of this thesis is that creating a customizable network of memory radiation detectors interleaved in a traditional SRAM array can provide an unlimited detection capability at the cost of low power consumption overhead and a similar area overhead to other existing techniques. The proof of this hypothesis provides a new method for detecting MCUs that uses a different methodology from the methods already presented, increasing its robustness according to the reduction of technological nodes and the consequent increase in the MCU rate.
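The intuition behind this hypothesis can be illustrated with a toy model. The checkerboard placement of detection cells, the horizontal cluster shape, and all function names below are assumptions made only for this sketch; they do not describe the layout or the logic circuit actually designed in this thesis. The alarm is modeled as the logical OR of all detection-cell upsets, and the detection ratio is computed over every possible position of a cluster of adjacent upset cells.

```python
def build_plan(rows, cols):
    """Hypothetical checkerboard plan: 'D' = detection cell, 'M' = data cell."""
    return [["D" if (r + c) % 2 == 0 else "M" for c in range(cols)]
            for r in range(rows)]


def alarm(plan, upset_cells):
    """Alarm is raised if at least one upset cell is a detection cell."""
    return any(plan[r][c] == "D" for r, c in upset_cells)


def detection_ratio(plan, cluster_width):
    """Fraction of horizontal clusters of adjacent upsets that raise the alarm."""
    rows, cols = len(plan), len(plan[0])
    hits = 0
    total = 0
    for r in range(rows):
        for c in range(cols - cluster_width + 1):
            cluster = [(r, c + k) for k in range(cluster_width)]
            total += 1
            if alarm(plan, cluster):
                hits += 1
    return hits / total


plan = build_plan(4, 4)
for width in (1, 2, 3):
    print(width, detection_ratio(plan, width))
```

In this toy plan a single-cell upset is detected in half of the positions, while any cluster of two or more adjacent cells always includes a detection cell, which mirrors the claim that detection probability grows with the number of cells affected by one event.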

# **1.1 Thesis Objectives and Contributions**

In order to present and test the proposed new method in detail, the analyses presented in this work can be divided into three stages. Firstly, the objective was to validate the detection method by designing and manufacturing a prototype version of the circuit and conducting radiation experiments. Afterward, the new method was extended and applied in a memory circuit with characteristics closer to current commercial memories. Finally, considering the important customization feature of the new method, a tool that automatically generates an SRAM layout with different levels of protection is presented.

The main objectives of this thesis are:

- 1) To present a proof-of-concept for the detection method, which does not present an event detection limitation according to the increase of SEUs and MCUs;
- 2) To verify possible different circuit behaviors with respect to the particle impact location, the detection plan size, and the event type (single or multiple);
- 3) To confirm the correct operation of the method in an interleaved SRAM memory plan designed with data and detection cells;
- 4) To verify the method behavior in a current technology;
- 5) To verify the detection delay for a commercial-sized memory plan;
- 6) To provide an easy-to-use tool that facilitates the implementation of the proposed method, automatically generating an SRAM layout with data and detection cells, providing different levels of protection.

This thesis contributes to the community by providing a new type of technique for dealing with radiation-induced upsets, mainly MCUs. In the current context, different types of techniques are used to deal with the effects of radiation on Integrated Circuits (ICs). As previously presented, among the most used techniques nowadays, we can highlight redundancy-based, RHBD, and EDAC, in addition to hardware monitoring and software-based techniques. Most of these techniques use masking as the primary method to increase circuit reliability. EDAC techniques allow for detecting a certain number of events for subsequent possible correction. Regardless of the methodology used by current techniques, they all have limitations in masking, detecting, and correcting upsets as the number of events increases. In other words, in the current scenario, there is a direct relationship between the increase in the number of radiation-induced upsets (mainly multiple events) and the reduction in the protection capability of the techniques used to deal with these effects. Therefore, this thesis proposes to modify this relationship by bringing a new type of technique for detecting multiple events.

The main contributions of this thesis are summarized as follows:

- A new idea for detecting multiple events in memory circuits, which, unlike existing techniques, benefits from the increasing MCU rate. The idea is implemented through a new method that provides unlimited detection of radiation-induced upsets.
- Validation of the proposed method through designing, manufacturing, and testing a prototype circuit. It is proven that the initial idea is viable and can be implemented in a memory circuit.
- The proposed and validated method was applied in the design of a complete SRAM with commercial dimensions in 28 nm FD-SOI technology. Considering a commercial-sized memory, the method's behavior in a current technology and the difference in sensitivity between the detection cells were verified.
- A tool for automatically generating an SRAM layout that implements the proposed method, providing different levels of protection, is presented. The tool is independent of the technology and the architecture chosen for the SRAM data cell, contributing to the automation of different projects.

# **1.2 Thesis Structure**

This thesis is organized as follows:

- **Chapter 2 Memory Circuits**: The chapter presents the main concepts of memory circuits, highlighting the difference between volatile and non-volatile memories. Also, the primary characteristics and circuits that compose an SRAM, which is the focus of this thesis, are detailed.
- **Chapter 3 Radiation Effects on Electronic Circuits**: The radiation effects, from their origin to their impact on electronic circuits, are detailed. The chapter's main topic is the Single Event Effects, mainly the MCUs. This chapter also presents the problem definition, the starting point for the new method creation.
- **Chapter 4 State-of-the-Art**: The chapter presents the state-of-the-art works in the design of radiation-tolerant circuits. The chosen works represent different techniques to deal with the radiation effects, focusing on multiple events. A comparison between the method proposed in this thesis and the state-of-the-art works is also presented.
- **Chapter 5 Detection Method**: The main idea of the proposed detection method, detailing its architecture and operation, is presented in this chapter. The method's drawbacks are also described, highlighting the area and power consumption overheads.
- **Chapter 6 Proof-of-Concept**: This chapter presents a proof-of-concept of the new proposed method. A prototype circuit composed of two detection plans was designed, fabricated, and tested, aiming to validate the initial idea.
- **Chapter 7 Interleaved Data/Detection SRAM**: An extended version of the previously presented prototype circuit was designed considering not only the detection but also the data cells (interleaved data/detection SRAM) in a more advanced technology, and with commercial-size dimensions. Tests were performed in order to verify the correct operation of the data cells and the possible sensitivity difference in the detection cells.
- **Chapter 8 Radiation-Hardened SRAM Layout Generation Tool**: This chapter presents all the steps of the development of a radiation-hardened SRAM layout generation tool. An easy-to-use tool was implemented to facilitate the generation of SRAM layouts with different levels of protection.
- **Chapter 9 Conclusions**: The conclusions and final remarks of the thesis are presented. Finally, the chapter also points out the possible future works and new perspectives that may emerge to improve the presented method.
- **Annex A List of Publications**: The list of publications produced throughout the PhD.

# **2 MEMORY CIRCUITS**

With the reduction of technological nodes and the consequent reduction in the dimensions of transistors, the amount of data that memory circuits must process and store in advanced technologies has increased exponentially. In a semiconductor memory chip, each bit of binary data is stored in a small circuit, known as a memory cell, composed of one to several transistors. The memory cells are commonly laid out in rectangular arrays on the surface of the chip. Depending on the memory size, a specific number of the 1-bit memory cells are grouped (not necessarily physically) in small blocks called words, which are accessed together.

The data stored in the memory cells are accessed through a binary number called memory address, which is applied to the chip's address pins. The address specifies which word in the chip will be accessed. If the memory address consists of M bits, the number of addresses on the chip is $2^M$, each containing an N-bit word. Consequently, the amount of data stored in each chip is $N \cdot 2^M$ bits (DAWOUD; PEPLOW, 2010). With the definition of an address to select a word from the memory circuit, two basic operations can be performed: "read," in which the data contents of a memory word are read out, and "write," in which data is stored in a memory word, replacing any data that was previously stored there.
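As a worked illustration of the capacity relation and of the two basic operations, the following minimal sketch models a word-addressed memory. The class and function names, as well as the example values M = 10 and N = 8, are hypothetical and chosen only for this example.

```python
def capacity_bits(m_address_bits, n_word_bits):
    """Total storage of a memory with 2**M addresses of N bits each."""
    return n_word_bits * (2 ** m_address_bits)


class WordAddressedMemory:
    """Behavioral model: 2**M words, each N bits wide."""

    def __init__(self, m_address_bits, n_word_bits):
        self.n_word_bits = n_word_bits
        self.words = [0] * (2 ** m_address_bits)

    def write(self, address, value):
        # Store the value, replacing whatever the word held before.
        # The mask keeps only the N least significant bits.
        self.words[address] = value & ((1 << self.n_word_bits) - 1)

    def read(self, address):
        # Return the contents of the addressed word.
        return self.words[address]


mem = WordAddressedMemory(m_address_bits=10, n_word_bits=8)
mem.write(3, 0xA5)
mem.write(3, 0x3C)  # overwrites the previous contents of word 3

print(capacity_bits(10, 8))  # 8 * 2**10 = 8192 bits
print(hex(mem.read(3)))
```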

Due to the need for faster and more reliable memories, several types of memories have emerged. The different types of memories are basically divided into two main groups: Volatile and Non-volatile memories. Volatile memory maintains its data while the device is powered, and Non-volatile memory preserves its data even during periods when the power to the chip is turned off. Figure 2.1 presents the classification of the different types of memories according to the two main groups. Among the different types of memories, this work focuses on the robustness of SRAMs; their main characteristics will be presented below.

#### 2.1 Static Random-Access Memory (SRAM)

SRAM, as defined by the acronym, is a type of random-access memory that uses latching circuitry to store each bit. SRAM is a volatile memory in which the data is lost when power is removed. The term static differentiates SRAM from Dynamic Random Access Memory (DRAM), as SRAM will hold its data permanently in the presence of power. In contrast, data in DRAM decays in seconds and thus must be periodically refreshed. SRAM is faster than DRAM, but it is more expensive regarding silicon area and cost; it is typically used for the cache and internal registers of a Central Processing Unit (CPU), while DRAM is used for a computer's main memory. SRAMs play a vital role in microprocessor chips and various applications. The fraction of the total chip area devoted to SRAM arrays is large for state-of-the-art designs. As the device is scaled down, process variation effects and radiation-induced upsets become crucial factors in SRAM design (JOSHI; KIM; KANJ, 2011).

Figure 2.1: Semiconductor memories classification.

Source: Weste and Harris (2015).

The design of an SRAM can be divided into two parts: the core and the peripherals. The memory core comprises an array of bi-stable memory bit cells. These arrays are typically large, containing from thousands to millions of bit cells. Even slight enhancements in bit cells' reliability, performance, and power consumption can significantly influence the entire processor or System-on-a-Chip (SoC) product. In high-performance processors, operating speed and bit cell area are the prime concerns for achieving high-density caches while maintaining adequate reliability. In contrast, energy-constrained applications like sensor nodes or medical implants prioritize energy efficiency and reliability over other concerns (SINGH; MOHANTY; PRADHAN, 2012).

The peripheral circuits typically comprise address (row and column) decoders, sense amplifiers, write drivers, and bit line pre-charge circuits. Peripheral circuits are mainly responsible for communicating the data stored in the core with the processor; they enable reading from and writing into the array. A classic SRAM architecture is shown in Figure 2.2. The memory array consists of $2^N$ words of $2^M$ bits each.



Figure 2.2: Classic SRAM memory architecture.

Source: Adapted from Kang and Leblebici (2003).

#### 2.1.1 6T-SRAM Bit Cell

The 6T-SRAM bit cell is a common type of SRAM bit cell used in integrated circuits for storing digital data. Figure 2.3 presents the electric schematic of the 6T-SRAM bit cell. A standard 6T-SRAM bit cell comprises two identical CMOS inverters connected in a positive feedback loop. This structure forms a latch to create a bi-stable circuit allowing storing one bit of data, either '1' or '0', in the internal nodes (Q and QB) (SINGH; MOHANTY; PRADHAN, 2012). The cross-coupled inverter pair consists of two pull-up P-Channel MOSFET (PMOS) devices (M3 and M4) and two pull-down N-Channel MOSFET (NMOS) devices (M1 and M2). Another two NMOS access transistors (M5 and M6) are controlled by the Word Line (WL), and they serve as switches between the inverter pair and the complementary pair of bit lines, Bit Line (BL) and Bit Line Bar (BLB), used to read from or write to the bit cell (SINGH; MOHANTY; PRADHAN, 2012). The data in the SRAM bit cell is stored as long as the power is maintained in the bit cell. The basic operations of the 6T-SRAM bit cell as a storage device are reading or writing new data to the bit cell and are presented next.

Figure 2.3: 6T-SRAM Bit Cell electric schematic.



Source: From the author.

The sizing of the 6T-SRAM bit cell is crucial for stable read/write operations. Two ratios are defined in the literature to characterize the relationship between the pull-down, pull-up, and access transistors of the 6T-SRAM bit cell. The Cell Ratio (CR) is the W/L ratio of the pull-down transistor to that of the access transistor, and the Pull-Up Ratio (PR) is the W/L ratio of the pull-up transistor to that of the access transistor (RABAEY; CHANDRAKASAN; NIKOLIC, 2002; SINGH; MOHANTY; PRADHAN, 2012). Typically, a CR higher than 1.2 is required to avoid read upset in a conventional 6T-SRAM bit cell. The write-ability of the SRAM bit cell is determined by the PR. Generally, a PR lower than 1.8 is required to maintain good write-ability (RABAEY; CHANDRAKASAN; NIKOLIC, 2002; SINGH; MOHANTY; PRADHAN, 2012).
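The two ratios and the typical limits quoted above can be checked with a short script. The transistor dimensions used here are hypothetical values chosen only to exercise the calculation and do not correspond to the cells designed in this work; the function names are likewise illustrative.

```python
CR_MIN = 1.2  # typical lower bound on the Cell Ratio (read stability)
PR_MAX = 1.8  # typical upper bound on the Pull-Up Ratio (write-ability)


def wl_ratio(width, length):
    """W/L aspect ratio of a transistor (same unit for both arguments)."""
    return width / length


def check_sizing(pull_down, pull_up, access):
    """Compute CR and PR from (W, L) tuples and compare them to the limits."""
    cr = wl_ratio(*pull_down) / wl_ratio(*access)
    pr = wl_ratio(*pull_up) / wl_ratio(*access)
    return {
        "CR": cr,
        "PR": pr,
        "read_stable": cr > CR_MIN,
        "writable": pr < PR_MAX,
    }


# Hypothetical (W, L) dimensions in nanometers.
result = check_sizing(pull_down=(150, 30), pull_up=(80, 30), access=(100, 30))
print(result)
```

With these example values the pull-down transistor is 1.5 times wider than the access transistor and the pull-up transistor is narrower than it, so both conditions are met.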

#### 2.1.2 SRAM Read Operation

Consider a scenario where a '1' is stored at the Q node. Both bit lines are precharged to Supply Voltage (VDD) before initiating the read operation. Activating the word line starts the read cycle, allowing pass transistors M5 and M6 to engage after the initial word line delay. In a correct read operation, the values stored in Q and QB are transferred to the bit lines, leaving BL at its pre-charge value and discharging BLB through M1-M5. Careful transistor sizing is crucial to prevent unintentionally writing a '1' into the cell, a malfunction commonly referred to as a read upset (RABAEY; CHANDRAKASAN; NIKOLIC, 2002).

A simplified model of the 6T-SRAM bit cell during the read operation is depicted in Figure 2.4. Focusing on the BLB side of the cell, the bit line capacitance for larger memories falls in the pF range. Thus, when the read operation is initiated (WL $\rightarrow$ 1), BLB initially remains at its pre-charged value, VDD. The series combination of the two NMOS transistors (M1 and M5) then pulls BLB down towards Ground (GND). For a small-sized cell, the goal is to size these transistors as close to the minimum as possible, resulting in a gradual discharge of the large bit line capacitance (RABAEY; CHANDRAKASAN; NIKOLIC, 2002). As the difference between BL and BLB increases, the sense amplifier circuit is triggered to accelerate the reading process.

At the WL rising edge, the QB node moves upward toward the pre-charge value of BLB. It is crucial to keep this voltage rise of QB low enough to avoid a significant current flow through the M2-M4 inverter, which, in the worst case, could lead to a bit-flip. Maintaining the resistance of transistor M5 larger than that of M1 is necessary to prevent such occurrences (RABAEY; CHANDRAKASAN; NIKOLIC, 2002).
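The resistance condition above can be made explicit with a first-order estimate. As a simplifying assumption introduced here (it is not part of the cited analysis), the conducting transistors M1 and M5 are modeled as linear resistances $R_{M1}$ and $R_{M5}$, and BLB is taken to be still at VDD right after the WL rising edge. The QB node then sits on a resistive divider between BLB and GND:

$$V_{QB} \approx V_{DD} \, \frac{R_{M1}}{R_{M1} + R_{M5}}$$

Under this assumption, $R_{M5} > R_{M1}$ keeps $V_{QB}$ below $V_{DD}/2$, and a larger $R_{M5}/R_{M1}$ ratio lowers it further. The actual limit depends on the switching threshold of the M2-M4 inverter and on the nonlinear characteristics of the devices, so this expression should be read only as a qualitative guide.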

Figure 2.4: An example of a simplified model of SRAM cell during the read operation.



Source: Rabaey, Chandrakasan and Nikolic (2002).

#### 2.1.3 SRAM Write Operation

Assuming a stored '1' in the bit cell (Q = 1), writing a '0' is achieved by setting BLB to '1' and BL to '0'. Upon initiating a write operation, the 6T-SRAM bit cell schematic can be simplified to the model shown in Figure 2.5. It is reasonable to assume that the gates of transistors M1 and M4 remain at VDD and GND, respectively, until the switching process begins. Although this assumption is violated once the flip-flop starts toggling, the simplified model suffices for hand-analysis purposes.



Figure 2.5: An example of a simplified model of SRAM cell during the write operation.

Source: Rabaey, Chandrakasan and Nikolic (2002).

It is essential to observe that the QB side of the cell cannot be pulled high enough to guarantee the writing of '1'. The sizing constraint, dictated by read stability, ensures that this voltage remains below the threshold. Consequently, the updated value of the cell must be written through transistor M6. A secure writing process is guaranteed if we pull node Q low enough, typically below the threshold value of transistor M1.

#### 2.1.4 Bit Interleaving

Regardless of the technology or architecture adopted, an SRAM can be organized to increase its robustness to radiation effects. Bit interleaving, which is also applied in this work, is an organizational technique widely adopted alongside other techniques to reduce the impact of multi-bit errors on error rates. It refers to a memory layout architecture in which physically adjacent bits belong to different logic words (MAIZ et al., 2003). Bit interleaving is typically used along with EDAC techniques, as it facilitates detection and correction: two adjacent failing bits are treated as separate single-bit errors in different words rather than as a double-bit error in the same logic word. Figure 2.6 gives an overview of the relationship between bit interleaving and an 8-bit error. In this organization, the bits belonging to the same word are placed 8 bits apart. Therefore, even with the Multiple-Bit Upset (MBU) shown in the figure, only 1 bit of each word is impacted. The effectiveness of bit interleaving is often determined by the minimum physical distance between two bits within the same logic word. However, a comprehensive assessment of its impact requires detailed insights into multi-bit failure probabilities and sensitivities to operating parameters, information that is typically not available in the open literature (MAIZ et al., 2003).

Figure 2.6: The relationship between SRAM bit interleaving and MBU.

Source: Radaelli et al. (2005).
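The address mapping behind bit interleaving can be sketched in a few lines of Python. This is only an illustrative model: the interleaving distance of 8 follows the organization described above, while the function names and the modulo-based column mapping are assumptions made for the example and do not describe the layout implemented in this work.

```python
# Illustrative model of bit interleaving (names and mapping are assumptions).
INTERLEAVE = 8  # number of logic words sharing one physical row


def physical_to_logical(column: int) -> tuple[int, int]:
    """Map a physical column index to (word index, bit index within the word)."""
    return column % INTERLEAVE, column // INTERLEAVE


def logical_to_physical(word: int, bit: int) -> int:
    """Inverse mapping: consecutive bits of one word sit INTERLEAVE columns apart."""
    return bit * INTERLEAVE + word


def upset_bits_per_word(first_column: int, width: int) -> dict[int, int]:
    """Count how many bits of each word are hit by `width` adjacent upset columns."""
    hits: dict[int, int] = {}
    for column in range(first_column, first_column + width):
        word, _ = physical_to_logical(column)
        hits[word] = hits.get(word, 0) + 1
    return hits


if __name__ == "__main__":
    # An upset spanning 8 adjacent cells touches each word at most once.
    print(upset_bits_per_word(first_column=19, width=8))
```

In this model, an upset wider than the interleaving distance (for example, 9 adjacent cells) necessarily hits two bits of the same word, which is why the minimum physical distance between bits of one word bounds the effectiveness of the technique.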

### **2.2 SRAM Peripheral Circuits**

As mentioned before, an SRAM comprises not only the cells responsible for storing data but also peripheral circuits responsible for the correct memory functioning and communication with other circuits. This section details the peripheral circuits that make up an SRAM-type memory.

### 2.2.1 Pre-Charge Circuit

The pre-charge circuit is an essential peripheral of an SRAM. It is responsible for ensuring that the data stored in the memory cells is not corrupted during read and write operations. The pre-charge circuit ensures that the bit lines (BL and BLB) are in a pre-charged state before a read or write operation is performed. Each pair of bit lines in the SRAM array is connected to a pre-charge circuit. The function of this circuit is to pull up the bit lines of a selected column to a specific voltage level, VDD in this work, and perfectly equalize them before an operation (SINGH; MOHANTY; PRADHAN, 2012).

Different pre-charge circuit architectures are used in SRAM design. One of the most used configurations, also used in this work, is shown in Figure 2.7. It comprises three PMOS transistors and a Pre-Charge Enable Signal (PC). When PC is active (low), all three transistors are ON and the bit lines are connected to VDD. In this configuration, transistors M1 and M2 pull the bit lines up to VDD, while transistor M3 equalizes both bit lines. PMOS transistors are commonly used in pre-charge circuits because they pass VDD better than NMOS transistors, which lose a threshold voltage when passing a high level.

Figure 2.7: Pre-charge circuit electric schematic.

Source: From the author.

### 2.2.2 Sense Amplifier

Sense Amplifiers are among the most important peripheral circuits in SRAMs and have become a separate class of circuits in the literature (SINGH; MOHANTY; PRADHAN, 2012). During a read operation, a Sense Amplifier amplifies a small voltage difference between the two bit lines and translates it to a full-swing digital output signal. In an SRAM, the Sense Amplifier is responsible for detecting the value of the data stored in the selected memory cell and outputting the value to the CPU.

Designing fast, low-power, and robust Sense Amplifier circuits is challenging because modern memory design involves bit lines with a significantly large capacitance. Modern SRAMs embed a large number of bit cells per bit line to enhance array density, but this also makes the circuits more sensitive to process variations, environmental conditions, and device mismatch. Consequently, these challenges limit the sensing speed and robustness of the circuits and introduce extra signal delay (SINGH; MOHANTY; PRADHAN, 2012).

The sense amplifier design depends on the timing requirements and layout constraints of the memory system. In this work, the commonly used latch-type Sense Amplifier was chosen for the SRAM design. The electric schematic of the 7T Latch-Type Sense Amplifier is presented in Figure 2.8. This amplifier comprises two cross-coupled inverters and three additional transistors (Mx, My, and Mz), which isolate it from the bit lines and prevent the discharge of the bit line on the '0' storage node. To initiate the sense operation in this type of sense amplifier, the inputs (both bit lines) are pre-charged and equalized to bias the amplifier in the high-gain metastable state. Once a differential voltage that exceeds the sensitivity of the sense amplifier has developed on the bit lines and the Output Enable (OE) signal is enabled, the bit line isolation pass transistors are turned off. The feedback mechanism of this amplifier immediately picks up the differential voltage and drives the outputs to the full-swing differential voltage.

Figure 2.8: 7T Latch-Type Sense Amplifier electric schematic.



Source: From the author.

### 2.2.3 Write Driver

The write driver is responsible for driving the data to be written onto the selected memory cell during a write operation. Before every operation, the bit line pair is precharged to VDD. During the write operation, the write driver only needs to pull down one of the two bit lines below the write margin of the 6T-SRAM bit cell based on the input data (SINGH; MOHANTY; PRADHAN, 2012).

Different types of write drivers are commonly used in SRAM arrays. A typical write driver circuit, presented in Figure 2.9, is used in this work. Data is written using two pairs of stacked NMOS transistors, M1/M3 and M2/M4, which together form two pass-transistor AND Logic Gates (AND). The write driver is activated by the Write Enable (WE) signal, which turns on transistors M3 and M4. Depending on the input data received through Inverter (INV) buffers INV-1 and INV-2, when WE is enabled, either transistor M1 or M2 discharges one of the bit lines from the pre-charged level to ground.

Figure 2.9: Write Driver circuit electric schematic.

Source: From the author.
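The behavior of the write driver can be summarized with a small logic-level model. This is a minimal sketch rather than a description of the designed circuit: the function name is hypothetical, and the assumption that writing a '0' discharges BL while writing a '1' discharges BLB is taken from the write operation described in Section 2.1, not from the transistor-level schematic.

```python
def write_driver(we: bool, data: int, bl: int = 1, blb: int = 1) -> tuple[int, int]:
    """Return the (BL, BLB) logic levels after the driver acts on pre-charged bit lines.

    Assumption: data = 0 discharges BL and data = 1 discharges BLB.
    """
    if not we:
        # Driver disabled: both bit lines keep their pre-charged level.
        return bl, blb
    pull_bl_low = data == 0   # pass-transistor AND of WE and inverted data
    pull_blb_low = data == 1  # pass-transistor AND of WE and data
    return (0 if pull_bl_low else bl, 0 if pull_blb_low else blb)


if __name__ == "__main__":
    for we in (False, True):
        for data in (0, 1):
            print(f"WE={int(we)} data={data} -> (BL, BLB)={write_driver(we, data)}")
```

Only one bit line is ever discharged in this model, which matches the requirement that the driver pull a single bit line below the write margin of the bit cell.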

### 2.2.4 Row Decoder

The row decoder selects one row of the memory array (word line) to be accessed during read or write operations. The row decoder architecture consists of a set of digital logic circuits that interpret the address signals and generate the necessary control signals to activate the selected row of memory cells. A combination of AND/Not AND (NAND) and OR Logic Gate (OR)/Not OR (NOR) gates is typically used to decode the address signals. The decoder can be implemented in two styles: static and dynamic. The choice of design styles depends on the SRAM area, performance, power consumption, and architectural considerations (SINGH; MOHANTY; PRADHAN, 2012).

The pre-decoding technique is typically used in the design of row decoders with many outputs, as in this work. Pre-decoding reduces the capacitance driven per address line, saves significant area by "sharing" gates, and makes it easy to "pitch fit" a 2-input AND gate at the side of the memory core.
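The idea of pre-decoding can be illustrated with a behavioral sketch. The 4-bit address, the split into two 2-bit groups, and the function names are assumptions chosen for the example; the decoder designed in this work may use a different address width and grouping.

```python
def predecode(group_value: int, group_bits: int) -> list[int]:
    """One-hot decode of a small address group (e.g. 2 bits -> 4 lines)."""
    return [1 if i == group_value else 0 for i in range(1 << group_bits)]


def row_decode(address: int, address_bits: int = 4, group_bits: int = 2) -> list[int]:
    """Word-line vector built from two pre-decoded groups and one 2-input AND per row."""
    assert address_bits == 2 * group_bits, "this sketch handles exactly two groups"
    assert 0 <= address < (1 << address_bits)
    mask = (1 << group_bits) - 1
    low = predecode(address & mask, group_bits)           # shared by all rows
    high = predecode(address >> group_bits, group_bits)   # shared by all rows
    # Each word line needs only a 2-input AND of one line from each group.
    return [high[row >> group_bits] & low[row & mask] for row in range(1 << address_bits)]


if __name__ == "__main__":
    print(row_decode(11))
```

With this arrangement, 16 word lines are driven by 2-input AND gates fed from 8 shared pre-decoded lines, instead of 16 separate 4-input gates.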

### 2.2.5 Column Decoder

The column decoder selects the specific memory cell within the activated row to be accessed during a read or write operation. The column decoder receives the address signals from the CPU, decodes the signals to determine the column address, and activates the appropriate memory cell within the selected row.

Similar to the row decoder, the column decoder is also designed with a set of digital logic circuits that interpret the address signals and generate the necessary control signals to activate the selected memory cell. The column decoder typically uses a combination of Multiplexer (MUX) and Pass Transistor Logic (PTL) to decode the address signals.

### **3 RADIATION EFFECTS ON ELECTRONIC CIRCUITS**

Anomalies induced by radiation effects on electronic circuits have been known since the beginning of space exploration. Research on the radiation effects on electronic circuits was initially considered a concern of utmost relevance only in projects developed for military or space applications. The first US artificial satellite, Explorer I, designed and built by the Jet Propulsion Laboratory and launched on January 31, 1958, carried a Geiger counter proposed by J.A. Van Allen. When the spacecraft reached a certain altitude, the counter suddenly stopped counting cosmic rays. This behavior led to the discovery of the Van Allen belts, because the counter was in fact saturated by an extremely high particle count rate. The evidence of the existence of trapped particles in Earth's radiation belts can be considered, in this respect, as the very first scientific output of the Space Age (ECOFFET, 2007).

In 1962, the USA carried out a high-altitude nuclear test. That same year, the Telstar telecommunications satellite, designed and built by the Bell Telephone Laboratories with AT&T funds and supported by the National Aeronautics and Space Administration (NASA), was launched. The extremely high radiation levels induced by electrons injected into the radiation belts degraded some electronic components (diodes in the command decoder) and finally caused the loss of the satellite in 1963. This was the first spacecraft loss due to radiation effects (ECOFFET, 2007). From this moment on, the effects of radiation (whether natural or man-made) on electronic circuits have been studied by the scientific community, space agencies and military agencies.

A new class of effects emerged in 1978, when Intel Corporation discovered that anomalous upsets occurred in DRAMs at ground level (MAY; WOODS, 1978). It was determined that the faults were caused by alpha particles emitted by the decay of radioactive uranium and thorium, which contaminated the encapsulation material in the memory chip manufacturing process. This study, published at the International Reliability Physics Symposium (IRPS), was the first work to define the anomalies as "soft errors" (MAY; WOODS, 1978). This term was used to differentiate them from permanent faults and to characterize the random effects caused by radiation on memory elements.

Guenzer, Wolicki and Allas (1979) reported that soft errors could also come from nuclear reactions involving protons and high-energy neutrons. At that moment, the term "Single Event Effects" was introduced, characterizing the effects that are triggered by only one particle. It was established that ions, protons and neutrons could also produce single event effects, and these soon became one of the significant causes of component dysfunction in space (ECOFFET, 2007).

Most of the research in the 1980s was directed mainly to sequential circuits, such as DRAMs and SRAMs. This was due to the requirement to understand radiation effects and their mitigation to reliably provide data storage (DODD; MASSENGILL, 2003). However, studies focusing on combinational logic circuits began to emerge at the end of this decade in response to the Best Paper of the IRPS "Dynamic fault imaging of VLSI random logic devices" by May et al. (1984).

The Earth is protected by the atmosphere, which acts as a semi-permeable "screen", letting light and heat through while stopping radiation and Ultraviolet (UV) rays (BOUDENOT, 2007). The intensity of the radiation generally increases with altitude relative to ground level. However, due to phenomena related to the Earth's magnetic field (the polar regions are an example), some regions experience a higher radiation intensity even though they are located at low altitudes.

In space and the Earth's atmosphere, there is a diverse range of radiation, which is classified into two broad groups: ionizing and non-ionizing radiation. Cosmic rays, x-rays and radiation from radioactive materials are examples of ionizing radiation; that is, they cause the emission of electrons when they interact with a material. Examples of non-ionizing radiation are ultraviolet light, radio waves and microwaves, as they are not capable of ionizing any material. The main particles that may cause unwanted effects in electronic circuits are electrons, protons, neutrons, muons, alpha particles and heavy ions, as well as electromagnetic radiation, such as x-rays and gamma rays (STASSINOPOULOS; RAYMOND, 1988). At sea level, muons are the most numerous terrestrial species (SIERAWSKI et al., 2010).

Space radiation consists of subatomic particles (e.g., protons, electrons, neutrons), which may originate from heavy ions present in the space environment or alpha particles emitted from radioactive isotopes. These particles travel in space at very high speeds (near the speed of light), which allows them to easily traverse a material and cause various effects on it. The main components of radioactive phenomena encountered in space can be classified into four categories by origin: Radiation belts, solar flares, solar wind and cosmic rays (BOUDENOT, 2007). These phenomena are discussed in detail in the next subsections.

## **3.1 Radiation Belts**

Radiation belts are formed in the terrestrial magnetosphere and contain trapped electrons and protons. A layer of charged energetic particles, which are trapped by the influence of a magnetic field, forms a radiation belt. Earth has two of these belts, known as the Van Allen Belts, as shown in Figure 3.1. Most of the particles that form the belts originate from solar flares, solar winds, and also cosmic rays (ALLEN; FRANK, 1959). The inner belt contains electrons whose energy is less than 5 MeV. The outer belt contains electrons whose energy may reach 7 MeV; furthermore, the electron flux in the outer belt is both more variable and more intense than that of the inner belt. Like electrons and protons, heavy ions may also be trapped in the magnetosphere (BOUDENOT, 2007).



Figure 3.1: Van Allen radiation belt.

Source: Hamer (2017).

In space missions, the Van Allen belts have always been a major concern because of their ability to interfere with the smooth operation of systems and possibly to permanently damage satellite electronics (WALT, 2005). Typical proton energies can reach several hundred MeV and are known to cause effects like Total Ionizing Dose (TID), SEEs and Displacement Damage (DD). Electrons, however, reach energies of a few MeV, contributing to effects such as TID, DD and charging and discharging (CUMMINGS, 2010).

## 3.2 Sun

The Sun is a gaseous sphere composed primarily of hydrogen and helium, in addition to a small amount of heavier elements such as iron, silicon, neon, oxygen, nitrogen, and carbon (LIOU, 2002). Almost all the energy that the planet receives comes from the Sun, making its existence essential for the maintenance of life on Earth. The Sun's energy comes from its interior, where, due to the high temperatures, fusion reactions turn four hydrogen nuclei into one helium nucleus, releasing energy. The solar atmosphere is known as the solar corona and is visible as a weak white halo during total solar eclipses. Through a cross-section, Figure 3.2 illustrates the interior of the Sun, where the reactions responsible for the release of radiation particles into its atmosphere and the universe occur.



Figure 3.2: Cross-section of the Sun interior.

Source: NASA (2008).

One of the most important events of solar activity is the **Solar Wind**, which occurs due to the phenomenon of coronal mass ejection, shown in Figure 3.3. The high temperature of the solar corona (about two million K) provides sufficient energy for electrons to escape the gravitational pull of the Sun. The ejection of electrons causes a charge imbalance, resulting in the ejection of protons and heavier ions from the corona. The particles are homogenized into a dilute plasma due to the high temperature of the ejected gas. The energy density of the plasma exceeds that of its magnetic field, so the solar magnetic field is "frozen" into the plasma (BOUDENOT, 2007).

Figure 3.3: Solar Wind.

Source: NOAA (2015).

Changes in the solar wind density (e.g., solar flares), the solar wind velocity (e.g., coronal mass ejections), and the orientation of the embedded solar magnetic field can cause significant perturbations in the geomagnetic field. Coronal mass ejections and solar flares disturb the solar wind, and it is the interaction between these disturbances and the Earth's magnetosphere that causes the perturbations called magnetic storms and sub-storms (BOUDENOT, 2007).

Solar activity is cyclical, with a period of around 11 years comprising, on average, seven years of high activity and four years of low activity (ASSIS, 2009). During high solar activity, the surface of the Sun is violently disturbed, causing explosions of particles and radiation. These explosions, known as **Solar Flares**, emit heavy ions (tens of MeV to hundreds of GeV) in addition to alpha particles and electrons. Figure 3.4 contains a representation of a Solar Flare on the surface of the Sun.

Figure 3.4: Solar Flare.



Source: NASA (2012).

## **3.3 Cosmic Rays**

Galactic cosmic rays consist of high energy particles with a very diverse energy spectrum. The origin of this radiation has not been fully identified; it is known that the most energetic ions come from outside the Milky Way Galaxy and the rest from within it. It is believed that they are produced and accelerated by solar flares, supernovae and galactic nucleus explosions (ZIEGLER, 1996). The cosmic ray flux is inversely correlated with solar activity: in periods of low solar activity, the cosmic ray flux that reaches the Earth is greater than in periods of high solar activity (MCDONALD, 1998).

Traveling through space at high velocities and with an enormous amount of energy, cosmic rays collide with the atoms present in the atmosphere when they enter it, provoking cascades of nuclear reactions that send particles towards the Earth's surface, as shown in Figure 3.5. Cosmic rays of galactic origin are considered primary particles. The secondary particles, coming from the cascade effect, are formed by protons, neutrons, pions and muons (BALEN, 2010). However, of the total particles generated in this cascade effect, only 5% of protons and 1% of electrons and neutrons reach the surface of the Earth at ground level. This is due to attenuation processes and the short lifetime of these particles (SIMIONOVSKI, 2012). Although the neutron has no electric charge, it has a higher charge generation capability than the proton and the electron. The neutron does not directly ionize the silicon but interacts with it, causing a nuclear reaction that releases alpha, beta, and proton particles.



Figure 3.5: Nuclear cascade reactions of particles towards the Earth's surface.

Source: Mészáros, Razzaque and Wang (2015).

## 3.4 South Atlantic Anomaly (SAA)

After presenting the main components related to the origin of the radiation effects, it is important to highlight an anomaly present in a specific region of the Earth. The tilt of the Earth's axis of rotation relative to the axis of the magnetic field influences the distribution of the flux of particles present in the inner Van Allen belt, creating a kind of depression region (BALASUBRAMANIAN, 2008), shown in Figure 3.6. In this region, the radiation trapped by the Earth's magnetic field in the belts reaches lower altitudes, even penetrating the atmospheric layers. This produces undesirable effects in the electronic equipment of spacecraft and satellites that fly over southern Brazil and the Atlantic Ocean (BALEN, 2010). This region is known as the South Atlantic Anomaly (SAA).

Figure 3.6: Deformation in the inner Van Allen belt of the Earth due to SAA.

Figure 3.7 shows recent satellite data from the European Space Agency (ESA), revealing that the SAA continues to evolve, with the most recent observations showing we could soon be dealing with more than one of these strange phenomena. ESA says that in the last two centuries, Earth's magnetic field has lost about 9 percent of its strength on average. The minimum field strength in the SAA dropped from approximately 24,000 nT to 22,000 nT over the past 50 years. New readings provided by ESA's Swarm satellites show that within the past five years, a second center of minimum intensity has begun to open up within the anomaly beside Africa.



Figure 3.7: Current scenario of the South Atlantic Anomaly.

Source: ESA (2020a).

## **3.5 Fault Tolerance Basic Concepts and Terms**

Before presenting the impact of radiation effects on electronic circuits, it is essential to introduce and define key terms and concepts utilized in the fault tolerance area to describe these effects. Within the fault tolerance field, fundamental terms such as fault, error, and failure often present conflicting interpretations. This work adopts concepts and terms widely accepted by the community (LAPRIE, 1985; ANDERSON; LEE, 1981), with specific emphasis on notable works by Pradhan et al. (1996) and Avizienis (1982).

The terms fault, error, and failure are best elucidated through the Three-Universe model proposed by Pradhan et al. (1996), shown in Figure 3.8. This model, an adaptation of the Four-Universe model introduced by Avizienis (1982), describes the different phases of evolution from fault to failure. The first universe is the physical universe, where faults manifest.

Figure 3.8: Three-Universe model proposed by Pradhan et al. (1996).



Source: Adapted from Pradhan et al. (1996).

A **fault** refers to an undesired physical condition or flaw that arises within specific hardware components. Faults may remain dormant for a long time without influencing component performance. When triggered, a fault's effects manifest in the information universe. An **error** is the manifestation of a fault, that is, a change in the state of the system presenting inconsistency in the data generated by the functionality affected by the fault. **Failure** denotes a deviation from the circuit specification, resulting in the component's incapability to fulfill its predefined function. Failures cannot be tolerated, only avoided. In the event of transient faults caused by radiation effects, specific mechanisms mask the impact of a fault, preventing its propagation to subsequent levels and keeping incorrect values from reaching the output of the circuit. Fault masking can be categorized into three primary types: electric, logical, and latching window.

In **electric masking**, the fault does not propagate to the output of the circuit because electrical losses attenuate its magnitude. Figure 3.9 illustrates the degradation of the pulse, showing its possible attenuation, which characterizes electric masking.

Figure 3.9: Degradation of a pulse by electric masking. Depending on the (a) Generated pulse's width, it can be (b) Attenuated or (c) Filtered when propagating through the circuit.



Source: Entrena et al. (2009).

**Logical masking** occurs when the fault affects a region of the circuit that does not determine the output value at the time the fault occurs. Figure 3.10 presents two cases of logical masking in a circuit. The first involves a NAND2 logic gate with one of its inputs set to '0'. Therefore, regardless of the value assigned to the other input, its output will always be '1'. The second can be observed in the OR2 gate, in which a transient fault impacts one of its inputs while the other input is equal to '1'. Given that the gate's output has already been determined by one of its inputs, the transient fault on the other input does not affect the result, and the fault is logically masked.
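The two cases can be checked with a short logic-level sketch. The helper names are hypothetical and the model ignores timing and electrical effects; it only tests whether inverting the struck input changes the gate output.

```python
def nand2(a: int, b: int) -> int:
    """2-input NAND on logic values 0/1."""
    return 1 - (a & b)


def or2(a: int, b: int) -> int:
    """2-input OR on logic values 0/1."""
    return a | b


def is_logically_masked(gate, struck_input: int, other_input: int) -> bool:
    """True if inverting the struck input leaves the gate output unchanged."""
    fault_free = gate(struck_input, other_input)
    faulty = gate(1 - struck_input, other_input)
    return fault_free == faulty


if __name__ == "__main__":
    # NAND2 with the other input at '0': the output stays '1', so the fault is masked.
    print(is_logically_masked(nand2, struck_input=1, other_input=0))
    # OR2 with the other input at '1': the output stays '1', so the fault is masked.
    print(is_logically_masked(or2, struck_input=0, other_input=1))
```

When the other input holds the non-controlling value instead ('1' for NAND2, '0' for OR2), the same check returns `False`, meaning the transient propagates through the gate.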

Figure 3.10: Logical masking example in combinational circuit.

Source: Adapted from Zimpeck, Meinhardt and Butzen (2014).

Masking by **latching window** occurs when a transient pulse, not masked logically or electrically, propagates through the circuit towards a memory element but no clock transition occurs while the pulse is present, i.e., the pulse reaches the data lines outside the latching window, as seen in Figure 3.11. Consequently, this pulse is not stored in memory and produces no error.

Figure 3.11: Latching Window masking.



Source: Adapted from Zimpeck, Meinhardt and Butzen (2014).
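Latching window masking can be expressed as a simple interval test. The sketch below is an assumption-laden simplification: it takes the latching window to be the interval from the setup time before the clock edge to the hold time after it, and it flags any overlap between the pulse and that window as a possible capture. The function and parameter names are illustrative only.

```python
def may_be_latched(pulse_start: float, pulse_width: float,
                   clock_edge: float, t_setup: float, t_hold: float) -> bool:
    """Conservative check: True if the transient pulse overlaps the latching window.

    Assumption: the window is [clock_edge - t_setup, clock_edge + t_hold].
    All times share the same (arbitrary) unit.
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    pulse_end = pulse_start + pulse_width
    return pulse_start < window_end and pulse_end > window_start


if __name__ == "__main__":
    # Pulse far from the clock edge: masked by the latching window.
    print(may_be_latched(1.0, 0.2, clock_edge=5.0, t_setup=0.1, t_hold=0.1))
    # Pulse overlapping the window around the clock edge: may be stored.
    print(may_be_latched(4.85, 0.2, clock_edge=5.0, t_setup=0.1, t_hold=0.1))
```

Under this criterion, wider pulses and higher clock frequencies both increase the chance that a pulse overlaps a latching window.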

## 3.6 Characterization of the radiation effects on electronic devices

The effects of radiation affecting the operation of electronic circuits can be classified into three broad groups:

1. **Total Ionizing Dose (TID):** cumulative effects that occur due to the exposure of integrated circuits to radiation over time. They are produced after an ionizing particle reaches the surface of a device and are not undone over time, i.e., they are long-term effects whose intensity depends on the intensity of the radiation and on how long the circuit was exposed to it (VELAZCO; FOUILLAT; REIS, 2007).
2. **Displacement Damage (DD):** physical damage to the crystalline structure of the material (silicon in the case of the semiconductors of interest in this work) caused by the non-ionizing energy loss (NIEL) of the incident particles, which degrades the material and its properties.
3. **Single Event Effects (SEEs):** effects that occur due to the bombardment of energetic particles (electrons, protons, alpha particles and heavy ions) that reach the silicon, ionizing it densely and releasing energy that can damage the circuits permanently or induce transient behavior, affecting the proper functioning of the device. SEEs can be classified as destructive and non-destructive (DODD et al., 2004; CUMMINGS, 2010; AZAMBUJA; KASTENSMIDT; BECKER, 2014):
   - (a) Destructive: effects that permanently damage the circuit. The four main effects are: Single Event Latch-up (SEL), in which the incidence of the particle causes an abnormal increase in the operating current and may cause permanent damage to the device; Single Event Gate Rupture (SEGR), in which the gate oxide is damaged, forming a conductive path; Single Event Burnout (SEB), in which the particle reaches the source region of the transistor, creating a current between the source and the drain that can generate a destructive fault, with the device literally burning out; and Single Hard Error (SHE), in which the deposition of a large amount of energy can damage the ability of transistors to change state. Destructive SEE mechanisms are reviewed and discussed in Sexton (2003).
   - (b) Non-destructive: also commonly known as Soft Errors. They are classified into two types depending on the nature of the element struck: Single Event Upset (SEU), when the element hit is a sequential element, for example, a flip-flop, modifying the state of a stored bit (bit flip); and Single Event Transient (SET), when the particle strikes a combinational element, for example, a multiplexer, generating a transient pulse that may or may not be captured by a memory element.

Figure 3.12 presents the classification of the major SEEs in the literature. The focus of this work is the SEU and MCU effects on SRAMs. The next section presents more details of SEEs, highlighting the SEU.





Source: Adapted from Siegle et al. (2015).

## 3.7 Single Event Effects (SEE)

Single Event Effects occur due to the interaction of large ionizing particles (protons, neutrons, alpha particles and heavy ions) that pass through the insulation layers, the semiconductor layers, or even the entire Metal Oxide Semiconductor (MOS) device (DODD et al., 2004). Figure 3.13 shows the SEE through the impact of an ionizing particle on the device structure. When the particle enters the silicon material, a transient path composed of ionized elements (electron-hole pairs, $e^-/h^+$) is generated. This path has a radial distribution around the track of the incident particle. It may hold sufficient mobile charge to drive a current pulse under the external electric field created by the biasing of the transistor.

SEEs indicate any measurable or observable change in a state or performance of a microelectronic device, component, subsystem or system (digital or analog) as a result of the incidence of a single energetic particle. According to the intensity and the region in which this current flows, it is capable of causing faults that may be permanent in the device structure, called destructive events (hard errors), or non-destructive events (soft errors), represented by the SET and the SEU (MUNTEANU; AUTRAN, 2008).

The main difference between the two non-destructive events is the location of the particle strike. If the current pulse occurs within a sequential circuit, such as latches or flip-flops, the stored original value can be inverted, producing an SEU or bit-flip (BRAMNIK; SHERBAN; SEIFERT, 2013). Similarly, the SET also generates a pulse, but it originates from the impact of particles within a combinational circuit. If the pulse generated in a combinational circuit propagates to a sequential circuit, an SET can become an SEU.

Figure 3.13: Single Event Effects - an ionizing particle passing through a sensitive volume (SV) in an active (semiconductor) device.

Source: TNA (2018).

SEUs, unlike SETs, have a non-transient character. They are associated with the bit inversion of memory elements. An SEU may persist indefinitely or be corrected after one or more clock cycles.

### **3.7.1 Destructive Events**

Unlike SETs, destructive events permanently damage the device. As mentioned earlier, these effects are outside the scope of this work, but they are briefly described here. Destructive events are classified into several types, as shown in Figure 3.12. The SEB and SEGR effects are described below:

(a) Single Event Burnout (SEB): occurs when the passage of a high-energy ion through the device generates a dense plasma of $e^-/h^+$ pairs which, under the influence of the drain terminal bias, produces a high-density current. If it is not quickly drained, this current can generate a destructive fault in the device, causing its "burnout" (SEXTON, 2003).

(b) Single Event Gate Rupture (SEGR): due to the reduction of transistor dimensions in recent technologies, the thickness of the gate oxide has also been significantly reduced. This reduction increases the electric field in the oxide, since the field is inversely proportional to the thickness of the dielectric. Thus, perturbations in the electric field across the oxide can cause it to exceed the dielectric strength of the oxide, causing its rupture.
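As a first-order illustration of the SEGR argument, the oxide field can be approximated by a parallel-plate expression. The symbols $E_{ox}$ (electric field in the oxide), $V_{ox}$ (voltage across the gate oxide) and $t_{ox}$ (oxide thickness) are notation assumed here for the example and are not taken from the cited references:

$$E_{ox} \approx \frac{V_{ox}}{t_{ox}}$$

Under this approximation, halving $t_{ox}$ at a fixed $V_{ox}$ doubles $E_{ox}$, which leaves less margin before the dielectric strength of the oxide is exceeded.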

### 3.7.2 Single Event Upset (SEU)

A SEU is a change of state caused by a single ionizing particle striking a sensitive node in a microelectronic device, such as in a microprocessor, semiconductor memory, or power transistors. The change of state results from the free charge created by ionization in or close to an important node of a logic element (e.g., Q/Qb nodes of a bit cell). The SEU itself is not considered permanently damaging to the transistor's or circuits' functionality, unlike the case of SEL, SEGR, or SEB.

For nearly 50 years, electrical, aerospace, nuclear, and radiation engineers have conducted research on soft errors. From 1954 to 1957, failures in digital electronics were reported during the above-ground nuclear testing (WANG; AGRAWAL, 2008). These failures were considered electronic anomalies in the monitoring equipment, as they occurred randomly and could not be attributed to any hardware faults (ZIEGLER et al., 1996). Further problems were observed in space electronics during the 1960s, although separating soft errors from other forms of interference was difficult. The first paper to address the SEU effects was not a paper on using electronics in the space environment but assessing scaling trends in terrestrial microelectronics (WALLMARK; MARCUS, 1962). The authors predicted the eventual occurrence of SEU in microelectronics devices due to cosmic rays from Earth's atmosphere and linked the occurrence of particle strikes with the size of the device's active area (DODD; MASSENGILL, 2003).

The first confirmed report of cosmic-ray-induced upsets in space was presented in 1975 (BINDER; SMITH; HOLMAN, 1975). The paper documented four upsets over 17 years of satellite operation in bipolar J-K flip-flops in a communication satellite. Due to the limited number of errors observed, it took several years for the significance of SEU to be fully acknowledged. Only in 1978-1979 was a considerable number of research papers on SEU published (DODD; MASSENGILL, 2003). The initial research papers linked memory upsets to direct ionization by heavy ions (BINDER; SMITH; HOLMAN, 1975; PICKEL; BLANDFORD, 1978). However, in 1979, two groups published findings on errors induced by indirect ionization effects of protons and neutrons (WYATT et al., 1979; GUENZER; WOLICKI; ALLAS, 1979). This discovery was crucial because protons are much more abundant than heavy ions in the natural space environment. Moreover, it implies that SEEs can result not only from galactic cosmic rays but also from protons emitted during solar events and those confined within the Earth's radiation belts. Proton-induced SEEs often dominate the single-event response of commercial parts operating in low Earth orbits (DODD; MASSENGILL, 2003). The term "Single-Event Upset" was first introduced by Guenzer, Wolicki and Allas (1979) and was promptly embraced by the scientific community as a descriptor for upsets caused by both direct and indirect ionization.

Over time, the study of SEEs has become increasingly important and is now essential for advancing microelectronics. This is evidenced by the growing number of radiation-hardened-by-design circuits and the development of new technologies that are more robust to radiation effects, such as Silicon on Insulator (SOI) technology. An increase in the sensitivity to SEU is expected to continue, both in memories and core logic. Upsets in space and terrestrial electronics are a severe reliability concern for commercial manufacturers. Single-event vulnerability has become a mainstream product reliability metric for all facets of the integrated circuit industry (DODD; MASSENGILL, 2003).

# 3.7.3 SEU Effects in SRAMs

When an energetic particle strikes a sensitive location in an SRAM bit cell, typically the reverse-biased drain junction of a transistor biased in the "off" state (DODD et al., 1996) (such as the "off" n-channel transistor of one of the cross-coupled inverters), the charge collected by the junction generates a transient current in the impacted transistor. As this current passes through the struck transistor, the restoring transistor ("on" p-channel transistor of the same inverter) attempts to compensate by sourcing current. However, the restoring transistor has a limited current drive and channel conductance. Consequently, the current flowing through the restoring transistor results in a voltage drop at its drain. This interaction between the feedback and recovery process is represented in Figure 3.14. The voltage fluctuation, arising from the transient current caused by the single event, is the actual mechanism responsible for potential upsets in SRAM bit cells (DODD; MASSENGILL, 2003). The voltage fluctuation resembles a write pulse and has the potential to incorrectly store a memory state (bit flip) within the memory cell.

Figure 3.14: The SEU response in SRAM bit cells is determined by the interaction between the feedback process and the recovery process.

Source: Adapted from Dodd and Massengill (2003).

Within an SRAM bit cell, there are four potential sensitive strike locations, specifically the drains of the four transistors located internally within the SRAM circuit. Regarding charge collection, a crucial factor is whether the junction is situated within a well or in the substrate (DODD et al., 1996). This is significant because the junction between the well and the substrate forms a potential barrier that hinders the diffusion of charge deposited deep within the substrate back to the affected drain junction. For example, in the familiar outside-the-well "off" strike scenario, the affected drain is situated outside the well, allowing charge deposited deep within the substrate to diffuse back towards the drain junction. This is the most sensitive strike location for most technologies (DODD; MASSENGILL, 2003).

Interestingly, particles that have energy levels significantly below the upset threshold can still possess enough ionizing capability to trigger a momentary voltage alteration ("flip") at the impacted node of an SRAM (DODD; MASSENGILL, 2003). For instance, Figure 3.15 illustrates the drain voltage transients in an SRAM resulting from a particle strike with Linear Energy Transfer (LET) well below the upset threshold, just below the upset threshold, and just above the upset threshold. Even the particle with LET well below the upset threshold induces a notable voltage transient at the struck drain, impacting the SET response. The occurrence of an observable SEU relies on the relative speed of two processes: the feedback of the voltage transient through the opposite inverter and the voltage recovery at the impacted node as the single-event current diminishes (WEAVER et al., 1987). It is important to note that the rapid initial flip of the cell is primarily caused by drift, including funneling effects, while the long-term charge collection through diffusion prolongs the recovery process. Both of these mechanisms play critical roles in the SEU phenomenon (DODD; MASSENGILL, 2003).

Figure 3.15: SRAM struck drain voltage transients for ion strikes with LET: well below, just below, and just above the SEU threshold.

Source: Dodd and Massengill (2003).

# 3.7.4 Multiple-Cell Upsets

As explained earlier, SEUs in SRAM bit cells can be induced by energetic ions and protons. In some cases, a single event can lead to the upset of multiple cells. The potential for multiple-cell upsets in SRAMs exposed to the space radiation environment was first reported in 1983 (BLAKE; MANDEL, 1986). Subsequent ground-based testing has validated that a single ion can affect multiple locations within an SRAM device. This type of upset clearly depends on the physical arrangement and size of the memory cells. In particular, as cell sizes shrink and cell density increases, the prevalence of this type of upset can be expected to rise (KOGA et al., 1993b).

MCU can originate from the impact of the particle on the surface of the chip, considering different incident angles. One of the most frequent types of incidence is when an incident ion approaches the surface of the die in a parallel manner, intersecting the sensitive regions of multiple memory cells. It is important to highlight that the affected cells do not necessarily need to be adjacent to each other on the die. Since SRAM dies typically have a rectangular shape with side lengths measuring several millimeters, the range of the ions must be at least this magnitude for such "lateral strikes" or "glancing collisions" to take place (KOGA et al., 1993b).

Except for nearly zero incidence angles (resulting in shorter ion ranges), only cells in close proximity can be impacted. This can occur either when the ion track diameter covers the sensitive regions of multiple cells (such as with high Z particle tracks) or when the charge from the track diffuses into the sensitive regions (MARTIN et al., 1987). Consequently, the erroneous bits resulting from this type of upset are observed in physically clustered memory cell locations.

In general, Multiple-Cell Upsets can be expected to involve more than a single logical memory word, that is, the impacted bits are likely to belong to different words. This fact has given rise to several terms for multiple effects, which may cause some confusion. The term MCU normally refers to the upset of multiple memory cells, whether or not they belong to the same word. If a single particle affects multiple bits of a single word, the terms MBU and Single-Word Multiple-bit Upset (SMU) (KOGA et al., 1993a) are typically used.
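The terminology above can be summarized with a short sketch. It is only an illustration of the definitions given in this section: the function name `classify_event` and the representation of an event as `(word_address, bit_index)` pairs are assumptions made for this example and do not come from the cited works.

```python
def classify_event(upset_cells):
    """Classify one particle event from its upset cells.

    upset_cells: iterable of (word_address, bit_index) pairs (hypothetical format).
    """
    cells = set(upset_cells)
    if len(cells) == 0:
        return "no upset"
    if len(cells) == 1:
        return "SEU"
    # Group the upset bit positions by the logical word they belong to.
    bits_by_word = {}
    for word, bit in cells:
        bits_by_word.setdefault(word, []).append(bit)
    # Two or more upset bits inside one word: the MCU is also an MBU/SMU.
    if any(len(bits) > 1 for bits in bits_by_word.values()):
        return "MCU (MBU/SMU)"
    # Several cells upset, but each one in a different word.
    return "MCU"


print(classify_event([(0x10, 3)]))
print(classify_event([(0x10, 3), (0x11, 3)]))
print(classify_event([(0x10, 3), (0x10, 4)]))
```

The second call models two upset cells that belong to different words, and the third models two upset bits inside the same word.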

### 3.8 Physical Mechanisms of Charge Deposition and Collection

Soft errors occur when energetic particles interact with silicon colliding with a sensitive circuit area and depositing an additional charge on the transistor's P-N junction region. There are two primary methods of charge deposition attributed to the ionizing radiation in a semiconductor device (DODD; MASSENGILL, 2003):

1. Direct Ionization: when a charged particle travels through a semiconductor material, it loses energy along its path, transferring it to the device and creating a path formed by electron-hole pairs. This resulting ionizing track, when collected by the electric field of the device, generates a transient current/voltage. Direct ionization is considered a primary charge deposition mechanism for upsets caused by the incidence of alpha particles or heavy ions. Lighter particles like protons do not usually produce enough direct ionization charge to generate upsets in memory circuits. However, as the devices become more susceptible, upsets in digital ICs due to direct ionization by protons may occur (DUZELLIER et al., 1997).

2. Indirect Ionization: a secondary mechanism of charge deposition in which light particles such as protons and neutrons undergo nuclear reactions in the semiconductor material and release energy in the silicon through the secondary particles produced by these reactions. Since secondary particles have a greater mass than the initial proton or neutron, they can generate higher charge densities as they move, which may result in an SEU. When a nuclear reaction happens, the charge deposition from secondary charged particles is equivalent to that of a directly ionizing heavy ion strike (DODD; MASSENGILL, 2003).

The energy deposited by a particle due to its ionization in silicon is an important metric in the study of radiation effects in nanotechnologies because it is directly related to the magnitude of the generated transient pulse. Linear Energy Transfer (shown in Equation 3.1) is the amount of energy that a particle releases per unit length of the path it travels. LET has units of MeV-cm<sup>2</sup>/mg because the energy loss per unit path length (in MeV/cm) is normalized by the density of the target material (in mg/cm<sup>3</sup>) so that LET may be quoted roughly independent of the target (DODD; MASSENGILL, 2003).

$$LET = \frac{\partial E}{\partial x} \tag{3.1}$$

It is simple to establish a connection between the LET of a particle and the amount of charge it deposits per unit of distance traveled. In silicon, an LET of 97 MeV-cm<sup>2</sup>/mg corresponds to a charge deposition of 1 pC/ $\mu$ m (DODD; MASSENGILL, 2003). The LET depends on the particle's mass and energy and on the ionized material, so particles with higher mass and energy traversing denser materials have higher LET values (BAUMANN, 2005b). Threshold Linear Energy Transfer (LET<sub>th</sub>), the minimum LET that causes an effect in the circuit (FERLET-CAVROIS; MASSENGILL; GOUKER, 2013), is another important metric to evaluate the impact of radiation effects on the devices.
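The correspondence between LET and deposited charge quoted above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming the usual textbook values for silicon (about 3.6 eV per electron-hole pair and a density of 2.33 g/cm<sup>3</sup>); the function name `let_to_pc_per_um` is introduced here only for the example.

```python
# Textbook constants for silicon (assumed values, not taken from the thesis).
EV_PER_PAIR = 3.6            # mean energy to create one electron-hole pair, in eV
SI_DENSITY_MG_CM3 = 2330.0   # silicon density, in mg/cm^3
Q_E = 1.602176634e-19        # elementary charge, in C


def let_to_pc_per_um(let_mev_cm2_mg):
    """Convert an LET in MeV-cm^2/mg to deposited charge per length in pC/um (silicon)."""
    mev_per_cm = let_mev_cm2_mg * SI_DENSITY_MG_CM3   # undo the density normalization
    ev_per_um = mev_per_cm * 1e6 / 1e4                # MeV -> eV and cm -> um
    pairs_per_um = ev_per_um / EV_PER_PAIR            # electron-hole pairs per um
    return pairs_per_um * Q_E * 1e12                  # C -> pC


# An LET of 97 MeV-cm^2/mg gives approximately 1 pC/um, as quoted in the text.
print(f"{let_to_pc_per_um(97.0):.2f} pC/um")
```

Under the same assumptions, an LET of 1 MeV-cm<sup>2</sup>/mg corresponds to roughly 10 fC/ $\mu$ m, which is a convenient rule of thumb.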

After the ionization of the particle in the silicon, i.e., after the deposition of an additional charge on the affected device, the process of charge collection proceeds through two main mechanisms: Drift and Diffusion. Figure 3.16a shows the resulting ionization path crossing the depletion region formed at the p-n junctions. When this path crosses or approaches the depletion region, the additional carriers deposited by the ion are rapidly collected by the high-intensity electric field in this region (MUNTEANU; AUTRAN, 2008). This charge collection process is called Drift and is shown in Figure 3.16b.

Figure 3.16: Charge Collection Mechanisms due to an Ion strike in a P-N junction.



Source: Baumann (2005b).

The passage of the particle through the depletion region temporarily deforms it (within a matter of picoseconds). The deformation has the shape of a funnel, which is why it is known as the Funneling Effect. This effect increases the charge collection efficiency due to the enlargement of the depletion region (BAUMANN, 2005a). Finally, the Diffusion process collects the remaining carriers generated outside the depletion layer (Figure 3.16c). Diffusion has a role to play even in the case of direct strikes, as carriers produced outside the depletion region can diffuse towards the junction (DODD; MASSENGILL, 2003).

The typical waveform of the resulting current from the charge collection induced by the incidence of a particle can be seen in Figure 3.17. Drift and Funneling are very rapid processes, almost instantaneous, due to the deformation of the junction's electric field and the consequent increase in charge collection efficiency. Therefore, these processes control the rapid rise of the transient current pulse, as seen in Figure 3.17. The Diffusion process needs a longer time to collect the charge, which gives the transient current pulse its slow fall time.
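A fast rise followed by a slow fall is often approximated in circuit-level fault injection by a double-exponential current source. The sketch below illustrates that commonly used first-order model; it is not claimed to be the exact waveform of Figure 3.17, and the collected charge and time constants used here are hypothetical values chosen only for illustration.

```python
import math


def double_exp_current(t_s, q_coll_c, tau_rise_s, tau_fall_s):
    """Double-exponential current pulse (in A) whose time integral equals q_coll_c."""
    if t_s < 0:
        return 0.0
    scale = q_coll_c / (tau_fall_s - tau_rise_s)
    return scale * (math.exp(-t_s / tau_fall_s) - math.exp(-t_s / tau_rise_s))


def peak_time(tau_rise_s, tau_fall_s):
    """Time at which the pulse is maximal (where its time derivative is zero)."""
    factor = tau_rise_s * tau_fall_s / (tau_fall_s - tau_rise_s)
    return factor * math.log(tau_fall_s / tau_rise_s)


# Hypothetical parameters: 50 fC collected, 10 ps rise (drift/funneling),
# 200 ps fall (diffusion).
Q_COLL, TAU_RISE, TAU_FALL = 50e-15, 10e-12, 200e-12

t_peak = peak_time(TAU_RISE, TAU_FALL)
i_peak = double_exp_current(t_peak, Q_COLL, TAU_RISE, TAU_FALL)
print(f"peak of {i_peak * 1e6:.1f} uA at {t_peak * 1e12:.1f} ps")
```

In this model the short time constant stands for the drift and funneling phase and the long one for the diffusion tail, so the peak occurs well before one fall time constant has elapsed.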

Figure 3.17: Transient Current Waveform induced by a radiation strike.

Source: Cummings (2010).

The collected charge also depends on the particle's impact angle on the device and the channel distance. The work of Bartra (2016) analyses particle impacts on the elevated source and drain terminals considering three devices with six different impact angles ( $0^{\circ}$ ,  $15^{\circ}$ ,  $30^{\circ}$ ,  $45^{\circ}$ ,  $60^{\circ}$ , and  $75^{\circ}$ ) in five different locations from the silicon nitride separator for each angle (6 nm, 12 nm, 18 nm, 24 nm, and 30 nm). Figure 3.18 presents the collected charge results considering the heavy-ion impact of 100 MeV-cm<sup>2</sup>/mg on a 32 nm Bulk CMOS transistor. Under these conditions, the collected charge tends to decrease when the impact is close to the nitride separator and when the impact angle is increased (BARTRA; VLADIMIRESCU; REIS, 2016).

In addition to the particle impact angle, the charge collection process also differs according to the technology of the impacted device. In SOI technology, used in the design of the final circuit presented in this work, after a particle strikes the body of an n-channel SOI transistor, the source and drain electrodes can collect electrons. However, holes can only exit through a body tie contact, if one exists; in the absence of body ties, holes can only escape gradually through recombination (DODD; MASSENGILL, 2003). The presence of residual holes in the body of an SOI transistor causes an increase in the body potential and activates the inherent lateral parasitic bipolar transistor. This bipolar current significantly reduces the SEU hardness of SOI technology (DODD; MASSENGILL, 2003). In certain scenarios, the combination of bipolar amplification and impact ionization can cause snapback, a sustained high-current state similar to latchup (DODD et al., 2000). Even when body ties are present, the bipolar effect leads to substantial charge amplification, particularly for ion strikes occurring far from the body ties (MUSSEAU et al., 2000; MASSENGILL et al., 1990).

Figure 3.18: Results of the collected charge from the heavy-ion impact at the drain terminal on the 32 nm Bulk CMOS device.

Source: Bartra (2016).

The transient pulse is generated by the interaction of energetic particles near a sensitive region of a transistor when the collected charge ( $Q_{coll}$ ) exceeds the critical charge ( $Q_{crit}$ ). However, in sub-22 nm technology nodes, other phenomena must also be considered in the characterization of the transient pulse. The influence of the charge-sharing mechanism does not seem to have diminished for FinFET technology. Technology Computer-Aided Design (TCAD) results show the extent of electrical perturbations and charge-sharing, similar to what has been observed for older technologies. This effect can cause pulse quenching in ion-induced transients, resulting in a reduced overall sensitivity of the system to SEEs (BHUVA et al., 2015). These effects are described in more detail in the next section.

#### **3.9 Emerging Effects at Advanced Technologies**

The high-density integration and reduction of the nodal capacitances have enhanced the charge sharing effect at advanced technologies, increasing the susceptibility to radiation effects (OLSON et al., 2005). The charge sharing effect arises from the close proximity of adjacent devices, which leads to charge collection at multiple nodes from a single ion strike. Figure 3.19 presents this effect through two adjacent NMOS devices. As the distance between devices is reduced, an active node, i.e., the node struck by the ion and actively collecting the deposited charge, is in close proximity to an adjacent node. That way, carriers may be able to diffuse to the passive adjacent node and induce a secondary transient current pulse (AMUSAN et al., 2006).

Figure 3.19: Nodal separation setup for NMOS charge sharing.

Source: Amusan et al. (2006).

The work of Amusan et al. (2006) investigates the charge collection of the PMOS and NMOS devices. The active and passive device collected charges are shown in Figure 3.20. The passive PMOS device can collect 40% of the total charge collected by the active device, while the passive NMOS collects less than 25% of the total charge. Besides the carrier diffusion process, the bipolar amplification effect is also responsible for the enhancement of charge sharing, explaining the higher collected charge for the passive PMOS device than for the passive NMOS device (AMUSAN et al., 2006; LIU et al., 2009).





Source: Amusan et al. (2006).

The charge sharing mechanism can be considered an adverse effect due to the number of adjacent nodes affected when an ion impacts a single node. However, some researchers have noted that charge sharing can also reduce the SET pulse width in combinational cells (AHLBIN et al., 2009; ATKINSON et al., 2011). As the signal propagation time is reduced in deeply scaled technology, the multi-collection process provided by charge sharing occurs with a similar time constant. This phenomenon can shorten the SET pulse width and is known as the Pulse Quenching Effect (AHLBIN et al., 2009; AHLBIN et al., 2010).

Figure 3.21 shows the schematic of a three-stage inverter chain and its respective PMOS transistors in a cross-section perspective. Considering that the input signal of the first inverter is at the low level, the PMOS device of the second inverter is OFF while the first and third PMOS devices are ON. If an ion strikes the sensitive off-state PMOS transistor of the second inverter, as in Figure 3.21, the resulting SET pulse at OUT2 will propagate to the next inverter, turning the adjacent PMOS device OFF. Thus, the third PMOS device will be susceptible to the charge collection by diffusion of the carriers from the charge sharing mechanism. This effect occurs due to the delayed charge collection at the struck device and the propagation of the generated SET to the adjacent device. This process allows the third PMOS to collect the carriers from the charge sharing effect and induce a transient pulse that reverts the output of the chain, as also shown in Figure 3.21.



Figure 3.21: SET Pulse Quenching Effect in an inverter chain.

Source: Ahlbin et al. (2009).

## 3.10 Problem Definition

Among the radiation effects presented in this section, this thesis focuses on MCUs. The importance of dealing with MCU effects in ICs increases even with the transistors shrinking and the use of technologies that are more robust to radiation effects. To better define the problem, it is essential to start by presenting the relationship between the technology scaling and the density of transistors per chip.

As CMOS technologies advance and scale down, the decrease in supply voltages and nodal capacitances results in lower critical charges ( $Q_{crit}$ ) required to disrupt the stored information at circuit nodes. Conversely, decreasing transistor sizes and increasing doping densities reduce charge-collection areas (sensitive areas) per transistor and charge-collection efficiencies. The physical structure of a transistor also influences these aspects (CHATTERJEE, 2020). Figure 3.22 presents the transistor density per chip across technology nodes from 90 nm to 7 nm, considering planar and FinFET transistors.



Figure 3.22: Packing density per chip across technology nodes.

Source: Chatterjee (2020).

Figure 3.23 illustrates the SEU cross-section for SRAM arrays at various technology nodes (CHATTERJEE et al., 2014; SEIFERT et al., 2012; CHATTERJEE et al., 2011; BHUVA et al., 2015; NARASIMHAM et al., 2018). All the data points are normalized to the Soft Error Rate (SER) observed for 40 nm SRAMs for straightforward comparison. The overall upset cross-section per bit diminishes with technology scaling. The transition from planar to FinFET initially exhibits a significant improvement in SER at nominal voltage. Subsequent scaling from 16 nm to 7 nm FinFET is found to result in a proportional reduction in SER aligned with area scaling, akin to SER trends seen within planar process nodes.

Figure 3.23: Normalized alpha particle induced SER as a function of technology for single-port SRAMs.

Source: Chatterjee (2020).

The 16 nm node's close cell proximity and reduced critical charge should lead to larger MCU cluster sizes than the 28 nm node under the same ion impact. However, the lesser contribution of MCUs in the 16 nm design suggests reduced charge-sharing compared to planar processes (CHATTERJEE, 2020). These findings strongly indicate a significant decrease in charge collection efficiency in FinFET technologies compared to planar ones. Despite the increased robustness provided by FinFET and also FD-SOI transistors, it is crucial to highlight that soft errors continue to be challenging even in these technologies. Moreover, there is a noteworthy increase in the percentage of MCUs concerning the total soft error rate, as presented below.

As technology scales, allowing for greater memory capacity per unit area, the heightened packing density of bit cells increases the likelihood of multiple adjacent bits experiencing upsets from a single particle strike (SEIFERT et al., 2006; LOVELESS et al., 2011). As presented in Figure 3.24, although device geometry reduction has led to a decrease in the bit-level SER in sub-100 nm technologies (also presented in Figure 3.23), the amplified memory array capacity has led to an increase in system-level SER (IBE et al., 2010). The combination of this heightened SER, the increased MCU probability due to higher packing density, and reduced supply voltages (for power efficiency) necessitates the development of advanced soft error mitigation techniques to ensure reliable scaling moving forward (NEALE; SACHDEV, 2016).

Figure 3.24: System-SER, bit-SER, and percentage MCU of the total SER as a function of technology node.

Source: Ibe et al. (2010). Adapted by Neale and Sachdev (2016).

Error Correction Code (ECC) is one of the most used techniques for detecting and correcting multiple events. Figure 3.25 shows the influence of bit interleaving combined with ECC on corrected-SER (NEALE; SACHDEV, 2016). The dataset presents the radiation-induced SER for each ECC mode at 0.5 V, employing 1-, 2-, and 4-way bit interleaving. Employing the basic Single-Error-Correcting-Double-Error-Detecting (SEC-DED) scheme reduces the SER by 97% with 1-way interleaving, 99.5% with 2-way interleaving, and achieves complete elimination with 4-way interleaving. Implementing the Double-Adjacent-Error-Correcting (DAEC) code corrects all errors for 2-way interleaving, while the Triple-Adjacent-Error-Correcting (TAEC) feature is necessary to correct all errors for 1-way interleaving. With a memory of larger capacity, longer irradiation duration, or reduced bit cell area, a greater number of less probable but larger adjacent errors would emerge (NEALE; SACHDEV, 2016). This would enrich the datasets, potentially allowing for more precise modeling of error rates in highly protected configurations that involve advanced error correction and interleaving (NEALE; SACHDEV, 2016).

Figure 3.25: Radiation induced SER for each ECC mode at VDD = 500 mV for 1-, 2-, and 4-way interleaving.

Source: Neale and Sachdev (2016).
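The way bit interleaving helps SEC-DED can be illustrated with a simplified model. The sketch below assumes that column c of a physical row stores a bit of logical word c mod k for k-way interleaving, and that an MCU upsets a horizontal run of adjacent cells in a single row. These assumptions, the function names, and the cluster used in the example are hypothetical and are not taken from Neale and Sachdev (2016).

```python
from collections import Counter


def errors_per_word(start_col, cluster_size, interleave_ways):
    """Count upset bits per logical word for a run of adjacent upset cells.

    Assumes column c holds a bit of logical word (c % interleave_ways)
    and cluster_size >= 1.
    """
    return Counter((start_col + i) % interleave_ways for i in range(cluster_size))


def sec_ded_outcome(start_col, cluster_size, interleave_ways):
    """Worst per-word outcome under SEC-DED for the given cluster."""
    worst = max(errors_per_word(start_col, cluster_size, interleave_ways).values())
    if worst == 1:
        return "corrected"                   # single error in every affected word
    if worst == 2:
        return "detected, not corrected"     # double error in at least one word
    return "beyond SEC-DED guarantees"       # three or more errors in one word


# A hypothetical 3-cell horizontal cluster starting at column 5.
for ways in (1, 2, 4):
    print(f"{ways}-way interleaving:", sec_ded_outcome(5, 3, ways))
```

In this model, the 3-cell cluster places three errors in one word without interleaving, two errors in one word with 2-way interleaving, and at most one error per word with 4-way interleaving, which is the qualitative behavior described above.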

Although bit-interleaving helps to increase the detection and correction capacity (reducing the SER) of ECC-based techniques, the method becomes less effective as the number of MCUs increases. Trends in MCU ratio and the maximum MCU multiplicity are plotted in Figure 3.26 considering two data patterns: Checker Board (CB) and All "1"/"0" (FF) (IBE et al., 2010). MCU ratio and multiplicity increase exponentially as scaling proceeds, independently of the data pattern (only minor differences between CB and FF are observed in the plot). This means that not only EDAC techniques but also the other methods already presented to deal with radiation effects must be continually updated (increasing area and energy consumption overhead) to try to reduce the SEU and MCU rates.

The method proposed in this work can be applied to memory circuits regardless of the technology used. As the method uses detection cells based on data cells from the same array, both cells will have similar sensitivity to soft errors. This sensitivity will be defined mainly by the technology and technological node used.

As seen previously, for mature planar CMOS technologies, the MCU rate and the MCU sizes are smaller, so the circuit is much less impacted, which potentially reduces the detection capability of the method due to the reduced number of events. For current technology nodes, the MCU rate is higher and the MCUs involve a larger number of bits. Even for technologies that are more robust to radiation effects, such as FD-SOI and FinFET, the MCU rate continues to increase as technology scaling advances. This fact benefits the proposed method, whose detection capability increases with the number of events.



Figure 3.26: Predicted trends of MCU ratio and maximum multiplicity in SRAM cells with technology scaling (horizontal axis: Design Rule, in nm).

Source: Ibe et al. (2010).

#### 3.11 Overview

This chapter covered all the main concepts of radiation effects, from their origin to their impact on advanced technologies. Before presenting the comparison between the novel proposed method and state-of-the-art techniques for dealing with radiation effects, in addition to detailing the architecture and operation of the detection method proposed in this work, it is necessary to position the method with respect to the different radiation environments to which circuits may be exposed and, consequently, the different levels of protection available to deal with the effects arising from these environments.

Considering the origins of radiation effects presented at the beginning of this chapter, typically, the definition of the environment in which a circuit will operate occurs through the different types of orbits that relate to planet Earth. In order to have a more general and summarized view of the impact of the environment on circuit reliability, Table 3.1 presents the probability of radiation-induced upsets in different environments, considering the primary space environments:

- Low Earth Orbit (LEO): is a standard orbit for Earth observation and communication satellites. It is normally at an altitude of less than 2000 km but could be as low as 160 km above Earth. By comparison, most commercial aeroplanes do not fly at altitudes much greater than approximately 14 km, so even the lowest LEO is more than ten times higher than that (ESA, 2020b).
- Geostationary Orbit (GEO): is a high Earth orbit where communication satellites maintain a fixed position relative to the Earth's surface. Satellites in GEO circle Earth above the equator from west to east following Earth's rotation, completing one orbit in 23 hours 56 minutes and 4 seconds, that is, travelling at exactly the same rate as Earth (ESA, 2020b).
- Medium Earth Orbit (MEO): comprises a wide range of orbits anywhere between LEO and GEO. It is similar to LEO in that it also does not need to take specific paths around Earth, and it is used by a variety of satellites with many different applications (ESA, 2020b).

Table 3.1: Probability of Radiation-Induced Upsets in Different Environments.

| Radiation Environment (Orbit) | Altitude (km)   | Upset Probability (errors/bit/day)               |
|-------------------------------|-----------------|--------------------------------------------------|
| Low Earth Orbit (LEO)         | 160 to 2,000    | Low to Moderate $(10^{-10} \text{ to } 10^{-9})$ |
| Medium Earth Orbit (MEO)      | 2,000 to 35,786 | Moderate to High $(10^{-9} \text{ to } 10^{-7})$ |
| Geostationary Orbit (GEO)     | 35,786          | High $(10^{-7} \text{ to } 10^{-6})$             |

Source: From the author (Data from ESA (2020b), Maqbool (2005)).

To determine the upset probability according to the environment to which the circuit is exposed, ideally, one should compare the same circuit, with the same level of protection, in different types of orbits. This type of experiment is challenging to carry out, and the data presented in the literature therefore varies considerably. For this reason, three probability levels were defined to facilitate understanding: Low to Moderate, Moderate to High, and High. Even so, values based on what was found in the literature were used to present at least the order of magnitude of this probability in numerical format. It is essential to highlight that the probability of upsets is not limited to the values presented. Deep Space missions, like those to Mars or other planets, can encounter even higher radiation levels due to cosmic rays and solar particle events.
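To give a sense of scale, the order-of-magnitude rates of Table 3.1 can be multiplied by a memory capacity to estimate the expected number of upsets per day. The sketch below is only an illustration: the 8 Mbit capacity is a hypothetical example, and the rates are the coarse ranges of Table 3.1, not measurements for a specific device.

```python
# Order-of-magnitude upset rates from Table 3.1, in errors/bit/day (low, high).
UPSET_RATE = {
    "LEO": (1e-10, 1e-9),
    "MEO": (1e-9, 1e-7),
    "GEO": (1e-7, 1e-6),
}


def expected_upsets_per_day(orbit, memory_bits):
    """Return the (low, high) expected number of upsets per day for a memory."""
    low, high = UPSET_RATE[orbit]
    return low * memory_bits, high * memory_bits


MEMORY_BITS = 8 * 1024 * 1024  # hypothetical 8 Mbit SRAM

for orbit in ("LEO", "MEO", "GEO"):
    low, high = expected_upsets_per_day(orbit, MEMORY_BITS)
    print(f"{orbit}: {low:.1e} to {high:.1e} upsets/day")
```

With these assumptions, the hypothetical memory would see on the order of one to ten upsets per day in GEO, against roughly one upset every few months to few years in LEO.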

The different techniques for dealing with the radiation effects previously presented can also be classified according to the radiation hardening level. RHBD cells, Redundancy-based, and EDAC techniques provide a high level of radiation hardening. Considering these three types of techniques, EDAC can be highlighted, which can correct detected upsets through ECC and parity bits. These methods are used in applications in all environments presented, presenting a good level of protection even in GEO. As previously highlighted, the proposed method benefits from the increased probability of upsets and, therefore, becomes a new alternative to be used not only in GEO applications but mainly in Deep Space, where the methods mentioned above are usually significantly impacted, reducing their protection.

## **4 STATE-OF-THE-ART**

This work presents a new method for detecting multiple events in memory circuits focusing on SRAMs. The proposed new method is developed with an architecture not yet presented in the literature, making direct comparisons with existing methods challenging. However, in addition to presenting the comparison with the state-of-the-art techniques for mitigating radiation effects in memory circuits, with emphasis on multiple events, the main works that served as the basis for this thesis will also be presented in this chapter.

# 4.1 Redundancy-based Techniques

Different techniques are presented in the literature to deal with the radiation effects. To begin, the work of **Tan et al. (2021)** was selected, which uses redundancy-based techniques. TMR and Gate-Sizing are commonly used hardening methods for combinational circuits. To address the significant area overhead associated with the traditional TMR method, typically applied at the module level, this paper proposes a more refined and versatile approach called General Efficient TMR (GE-TMR). Moreover, given that the hardening process involves multiple objectives, such as SER, delay, and area, this paper introduces an algorithm called Solution Distribution Optimized NSGA-II. The experimental results show that GE-TMR can provide lower SER solutions than gate-sizing when the area overhead is > 200%. By combining GE-TMR and gate-sizing, in the interval of 100% < area overhead < 200%, the solutions of MIX have lower SER than the two hardening methods optimized separately.

In Li et al. (2020), a design approach based on TMR is explored to mitigate double cell upsets. The method integrates three additional self-voter circuits into a conventional TMR structure to enhance error correction capability. Fault-injection simulations validate the soft error mitigation capabilities of this approach, showcasing its superior hardening performance over TMR and QMR (achieving at least 97% and 90% reductions in SER, respectively). The case study reveals that the proposed technique requires approximately 1/3 larger area than TMR, while both solutions exhibit nearly identical power consumption. The implementation of this approach aligns with the automatic digital design flow and is evaluated for its applicability and performance on a First In, First Out (FIFO) circuit.

A well-known problem of radiation hardening techniques is the area overhead, which is evident in the papers presented above that apply redundancy-based techniques. Redundancy techniques also have other critical points as the number of multiple events increases: in addition to the higher probability of an impact on the voter, the replicated modules themselves can be impacted, which prevents the voter from determining the correct output.
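The voter limitation described above can be illustrated with a minimal sketch of a bitwise 2-of-3 majority voter. The function name, the example word, and the fault patterns are hypothetical and chosen only for illustration; they are not taken from any of the cited works.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three replica words."""
    return (a & b) | (a & c) | (b & c)


golden = 0b10110010      # hypothetical fault-free word
flip = 0b00000100        # hypothetical upset: bit 2 is inverted

# One replica is hit: the two intact replicas outvote it.
single_hit = majority(golden ^ flip, golden, golden)

# A multiple event hits the same bit in two replicas: the voter
# now propagates the wrong value without any indication.
double_hit = majority(golden ^ flip, golden ^ flip, golden)

print(single_hit == golden)   # the single upset is masked
print(double_hit == golden)   # the double upset is not masked
```

In this sketch the voter itself is assumed to be fault-free; an upset in the voter would be a further failure mode, as noted in the text.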

### **4.2 Radiation Hardening by Design Techniques**

The works of **Li et al.** (2021) and **Han et al.** (2021) present RHBD memory cells. The paper of **Li et al.** (2021) introduces a high-reliability radiation-hardened memory cell (RH-14T) as a means to mitigate MCUs. Simulation Program with Integrated Circuit Emphasis (SPICE) simulations and 3D TCAD mixed-mode simulations were conducted to validate the RH-14T cell's robustness against SEUs. Comparative analysis with previous radiation-hardened memory cells revealed that the proposed RH-14T cell exhibits similar read access time, shorter write access time, and reduced sensitivity to process variations in both read and write access times. However, it is important to note that improving reliability often involves trade-offs concerning area, power consumption, and performance. The RH-14T cell, in order to achieve high reliability, utilizes additional transistors, resulting in a 1.5 times power consumption overhead compared to the 6T-cell and a larger area penalty.

Another memory cell architecture is proposed in **Han et al. (2021)**. The work proposes an SEU robust dual access 12T (DA-12T) SRAM with a radiation-hardened crossbar-based peripheral circuit. The paper states that the proposed cell with 209% area penalty exhibits superior SEU robustness compared to most existing cells. The utilization of the crossbar-based peripheral circuit effectively reduces the read failure rate in SRAMs. Additionally, implementing a new sense amplifier ensures both accurate and rapid reading operations, even in the presence of read disturbances. Experimental results demonstrate that the proposed cell exhibits a significantly reduced SEU cross-section compared to a standard cell with a dummy, with the SEU cross-section being only 60% of that standard cell. Moreover, when the operating frequency exceeds 40 MHz, the SRAM with crossbar-based peripheral circuit shows minimal to no read failures.

Still considering RHBD techniques, the works of CH et al. (2021) and Prasad et al. (2022) must be highlighted. In the work by CH et al. (2021), a novel RHBD SRAM bit-cell is introduced, leveraging the polarity upset mechanism of SEUs. The study demonstrates that the proposed RHBD14T bit-cell is resilient to SEUs and exhibits a higher critical charge for Single Event Multiple Effect (SEME) than state-of-the-art RHBD SRAM bit-cells. Monte Carlo simulations confirm that process variations do not compromise the SEU and SEME capabilities of the RHBD14T bit-cell. Additional simulations reveal that the proposed RHBD14T exhibits a lower probability of failure than reported RHBD SRAM bit cells. As a result, the sensitive area of the proposed bit cell is 128% smaller in comparison to recently reported state-of-the-art RHBD bit cells.

An RHBD-13T SRAM cell is proposed in **Prasad et al. (2022)** to effectively mitigate single and double-node upsets, achieving a superior balance across various parameters. The key advantages of the proposed cell are its exceptionally high critical charge, minimal power consumption, and enhanced stability compared to other standard cells. The cell incorporates a distinctive feedback path among its internal nodes for enhanced speed. The authors assert that, based on a comparison of the single-node-upset probability of failure among cells, the proposed cell emerges as a superior choice for sub-nanometer aerospace applications.

The papers presented above that propose RHBD cells again show significant area and power consumption penalties for the new architectures. This type of technique does not allow the detection and correction of upsets. The proposed cells tolerate a greater amount of collected charge without a bit-flip and reduce charge-sharing effects. However, depending on the environment to which these circuits are exposed and the amount of charge deposited, the cells can still be upset, and these upsets can then be neither detected nor corrected.

### **4.3 Error Detection and Correction Techniques**

In Vlagkoulis et al. (2022), a new EDAC technique for SRAM-based Field-Programmable Gate Array (FPGA) is presented. FPGA vendors commonly incorporate ECC into the configuration memory to support designers in implementing scrubbing mechanisms. Although these ECC schemes typically ensure the correction of single- and double-bit errors per configuration frame, they cannot correct upsets with higher multiplicity caused by a single event within a single frame. This paper presents a configuration memory scrubbing approach designed for SRAM-based FPGA devices. The proposed approach combines embedded ECC logic with an interframe, interleaved parity code to create a mixed 2-D coding technique. By incorporating this technique, the on-chip ECC scheme's multiple-bit error correction capabilities are enhanced while maintaining low error correction latency and hardware cost. The scrubbing concept has been validated under heavy-ion irradiation, where it corrected all the single and multiple upsets observed during the radiation experiment.

SEC-DED codes are widely used to enhance memory system reliability. However, standard SEC-DED implementations are no longer sufficient to ensure information reliability, mainly when dealing with a significant number of bit-flips per coded word, such as in the case of MCUs. In this context, **Silva et al. (2020)** introduce the extended Matrix Region Selection Code (eMRSC), an enhanced version of MRSC. The original 16-bit code is extended to a new 32-bit MRSC version. Additionally, a novel data matrix region scheme is proposed to minimize the generation of redundant bits. Comparative experiments against well-known codes demonstrate the superior performance of the proposed codes. Synthesis analysis indicates that these codes not only enhance reliability but also incur low implementation costs, including minimal area, reduced coding/decoding delays, and lower power overheads.

Both works use EDAC, one of the most widely used techniques to deal with MCUs, as it allows not only the detection but also the correction of some upsets. Despite providing very powerful features to increase the robustness of the circuit, the challenge in EDAC techniques is precisely the limit on the number of events that can be detected and corrected. The use of this type of technique must always consider the environment to which the circuit will be exposed because, as the number of multiple events increases, the method may fail to detect a significant share of the upsets that impact the memory.

### **4.4 Radiation Monitors and Software-based Techniques**

Another technique for dealing with radiation effects is to monitor the radiation in the environment where other circuits will be exposed. The paper of **Wang et al.** (2021) presents an SRAM-based flexible radiation monitor. The monitor was fabricated using 65 nm CMOS technology and designed as an Application-Specific Integrated Circuit (ASIC). It comprises a 768 kbit SRAM bit cell matrix with an individual power supply and a digital control core featuring a serial peripheral interface. The flexibility of the radiation sensitivity is achieved by adjusting the core voltage of the SRAM matrix. Additionally, the study implements SRAM bit cells with different threshold voltages to expand the tunable sensitivity range. The radiation monitor underwent testing using heavy ions with a LET ranging from 1.5 to 48.5 MeV.cm<sup>2</sup>/mg, high-energy (50-186 MeV) and low-energy (0.7-5 MeV) protons, as well as 14 MeV and thermal neutrons. An analysis was conducted to observe the changes in SEU sensitivity while adjusting the supply voltage under various radiation environments. The results demonstrate that the monitor exhibits potential for applications in space and other relevant facilities.

A technique that is used more often, and that is adopted for different functions, is the Bulk Built-in Current Sensor (BBICS). BBICS is a cost-effective solution to detect energetic particle strikes in integrated circuits. By strategically placing an adequate number of BBICSs throughout the chip, it becomes possible to pinpoint soft error locations. This allows dynamic fault-tolerant mechanisms to be activated in those areas, effectively correcting the soft errors in the affected logic.

In the work of **Andjelkovic et al.** (2022), a pulse-stretching BBICS (PS-BBICS) was presented by combining a conventional BBICS with a custom-designed pulse-stretching cell. PS-BBICS aims to facilitate on-chip measurement of the SET pulse width, enabling the detection of the LET from incident particles, thereby providing a more precise assessment of radiation conditions. The simulation results demonstrate that the proposed PS-BBICS can detect SET current pulses induced by energetic particles with LET ranging from 1 to 100 MeV.cm<sup>2</sup>/mg. The study reveals a notable increase in SET pulse width, ranging from 620 to 800 ps across the investigated LET spectrum, considering up to ten monitored inverters. Continual monitoring of supply voltage and temperature variations is necessary to ensure precise SET pulse width measurement. Additionally, accounting for the impact of process corners is vital.

Both techniques make it possible to characterize the environment to which the circuit in question will be exposed and thus to use it with greater safety against radiation effects. The difference from the method proposed in this thesis is that this type of technique does not directly increase the robustness of a specific circuit against single or multiple upsets. Moreover, the SRAM used as a monitor and the circuits monitored by the BBICS can themselves be impacted in a way that impairs the monitoring.

In addition to the traditional and most used hardware-based techniques, software-based methods for mitigating soft errors are becoming increasingly crucial to achieving a high level of protection with minimal overhead. Deep Neural Network (DNN) models are utilized in safety-critical embedded devices to perform tasks like object identification, recognition, and trajectory prediction. Optimized variants of these models, especially convolutional ones, are gaining popularity in edge-computing devices with limited resources. However, DNN models are susceptible to radiation-induced soft errors, and addressing these errors in resource-constrained devices is a significant challenge.

Aiming to mitigate the drawbacks of hardware-based approaches, **Gava et al.** (2023) investigate the effectiveness of a lightweight software-based mitigation strategy known as register allocation technique (RAT). The study applies RAT to a Convolutional Neural Network (CNN) model running on two commercial Arm microprocessors (Cortex-M4 and M7) while exposed to neutron radiation. Results gathered from two neutron radiation campaigns indicate that RAT reduces the number of critical faults in the CNN model on both Arm Cortex-M microprocessors. Furthermore, the outcomes suggest that the RAT-hardened CNN model can achieve a reduction of up to 83% in Silent Data Corruption (SDC) Failure in Time (FIT) rate, with a runtime overhead of 32%. The authors state that future work intends to combine and compare RAT with other mitigation techniques developed at a lower code level, aiming to overcome current limitations and to consider architectures with more resources available.

The techniques presented throughout this chapter use different methodologies to increase the robustness of circuits, providing many benefits. However, as expected, many challenges are also introduced. These challenges vary according to the technique in question, but from this overview of all the main techniques used to deal with radiation-induced upsets, one can see a common challenge. The reduction in the robustness of the proposed techniques (reduction of the ability to mitigate, detect, or correct) is observed as the number of events increases.

The method proposed in this thesis also introduces challenges that must be considered. However, unlike existing techniques, the proposed method is not limited by an increase in the MCU rate. The chosen percentage of detection cells in the SRAM array dictates the memory's level of radiation hardening. As the MCU rate increases, the method's detection capability improves, because the probability of a detection cell being impacted also increases. The results of this thesis are compared to the state-of-the-art works in Section 7.4.

### **4.5 Related Work**

All the works presented previously are also related to this thesis. However, two works directly related to this thesis are highlighted in this section. The work of **Mederos** (2017) contributes to this thesis by designing an SRAM using the same technology used in one of the circuits designed in the thesis. In **Deval, Lapuyade and Rivet (2019)**, the basis of the initial idea of the method proposed in this thesis was presented.

The work of Mederos (2017) aims to develop analytical expressions to determine the primary performance parameters of an SRAM cache implemented in 28 nm FD-SOI technology. The main objective is to explore transistor dimensions cost-effectively, resulting in efficient designs prioritizing energy consumption, speed, and yield. The thesis introduces a novel approach to sizing the 6T-SRAM bit cell, departing from the conventional thin-cell design: it uses transistor lengths as a design variable to minimize static leakage. The study combines the single-P-well structure with the Reverse-Body-Biasing (RBB) technique to better balance PMOS and NMOS transistors. The outcome of this research is the development of a 128 kB SRAM cache. Post-layout simulations demonstrate that the circuit operates with an average energy consumption of 0.604 pJ per word access (64 I/O bits) when supplied with 0.45 V and operating at 40 MHz. The main contribution of this work to the thesis is the presentation of the complete design of an SRAM in the 28 nm FD-SOI technology from STMicroelectronics. Schematics and layouts of each sub-circuit that makes up the SRAM are presented in detail, facilitating the understanding and the design of part of the circuit presented in the present thesis.

The paper of **Deval, Lapuyade and Rivet (2019)** addresses design techniques to mitigate the sensitivity of silicon integrated circuits to radiation effects, covering both analog and digital circuits. Section C of chapter 3 of the paper presents the idea that inspired this thesis. The solution proposed to detect radiation-induced upsets in a memory circuit consists of inserting, at regularly spaced points of the memory plan, dual-stable-state sub-circuits designed to be as radiation sensitive as the memory cells dedicated to storing data, or more sensitive if needed. The paper does not suggest any architecture for the detection cells, nor how the memory would warn the processor or other devices that it was impacted. These characteristics were developed throughout this thesis.

## **5 DETECTION METHOD**

The method presented in Figure 5.1 consists of spatially interleaving a memory plan with a network of memory radiation detectors (detection cells). Each detector is associated with a cluster of physically adjacent data cells. The detection cells are preferably interleaved with the data cells in order to cover as much of the memory plan as possible with the smallest number of detection cells. At the bottom of this plan, a logic circuit creates an alarm signal when a particle strike on the memory plan changes a detector's state. The detection cells are automatically refreshed to their initial state after the impact. The alarm signal is sent to the processor, which can then raise an interrupt or perform a total or partial memory refresh.

The main advantage of the proposed solution is that its event detection capability is not limited by an increase in the number of SEUs and MCUs. If the ratio between the number of radiation detectors and the number of data cells is sufficiently high, the probability of detecting an upset increases, which also increases the tolerance to SEUs and MCUs. For instance, if 50% of the cells are detection cells, the probability of detecting MCUs, considering only the SRAM core, can reach close to 100%. Only SEUs that exclusively impact data cells would not be detected. This estimate is not based on a specific experiment that measured this detection rate but rather on the validated detection capability of the cells designed for this purpose. When an MCU occurs in a memory plan, two or more adjacent cells are impacted. With the percentage of detection cells in the example, there are detection cells next to each data cell, so some of the impacted cells will be detection cells, and the detection probability approaches the value given above. It is essential to highlight that the proposed method is not applied to the peripheral circuits of an SRAM (sense amplifiers, write drivers, and decoders). Therefore, considering a complete SRAM and not just the core, the detection probability of the method is maintained, while the impact of upsets on peripheral circuits depends on the radiation-hardened design of these circuits.
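The geometric argument above can be checked with a small enumeration. The sketch below is only an illustration under strong assumptions that do not come from the thesis: detection cells are placed on a diagonal pattern `(row + col) % period == 0` (so `period = 2` gives 50% detection cells), an event upsets a horizontal run of `cluster` adjacent cells, every position of the run is equally likely, and every struck detection cell flips. All names are hypothetical.

```python
from itertools import product


def is_detection(row: int, col: int, period: int) -> bool:
    """Assumed interleaving: one detection cell every `period` cells along a row,
    shifted by one column per row (period=2 is a checkerboard, i.e. 50%)."""
    return (row + col) % period == 0


def detection_probability(n: int, cluster: int, period: int) -> float:
    """Fraction of horizontal clusters of `cluster` adjacent cells in an n x n
    plan that contain at least one detection cell (exhaustive enumeration)."""
    detected = 0
    total = 0
    for row, col in product(range(n), range(n - cluster + 1)):
        total += 1
        if any(is_detection(row, col + k, period) for k in range(cluster)):
            detected += 1
    return detected / total


for period in (2, 4):
    for cluster in (1, 2, 4):
        p = detection_probability(n=8, cluster=cluster, period=period)
        print(f"period={period} cluster={cluster} P(detect)={p:.2f}")
```

Under these assumptions, the 50% pattern detects half of the single-cell events and every event that spans two or more adjacent cells, which is consistent with the statement that only SEUs striking data cells exclusively go undetected. Real MCU shapes and cell sensitivities would change the numbers for sparser patterns.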

### **5.1 Architecture**

According to the objectives already mentioned in Section 1.1, it is important to initially consider the detection cells separately from the data cells to validate the new method. In this way, in the prototype circuit (presented in the next chapter), the detection cells were designed and placed in a square plan, simulating a traditional memory plan but composed of detection cells only. With the initial validation completed, the method is revalidated in a complete memory plan composed of data and detection cells, presented in Chapter 7.

Figure 5.1: The design-at-the-circuit-level solution: (a) Memory plan spatially interleaved; (b) Alarm signal; (c) Cluster of physically adjacent data cells.

Source: From the author.

A detection logic circuit, responsible for creating the alarm signal (SIGNAL<sub>N×N</sub>), is placed at the bottom of this plan. In the prototype version of the circuit, the detection logic is designed as a tree structure composed of AND2 logic gates and an SR Latch circuit. The detection logic is improved in the full memory version, which uses NAND2/NOR2 gates. To make the architecture and operation of the detection method easier to understand, the electrical schematics presented below show only detection cells, that is, the schematics of the prototype version of the circuit. However, it is essential to note that the data and detection plans are interleaved and physically together in the designed SRAM. The detection plan schematic, highlighting the detection logic and the detection cell architecture, is presented in Figure 5.2. The SIGNAL<sub>N×N</sub> output represents the alarm signal, where N×N denotes the array's dimensions. The PULSE<sub>N×N</sub> output serves as a debug output to verify whether the detection cells retrieve their previously stored values after being impacted. For every two columns of the cell plan, the Detection Lines (DLs) are connected to an AND2 gate positioned on the first level of this circuit. The outputs of these same gates are sent back to the cell plan through the Refresh Lines (RLs) and are also propagated to the other AND2 gates in the circuit to obtain a single alarm signal at the SR Latch output.



Figure 5.2: The detection plan schematic, highlighting the detection logic and the detection cell architecture.

Source: From the author.

It is essential to highlight that the detection and refresh lines impose a challenge concerning the relation between their length and the activation delay of the alarm signal, which is detailed in the following chapters. However, the parasitics in the metal tracks that make up these lines do not directly influence the performance of the rest of the SRAM, especially the data cells. Although the proposed method adds detection cells (with detection and refresh lines) to the SRAM array, these cells are not electrically connected to the SRAM data cells. They therefore do not add capacitance to the bit lines through additional devices and do not affect the performance of the SRAM bit cells. The only increase in bit-line capacitance is a slight one caused by the longer bit-line metal tracks, since more cells (data and detection) share the same column of the memory array.

The architecture of the detection cell (shown in Figure 5.2) is based on the traditional 6T-SRAM bit cell. The 6T-SRAM bit cell core is maintained, with two PMOS and two NMOS transistors forming a cross-coupled pair of inverters. However, in the proposed detection cell, the NMOS access transistors are replaced by pairs of PMOS/NMOS transistors in charge of detecting a state change and refreshing the detection cell afterwards. The goal is to obtain a detection cell with an area and a sensitivity to radiation effects similar to those of the traditional memory cell. As the detection cells will be interleaved with the data cells, it is important that both cell types have the same area so that they fit together and abut in the layout. Because the detection cell uses more transistors than the 6T-SRAM bit cell, and depending on the sizing of these transistors, the area of the cells used in the memory array is determined by the area of the detection cells.

Unlike the data cell and the peripheral circuits that make up an SRAM, the detection cell is a new architecture, so its sizing is not trivial and cannot be based on previous work. In addition to the objective of using transistors with reduced dimensions, the critical point in the detection cell sizing is the transistors P0, N0, P3, and N3. These transistors must be wide enough to drive the signal through the DLs and RLs in the shortest possible time, reducing the alarm signal delay. However, these same transistors also increase the capacitance of the metal tracks that make up the DLs and RLs; thus, there is a trade-off. Chapters 6 and 7 present, in more detail, the sizing of the detection cells designed in the two technologies adopted in this work. The operation of the detection method, highlighting the detection cells and logic, is detailed in the next section.

### **5.2 Operation**

The PMOS/NMOS pair of transistors on the left side of the detection cell is responsible for detecting a change in the value stored in the cell. The transistors positioned on the right side, on the other hand, are in charge of refreshing the cell to its initial value after a change of state occurs.

Figure 5.3a and Figure 5.3b present the detection plan operating modes. In regular operation (no cell impacted), all the detection cells store a logic '0'. In this case, considering a single detection cell (presented in Figure 5.2), there is an ON-state PMOS (P0) and an OFF-state NMOS (N0) transistor on the detection side, allowing VDD to pass through the detection line and reach the detection logic circuit (DL<sub>0_OUT</sub> = '1'). Through the output of the first-level AND2 logic gate (RL<sub>0</sub>), this same signal arrives at the right side of the detection cell, where the ON-state NMOS (N3) and OFF-state PMOS (P3) transistors keep the value stored by the cell. This signal is also propagated through the detection logic, keeping the alarm signal off.



Figure 5.3: Detection plan operation modes: (a) Regular operation (no event); (b) Cell impacted by a radiation event.

Source: From the author.

Whenever one or more cells in the plan are impacted, causing a change in the stored logic value ('0' → '1'), the states of all the above-mentioned transistors are inverted. Thanks to the ON-state NMOS transistor on the left side of the detection cell (N0), in this operation mode the detection line is driven to GND instead of VDD. This value also modifies the output of the AND2 gate at the first level (RL<sub>0_OUT</sub> = '0'), which is propagated throughout the detection logic, modifying the output of the last-level AND2 gate and therefore triggering the alarm signal (SIGNAL<sub>N×N</sub> = '1'). That same signal also arrives at the right side of the detection cell, causing the cell to retrieve the value previously stored.
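The two operating modes can be summarized with a zero-delay behavioural sketch of the prototype detection plan. It is not the circuit itself: the class and method names are hypothetical, one detection line per column is assumed, any flipped cell is assumed to pull its line to '0', the array size is assumed to be a power of two, and the SR Latch is assumed to hold the alarm until an external reset.

```python
from typing import List


def and_tree(bits: List[int]) -> int:
    """Reduce a power-of-two number of bits with levels of 2-input AND gates."""
    level = list(bits)
    while len(level) > 1:
        level = [level[i] & level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


class DetectionPlan:
    """Behavioural model of an n x n plan made of detection cells only."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.cells = [[0] * n for _ in range(n)]  # regular operation: all '0'
        self.alarm = 0                            # SR Latch output (SIGNAL)

    def strike(self, row: int, col: int) -> None:
        """Model a bit-flip ('0' -> '1') in one detection cell."""
        self.cells[row][col] = 1

    def reset_alarm(self) -> None:
        """Assumed external reset of the SR Latch."""
        self.alarm = 0

    def step(self) -> int:
        """Evaluate detection and refresh once; return PULSE (last AND2 output)."""
        n = self.n
        # Assumption: a flipped cell pulls the DL of its column to GND ('0').
        dl = [0 if any(self.cells[r][c] for r in range(n)) else 1 for c in range(n)]
        # First-level AND2 gates: one per pair of columns; their outputs are the RLs.
        rl = [dl[c] & dl[c + 1] for c in range(0, n, 2)]
        pulse = and_tree(rl)
        if pulse == 0:
            self.alarm = 1                        # latch the alarm signal
        # An RL at '0' refreshes the cells of its two columns back to '0'.
        for pair, value in enumerate(rl):
            if value == 0:
                for c in (2 * pair, 2 * pair + 1):
                    for r in range(n):
                        self.cells[r][c] = 0
        return pulse
```

A short usage example: after `plan = DetectionPlan(8)`, a call to `plan.step()` returns 1 with the alarm off. Calling `plan.strike(3, 5)` and `plan.strike(3, 6)` and then `plan.step()` returns 0, sets `plan.alarm` to 1, and restores every cell to '0', so the following `plan.step()` returns 1 again while the alarm stays latched.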

### **5.3 Overheads**

Despite the advantages of the new proposed method, some challenges are also introduced and must be presented. It is not possible to accurately determine the area and power consumption penalties considering the prototype version of the circuit, as the circuit is not yet interleaved in a traditional SRAM plan. However, through the circuit layout, it is possible to estimate the impact of the detection cells in a memory array.

#### **5.3.1 Area Penalty**

The area penalty is related to the number of detection cells that will be interleaved in the plan. To obtain the greatest robustness, 50% of the cells would be used for detection; that is, considering only the core of an SRAM composed of traditional data cells (6T), the area penalty would be 100%. However, it is also necessary to consider the area increase of about 44% (which varies with the sizing used for the data cell) for each detection cell due to the addition of two transistors in the detection cell design. This estimate uses as a basis for comparison a 6T-SRAM bit cell sized according to the relationship $W_{\text{pull-down}} = 2 \times W_{\text{access}}$ and $W_{\text{pull-up}} = 1.5 \times W_{\text{access}}$. This transistor sizing, used only as an example in this comparison, respects the limits presented in Section 2.1.1 to obtain stable read/write operations (CR = 2, PR = 1.5). In the most robust case, the proposed method would therefore have a total area penalty of about 144%.
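The arithmetic behind this estimate can be written out as a short sketch. The function below reproduces the 144% figure for 50% detection cells; extending it to other percentages is an assumption made here for illustration, as is keeping the data cells at their 6T area (the same simplification used in the estimate above). The function name and parameters are hypothetical.

```python
def core_area_penalty(detection_fraction: float, cell_overhead: float = 0.44) -> float:
    """Extra core area relative to a 6T-only core with the same number of data cells.

    detection_fraction: share of all cells in the array that are detection cells.
    cell_overhead: extra area of one detection cell over one 6T data cell
                   (about 0.44 for the example sizing in the text).
    """
    if not 0.0 <= detection_fraction < 1.0:
        raise ValueError("detection_fraction must be in [0, 1)")
    # Number of detection cells added per data cell.
    detection_per_data = detection_fraction / (1.0 - detection_fraction)
    # Each added detection cell costs (1 + overhead) data-cell areas.
    return detection_per_data * (1.0 + cell_overhead)


for fraction in (0.25, 0.50):
    print(f"{fraction:.0%} detection cells -> {core_area_penalty(fraction):.0%} core area penalty")
```

For 50% detection cells there is one detection cell per data cell, so the penalty is 1 × 1.44 = 144%, matching the value in the text. The detection logic area is not included in this sketch.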

Considering not only the core but the entire memory, the area dedicated to the detection logic must also be considered. The area penalty of the detection logic is minor, and its share decreases as the memory plan size increases. For the 8×8 plan, the area penalty imposed by the detection logic is about 20% of the total area of the plan. However, considering a commercial-size SRAM, also designed in this work, with 256×256 memory cells, the area penalty becomes almost negligible, at around 1% of the total memory area. The addition of detection and refresh lines in the design of the detection and data cells does not influence the area penalty, as these lines occupy the same area added by the two extra transistors, which already makes the total area of the detection cell slightly larger than the area of the data cell.

#### **5.3.2 Power Consumption Penalty**

The power consumption of an SRAM can be separated into static and dynamic components. The static power consumption is determined by the sub-threshold leakage currents of the cells. The dynamic power consumption is determined by the capacitance switching on the read and write operations. The proposed method adds unique cells for detection but does not modify the data cells used in SRAM. Therefore, considering only the data cells, the new method adds no power consumption penalty.

Because it is based on the conventional 6T-SRAM bit cell, the detection cell, individually, presents a total power consumption similar to the data cell. However, data cells are read and written constantly according to the SRAM operation frequency, consuming dynamic power and increasing the total power consumption. The detection cells will consume dynamic power only when a particle impacts these cells, causing bit-flips and increasing the total power consumption until the alarm signal is sent.

With the possibility of adding different percentages of detection cells in the memory plan and also the uncertainty of how many cells will be impacted in a given period, it is very challenging to quantify the power penalty of the proposed method. It is possible to estimate that the static power will increase linearly according to the percentage of detection cells added. This increase will be practically negligible in mature technologies, as the contribution of static power to total power consumption is very small in these technologies (BHAT et al., 2005; TELIKEPALLI, 2005). In advanced technology nodes, below 90 nm, the supply voltages are smaller, and the static power is more significant due to the leakage currents (BHAT et al., 2005; TELIKEPALLI, 2005). Thus, not only the detection cells but also the data cells would have a slightly higher contribution to the total power consumption compared to mature technologies. For instance, to achieve high reliability, the RH-14T cell (LI et al., 2021) utilizes additional transistors, resulting in a  $1.5 \times$  power consumption overhead compared to the 6T-SRAM bit cell.

## **6 PROOF-OF-CONCEPT**

After presenting the architecture and operation of the proposed new method, it is essential to implement and test the initial idea to validate it. As a proof-of-concept, a prototype version of the circuit composed only of detection cells was designed, manufactured, and tested using two methods: electrically-induced SEU/MCU testing and SEEs laser testing. The main contributions presented in this chapter are published in (BRENDLER et al., 2022; BRENDLER et al., 2023b).

### **6.1 Circuit Design**

The circuit was designed and manufactured in the 350 nm CMOS Process Technology from Austria Micro Systems (AMS). This technology was chosen for cost reasons to manufacture a prototype version before being scaled down. Two square detection plans with different sizes were designed to validate the method proposed and observe the circuit's behavior in the relationship between the design and the different radiation-induced effects. Their layouts are presented in Figure 6.1.

Figure 6.1: Detection plan layouts: (a) Detection Cell, (b)  $4 \times 4$  Detection Plan, and (c)  $8 \times 8$  Detection Plan.



Source: From the author.

As already presented in Chapter 5, the detection cell sizing is not a trivial task. This prototype version's main objective was to use transistors with  $W \approx W_{min} = 0.7 \ \mu m$  considering the technology adopted. Therefore, only two different W values were used to size the eight transistors of the detection cell. The transistors responsible for switching in the DLs (P0, N0) were sized with a larger W to reduce the alarm signal delay. Considering the electrical schematic presented in Figure 5.2, the sizing of the detection cell considered  $L = L_{min} = 0.35 \ \mu m$  for all transistors, with  $W_{P0,P1,P2} = W_{N0,N2} = 2 \ \mu m$  and  $W_{P3} = W_{N1,N3} = 1 \ \mu m$ .

The smallest plan (4×4), shown in Figure 6.1b with an area of 3,519  $\mu$ m<sup>2</sup>, comprises 16 detection cells; its logic circuit in charge of sending the alarm signal has three AND2 logic gates. The biggest plan (8×8), shown in Figure 6.1c, and designed in a 10,153  $\mu$ m<sup>2</sup> area, is composed of 64 detection cells, in addition to seven AND2 gates for sending the alarm signal. The output of the last level AND2 gate in the detection circuit of each plan is connected to an SR Latch circuit that checks if the received impact was enough to cause a bit-flip in the cell, sending or not the alarm signal. Therefore, each detection plan has two outputs: the SR Latch output (alarm signal) and the last level AND2 gate output (impact received by the cell and retrieval of the stored value).
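The detection logic described above can be summarized at the logic level. The following Python sketch is only an illustration under stated assumptions: it assumes one detection line per column (which is consistent with three AND2 gates for the 4×4 plan and seven for the 8×8 plan), treats the lines as ideal logic levels, and uses hypothetical names (`and2_gate_count`, `DetectionPlanModel`). It does not model the electrical behavior of the manufactured circuit.

```python
def and2_gate_count(num_lines: int) -> int:
    """2-input AND gates needed to reduce num_lines signals to one output."""
    if num_lines < 1:
        raise ValueError("at least one detection line is required")
    return num_lines - 1


class DetectionPlanModel:
    """Idealized model: AND of the detection lines feeding an SR latch."""

    def __init__(self, num_lines: int):
        self.num_lines = num_lines
        self.alarm = 0  # SR latch output (SIGNAL), cleared by reset()

    def reset(self) -> None:
        self.alarm = 0

    def sample(self, lines: list[int]) -> tuple[int, int]:
        """lines[i] is 1 while line i is at VDD and 0 while a hit pulls it low.

        Returns (pulse, alarm): last-level AND2 output and latched alarm.
        """
        if len(lines) != self.num_lines:
            raise ValueError("unexpected number of detection lines")
        pulse = int(all(lines))
        if pulse == 0:
            self.alarm = 1  # the latch captures the negative pulse
        return pulse, self.alarm


plan = DetectionPlanModel(4)
print(and2_gate_count(4), and2_gate_count(8))
print(plan.sample([1, 1, 1, 1]))  # idle: lines at VDD, no alarm
print(plan.sample([1, 0, 1, 1]))  # hit on one line: pulse low, alarm set
print(plan.sample([1, 1, 1, 1]))  # cell recovered: alarm stays until reset
```

In this simplified view, the two outputs of each plan correspond to `pulse` (the last-level AND2 output) and `alarm` (the SR Latch output), and the alarm remains active until `reset()` is called.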

In addition to the detection plans described above, on-chip test circuits were designed to simulate the impact of one or more particles on these plans. Figure 6.2 presents the layout of the test module composed of two sub-blocks for current pulse generation in Q and QB nodes (previously presented in the detection cell schematic of Figure 5.2). The sub-block presented in Figure 6.2a is responsible for the pulse generation in all the selectable Q nodes, and the sub-block shown in Figure 6.2b is responsible for the pulse generation in all the selectable QB nodes. In Figure 6.2a, the circuits that compose each sub-block are highlighted: pulse generators (composed of edge detector circuits), a $3 \times 8$ decoder (for cell selection), and PMOS/NMOS structures for current insertion. The architecture and operation of these circuits are described in more detail in Section 6.1.1.

The core of the chip, composed of the detection plans and the test circuits, has a total area of 0.073 mm<sup>2</sup>. The complete test chip layout, considering the insertion of 23 PADs allowing for the input and output of data and power, has a total area of 1.21 mm<sup>2</sup> and is shown in Figure 6.3. The test chip is composed of 18 different pins distributed in the 23 PADs; the complete pin description is presented in Table 6.1. The pins can be divided as:

Figure 6.2: Layout of the on-chip fault injection test circuits composed of pulse generators,  $3 \times 8$  decoders, and PMOS/NMOS current insertion structures.



Source: From the author.

- 9× Input pins;
- 4× Output pins;
- 2× Power Supply pins;
- $1 \times$  Ground pin;
- $2 \times$  Photodiode pins (laser beam calibration).

The test chip was manufactured in a TQFP32 open cavity (removable lid) packaging, enabling radiation experiments, such as pulsed laser testing. Before proceeding with the laser tests, the initial objective is to verify the circuit operation and to observe possible differences in behavior according to the location of the electrically-induced SEUs/MCUs, which are inserted precisely in the internal nodes of the cells through the previously presented test structures, detailed in the next section.

Figure 6.3: The full test chip layout highlighting the core area with the designed circuit and the total area of the chip considering the pads.



Source: From the author.

# 6.1.1 On-Chip Test Circuits

The test structure comprises decoder and edge detector circuits to select the cell and generate the pulse, respectively. Two  $3 \times 8$  decoders are used to select the cells (and their respective internal nodes) that will be impacted. In total, 28 different assessment scenarios are available for the silicon fault injection. Four rising edge detector circuits were designed to generate the current pulse that will be inserted into the Q/QB nodes of the cells: two of them to carry out the insertion of positive faults (SET 010) and two for the insertion of negative faults (SET 101) in both plans. The outputs of the decoders and edge detection circuits are connected to series PMOS/NMOS transistor structures in order to convert the voltage pulses into current pulses. The PMOS/NMOS transistor structures are sized to insert a current pulse of the same amplitude into all selected cells. The outputs of these structures (the drains of the PMOS and NMOS transistors) are directly connected to the internal nodes of the selectable detection cells through metal tracks also designed to keep the same current pulse amplitude for all selectable cells. In Figure 6.4, this relationship between cell selection and insertion of the current pulse can be better observed.
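The cell selection performed by each $3 \times 8$ decoder can be illustrated with a short logic-level sketch. It assumes an active-high, one-hot decoder; the function name `decode_3to8` is hypothetical, and the mapping between decoder outputs and specific detection cells is not reproduced here.

```python
def decode_3to8(s2: int, s1: int, s0: int) -> list[int]:
    """Return the eight one-hot outputs of a 3-to-8 decoder (active high).

    s2 is the most significant select bit and s0 the least significant,
    following the S2/S1/S0 pin naming of the test chip.
    """
    for bit in (s2, s1, s0):
        if bit not in (0, 1):
            raise ValueError("select bits must be 0 or 1")
    index = (s2 << 2) | (s1 << 1) | s0
    return [1 if i == index else 0 for i in range(8)]


# Example: select bits S2=1, S1=0, S0=1 activate output number 5 only.
print(decode_3to8(1, 0, 1))
```

Two such decoders, one for the Q nodes and one for the QB nodes, are driven independently through the six select pins.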

| Pin                   | Direction | Value                                | Description                                                                                                |
|-----------------------|-----------|--------------------------------------|------------------------------------------------------------------------------------------------------------|
| S0_Q                  | Input     | 0V/3.3V                              | Bit 0 of the $3 \times 8$ Decoder 'Q'                                                                      |
| S1_Q                  | Input     | 0V/3.3V                              | Bit 1 of the $3 \times 8$ Decoder 'Q'                                                                      |
| S2_Q                  | Input     | 0V/3.3V                              | Bit 2 of the $3 \times 8$ Decoder 'Q'                                                                      |
| S0_QB                 | Input     | 0V/3.3V                              | Bit 0 of the $3 \times 8$ Decoder 'QB'                                                                     |
| S1_QB                 | Input     | 0V/3.3V                              | Bit 1 of the $3 \times 8$ Decoder 'QB'                                                                     |
| S2_QB                 | Input     | 0V/3.3V                              | Bit 2 of the $3 \times 8$ Decoder 'QB'                                                                     |
| RESET                 | Input     | 3.3V (rising edge)                   | Signal to reset and start the Latches.                                                                     |
| PULSE_GEN             | Input     | $0V \rightarrow 3.3V$ (square pulse) | Signal (#1) of the pulse generator circuits: voltage pulse.                                                |
| V_CONTROL             | Input     | 600 mV - 3.3 V                       | Signal (#2) of the pulse generator circuits to control the width of the generated pulse: constant voltage. |
| VDD                   | I/O       | 3.3V                                 | Power Supply of the main circuits ($4 \times 4$ and $8 \times 8$ detection plans).                         |
| VDD0                  | I/O       | 3.3V                                 | Power Supply of the test circuits (Decoders, pulse generators and buffers).                                |
| GND                   | I/O       | 0V                                   | Ground shared by the entire circuit.                                                                       |
| $SIGNAL_{4 \times 4}$ | Output    | N/A                                  | Alarm signal of the $4 \times 4$ detection plan.                                                           |
| $PULSE_{4 \times 4}$  | Output    | N/A                                  | Output (pulse) of the $4 \times 4$ detection plan (before the latch).                                      |
| $SIGNAL_{8 \times 8}$ | Output    | N/A                                  | Alarm signal of the $8 \times 8$ detection plan.                                                           |
| $PULSE_{8 \times 8}$  | Output    | N/A                                  | Output (pulse) of the $8 \times 8$ detection plan (before the latch).                                      |
| ANODE                 | N/A       | N/A                                  | Anode of the photodiode (to calibrate the laser beam).                                                     |
| CATHODE               | N/A       | N/A                                  | Cathode of the photodiode (to calibrate the laser beam).                                                   |

Table 6.1: Pin description of the test chip.

Source: From the author.

Figure 6.4: Schematic of the on-chip fault injection test circuits composed of two $3 \times 8$ decoders and the PMOS/NMOS test structures.



Source: From the author.

# **6.1.2 Test Board Features**

A Printed Circuit Board (PCB) was designed, manufactured, and associated with the test chip, allowing the insertion of transient faults in different positions of the designed detection plans. Figure 6.5a and Figure 6.5b present the designed PCB with the test chip and die photo highlighting the test circuits and detection plans, respectively. The PCB was designed aiming to provide multiple signal and power input options in order to carry out tests more accurately:

Figure 6.5: (a) Test Chip (highlighted in grey) with associated PCB, and (b) Die Photo with zoom in the test circuits and the detection plans.





1. **Power - VDD (3 options):** *Power supply of the test chip.*
   - Directly from the power source;
   - Battery;
   - Regulated power supply (2.4 V to 3.6 V).
2. **Signal IN\_Pulse\_Gen and RESET\_Latch (3 options):** *Input of the edge detector circuits (Pulse Generator) and the reset signal of the SR Latch.*
   - Directly from the signal generator source (SMA);
   - Button/DIP Switch;
   - External board (digital).
3. **Signal Pulse Width Controller (2 options):** *Input of the edge detector circuits (Pulse Generator). Possibility to vary the pulse width.*
   - Regulated voltage + Potentiometer;
   - Directly from the power source.
4. **Signal Decoders Input:** *Input of the two $3 \times 8$ decoders.*
   - 6-Position DIP Switch: 3 positions (bits) for each decoder.
5. **Signal SIGNAL<sub>4×4</sub> and SIGNAL<sub>8×8</sub> outputs:** *Alarm signal outputs of the two detection plans.*
   - 2× Header Pins (one for each output): direct connection with the oscilloscope probe.
6. **Signal PULSE<sub>4×4</sub> and PULSE<sub>8×8</sub> outputs:** *Pulse debug outputs of the two detection plans.*
   - 2× Header Pins (one for each output): direct connection with the oscilloscope probe.

# 6.2 Electrically-Induced SEU/MCU Test Methodology

Besides evaluating the circuit through traditional SEEs laser testing, this work also offers a different possibility of testing in silicon. Through the on-chip test structure previously presented and using a model typically used only for electrical simulations, it is possible to generate electrically-induced SEUs/MCUs in different positions of the detection plans. The test circuits allow the selection of a detection cell and a specific internal node (the Q or QB nodes presented in the detection cell schematic of Figure 5.2) within the detection plan to insert a transient fault through a current pulse. Unlike electrical simulations, this test method injects the faults in silicon, providing more accurate results and making it possible to verify the correct operation of the manufactured circuit.

Regarding the $4 \times 4$ plan, 8 of the 16 cells can be selected for silicon fault injection, 4 with the pulse insertion in the Q node, and 4 in the QB node. The outputs for this plan are represented by SIGNAL<sub>4×4</sub> and PULSE<sub>4×4</sub>. In this plan, it is possible to impact 1 or 2 cells simultaneously, simulating the Double-Bit Upset (DBU) effects. The cells that can be impacted were chosen in order to provide different possibilities and DBU formats: horizontal, diagonal, and vertical. Figure 6.6 presents the cells that can be impacted in the $4 \times 4$ plan, considering an example of three different possibilities for inserting the faults.

Figure 6.6: Selected cells for electrically-induced SEU/MCU fault injection in the $4 \times 4$ plan, considering different scenarios.



Source: From the author.

In the $8 \times 8$ plan, 10 of the 64 cells can be selected for silicon fault injection, 5 with pulse insertion in the Q node, and 5 in the QB node. The outputs for this plan are represented by SIGNAL<sub>8×8</sub> and PULSE<sub>8×8</sub>. In this plan, it is possible to impact 1, 2, 3, or 4 cells simultaneously, simulating the DBU, Triple-Bit Upset (TBU), and Quadruple-Bit Upset (QBU) effects. Figure 6.7 shows the detection cells that can be impacted in the $8 \times 8$ plan, considering an example of three different scenarios.

# 6.3 Laser Test Methodology

In addition to the initial validation of the circuit through electrical measurements, SEEs laser testing was chosen in order to provide a more realistic radiation experiment with more accurate results (MELINGER et al., 1994). The pulsed laser experiments were performed at the ATLAS laser facility of the IMS Laboratory, University of Bordeaux. Figure 6.8a presents a schematic of the experimental setup. The front-side SEEs laser testing method was used, and the tests were performed at room temperature. The laser wavelength was 1064 nm (single-photon absorption) with a pulse duration of 30 ps. The focused laser spot size was defined by its Full Width at Half Maximum (FWHM), providing a spot radius of about 4 $\mu$m at the target plan. The test chip was mounted on a three-dimensional motorized stage, and a scan was performed in the $8 \times 8$ plan of the test chip. A photo of the laser test setup is presented in Figure 6.8b.

Figure 6.7: Selected cells for electrically-induced SEU/MCU fault injection in the $8 \times 8$ plan, considering different scenarios.

It is important to highlight that the laser wavelength chosen to perform the experiments presented in this work has already been used to induce SEUs in commercial SRAMs (DARRACQ et al., 2002; FARAUD et al., 2011). Also, front-side laser testing was used due to the PCB incompatibility for back-side laser testing. The experiments were carried out in the single-shot test mode, in which only one laser pulse was used to induce an event.



Figure 6.8: Laser test setup: (a) Schematic of the test setup considering the equipment used, and (b) Photo of the test setup highlighting the frontside approach.


Source: From the author.

The main objective was to verify if the circuit could detect the impact of a laser beam, maintaining its correct operation. As previously mentioned, the manufactured circuit is not composed of traditional SRAM bit cells or their peripheral circuits. The focus of the performed laser tests is the detection cells. Unlike the initial tests, in which it was not possible to vary the intensity of the inserted current pulse, in the laser tests it is possible to vary the laser pulse energy. This makes it possible to analyze, more precisely, a possible difference in sensitivity between the detection cells according to their position in the plan. The pulse width and amplitude observed at the output of the circuit are not evaluated in this case; instead, the threshold energy of the laser pulse that causes an alarm signal is evaluated for each row of the plan. The energy delivered by the laser to the test circuit was varied using an attenuator, represented in the schematic of Figure 6.8a. The tests were performed with different energy values ranging from 0.32 nJ to 2.1 nJ.

A laser spot large enough to cover the area of a single cell was used to carry out the experiments. Ideally, smaller laser spots are used to impact only the sensitive area of the cells and reduce the reflection on metal tracks. However, at this stage, the goal of the work is not to determine the most sensitive areas of the detection cell but to evaluate the detection plan as a whole. These cells will be interleaved on a traditional SRAM plan in a more advanced technology, in which there is a higher probability of a single particle impacting more than one cell simultaneously. Therefore, using a laser spot with a size close to the cell area allows a more realistic analysis of the circuit behavior.

It is important to point out that it was not possible to simulate the MCU effects using this test methodology. To impact more than one cell of the detection plan simultaneously would require using a laser spot of at least two times the size of the one used in this work. This increase in the laser spot would significantly reduce the energy that is deposited on the chip, and no event could be observed.

### 6.3.1 Single Cell

Besides verifying the expected behavior after the laser beam impact, the first laser test aimed to determine the circuit's pulsed laser threshold energy. A cell in row 8 of the  $8 \times 8$  detection plan was selected for the evaluation. In the top-down direction, this is the last row of the detection plan and, therefore, the closest to the detection logic, reducing the capacitance that must be driven in detection and refresh lines. A single-shot laser beam was inserted in the center of the chosen cell, varying the laser energy from 0.32 nJ until a change of state was observed in the circuit.

## 6.3.2 Scan Test

In the second laser test, several scans of the $8 \times 8$ plan were performed for different laser energy values ranging between 0.32 nJ and 2.1 nJ. For each scan, the cells were scanned using a 4 µm laser spot in 10 µm steps, inserting a single-shot laser beam at the center of each of the 64 cells in the plan. Due to the size of the spot, no differences in behavior were observed when using smaller steps, which insert more than one laser beam into the same cell. The main objective of this test was to verify a possible sensitivity difference of the cells according to their position in the detection plan, more specifically, whether the cells show a reduction in sensitivity as their distance from the detection logic increases.

### 6.4 Electrically-Induced SEU/MCU Measurements

By design, all selectable detection cells are impacted after injecting one or more simultaneous faults, representing both SEUs and MCUs. For both detection plans evaluated, it is possible to observe a negative voltage pulse at the circuit outputs (PULSE<sub>4×4</sub> and PULSE<sub>8×8</sub>), representing the bit-flip in the cell and the immediate recovery of its previously stored value. The capture of this pulse and the generation of the alarm signal at the SR Latch output (SIGNAL<sub>4×4</sub> and SIGNAL<sub>8×8</sub>) are also observed, validating the correct operation of the circuit. Considering the pulse amplitude and width metrics, it is possible to compare the different scenarios to obtain the circuit behavior concerning the impact location, detection plan size, and event type.

## 6.4.1 4×4 Detection Plan

Figure 6.9 presents the two outputs of the $4 \times 4$ plan after a fault injection (SEU) in the Q node of the cell positioned in the 1<sup>st</sup> row/column of the plan. Through the PULSE<sub>4×4</sub> output, it is possible to observe the impact suffered by the cell, which reduces the detection line voltage to 0.698 V (below VDD/2), and the recovery of the stored value, indicated by the return of the detection line to VDD. At the same time that the PULSE<sub>4×4</sub> voltage drops below VDD/2, the alarm signal is sent, represented by the SIGNAL<sub>4×4</sub> transition ('0' to '1') with a rise time of 3.66 ns. The alarm signal remains active until a reset signal is sent.

Figure 6.9: Experimental circuit behavior after an electrically-induced SEU impact on the $4 \times 4$ detection plan.

Source: From the author.

The behavior observed in the graph is repeated in all fault injections performed in this work. However, sensitivity differences concerning the cells' position in the plan are mainly observed through the pulse amplitude and width, presented in Table 6.2 considering the 4×4 plan measurements. The cells positions in the plan are presented in the format  $Q_{ij}/QB_{ij}$ , where  $i = row_index$  and  $j = column_index$ .

| Scenario | PULSE<sub>4×4</sub> Amplitude (V) | PULSE<sub>4×4</sub> Width (ns) | PULSE<sub>4×4</sub> Lowest Level (V) | SIGNAL<sub>4×4</sub> Rise Time (ns) |
|----------|-----------|--------|--------------|-----------|
| 1 - Q11  | 2.602     | 12.365 | 0.698        | 3.664     |
| 2 - Q21  | 2.670     | 12.773 | 0.630        | 3.763     |
| 3 - Q33  | 2.690     | 13.282 | 0.610        | 3.663     |
| 4 - Q43  | 2.719     | 13.381 | 0.581        | 3.721     |
| 5 - QB12 | 2.631     | 12.703 | 0.669        | 3.652     |
| 6 - QB22 | 2.680     | 13.269 | 0.620        | 3.758     |
| 7 - QB34 | 2.719     | 13.702 | 0.581        | 3.745     |
| 8 - QB44 | 2.729     | 14.044 | 0.571        | 3.720     |

Table 6.2: 4×4 Plan Measurements - Complete SEU Analysis.

The most significant sensitivity difference is noticed in comparing the cells placed at the top and the bottom of the detection plan. Considering faults inserted in the Q node, the pulse amplitude and width observed at the output increased 4.5% and 8.22%, respectively, for the cells positioned closer to the detection logic. The pulse width is 10.56% longer for cells positioned at the bottom of the plan, considering the impact on the QB node. The same behavior concerning the positioning of the cells is maintained in the MCUs analysis. However, an increase in sensitivity is observed considering the impact of horizontal and vertical DBUs, compared to an SEU inserted in the same row of the plan, i.e., without the influence of cell positioning. The insertion of a DBU in the lowest row of the plan represents the most sensitive scenario among all the analyses carried out. In this scenario, we have a pulse with 2.81 V amplitude and 14.84 ns width, 5.56% wider than the most sensitive scenario pulse considering only SEUs.
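The percentages quoted for the top and bottom cells follow directly from the values in Table 6.2. The short sketch below reproduces this arithmetic; the variable and function names are illustrative, and only the four scenarios needed for the comparison are included.

```python
# PULSE 4x4 measurements from Table 6.2: scenario -> (amplitude in V, width in ns)
pulse_4x4 = {
    "Q11": (2.602, 12.365),   # Q node, top of the plan
    "Q43": (2.719, 13.381),   # Q node, bottom of the plan
    "QB12": (2.631, 12.703),  # QB node, top of the plan
    "QB44": (2.729, 14.044),  # QB node, bottom of the plan
}


def percent_increase(top: float, bottom: float) -> float:
    """Relative increase of the bottom-row value with respect to the top row."""
    return 100.0 * (bottom - top) / top


q_amplitude = percent_increase(pulse_4x4["Q11"][0], pulse_4x4["Q43"][0])
q_width = percent_increase(pulse_4x4["Q11"][1], pulse_4x4["Q43"][1])
qb_width = percent_increase(pulse_4x4["QB12"][1], pulse_4x4["QB44"][1])

print(f"Q amplitude: {q_amplitude:.2f}%")
print(f"Q width:     {q_width:.2f}%")
print(f"QB width:    {qb_width:.2f}%")
```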

### 6.4.2 8×8 Detection Plan

All the behaviors observed in the analysis of the $4 \times 4$ plan are also present in the $8 \times 8$ plan. In this plan, the main goal is to verify the impact of the increase in the number of cells per column on the circuit susceptibility. Considering the SEU impact, an average reduction of approximately 14% in the pulse amplitude and an average increase of 17% in the SIGNAL<sub>8×8</sub> rise time were verified in comparison with the $4 \times 4$ plan. This behavior shows the impact of increasing the detection line capacitance as the number of cells per column increases. Besides the reduction in the cells' sensitivity, indicated by the lower amplitude of the PULSE<sub>8×8</sub> output pulses, the alarm signal delay also increases, as indicated by the longer SIGNAL<sub>8×8</sub> rise time.

Considering the five different MCU scenarios (the three presented in Figure 6.7 plus two vertical DBUs inside the highlighted QBU, which are defined as DBU#1 and DBU#2), it was found that an increase in the number of cells impacted simultaneously in the same column provides an increase in the circuit susceptibility. In Figure 6.10, the two vertical DBUs in the 5<sup>th</sup> and 6<sup>th</sup> rows (DBU#1) and in the 7<sup>th</sup> and 8<sup>th</sup> rows (DBU#2) were evaluated considering the same column. The DBU#2, closest to the detection circuit, generated a pulse with 2.52 V amplitude, 3.61% greater than the DBU#1. When the same four cells were impacted simultaneously, representing a QBU, the pulse amplitude increased by 3.84% in comparison to the DBU#2. This increase in sensitivity with the number of simultaneously impacted cells demonstrates the MCU robustness of the proposed method. However, it is also important to note that even considering this QBU in the $8 \times 8$ plan, a simple DBU in the $4 \times 4$ plan (the vertical DBU highlighted in Figure 6.6) has greater sensitivity, reinforcing the impact of increasing capacitance on the detection lines.

Figure 6.10: Measurements of the MCU effects in the $8 \times 8$ plan and the comparison with the DBU impact on the $4 \times 4$ plan.

Source: From the author.

#### 6.5 Laser Results

After validating the circuit through electrically-induced SEU/MCU experiments, the circuit was also validated through SEEs laser testing. The results obtained, divided into the two stages of experiments detailed in Section 6.3, are presented below.

#### 6.5.1 Single Cell

For the initial laser energy of 0.32 nJ, defined for carrying out the tests, no events were observed, and the circuit remained in its initial state. The laser energy was gradually increased, and a new shot was performed in the selected cell for each new value. An event could only be observed at the output of the circuit from a laser energy of 0.61 nJ onwards, confirming the correct operation of the circuit, now considering a more realistic radiation experiment. Therefore, the laser threshold energy value for the $8 \times 8$ detection plan was measured as 0.61 nJ. Figure 6.11 shows a general view of the different states of the circuit before, during, and after a laser shot in the circuit, activating the alarm signal.





Source: From the author.

Figure 6.12 presents in detail the circuit behavior after the impact of a laser beam with enough energy to cause a change of state in the selected cell. In the first 12 ms, the circuit is in its initial state, with the alarm signal (SIGNAL<sub>8×8</sub>) deactivated and the detection lines supplied by VDD (PULSE<sub>8×8</sub>). Immediately after the laser impact, the voltage drop at the circuit output can be observed, signaling the change of state of the impacted cell. The alarm signal is not activated immediately, and the circuit is in the transition state for about 35 ms. After this transition, the circuit reaches its final state. The alarm signal is activated, and the detection lines start to conduct VDD again, indicating the refresh of the detection cells.

Figure 6.12: Experimental circuit behavior after a laser beam impact on the 8th row of the $8 \times 8$ detection plan.

Source: From the author.

Firstly, it is important to highlight the difference in the transition time (change in order from ns to ms) of the signals compared to the electrical measurements performed previously. It is also possible to notice that the alarm signal activation does not happen immediately after  $PULSE_{8\times8}$  output falls, but only with the cell refresh. Several factors can lead to these differences in comparing the two types of tests presented in this work. The relation between the laser pulse energy and exposure time and the intensity of the electrically inserted current pulse is a possible factor. Another important possible factor responsible for these observed differences is the area impacted according to the laser spot size compared to the current pulse inserted only in the internal nodes of the detection cell.

With the use of a laser spot that covers almost the entire area of the detection cell, it is not possible to accurately determine the regions where the laser energy has been deposited. Nor can it be determined whether more than one cell was impacted simultaneously, for example, due to charge-sharing mechanisms. These factors can produce signals with different delays entering the detection logic, which, due to its architecture, can lead to an undefined state at the SR Latch output (race condition), a possible explanation for the difference in the transition time between the two types of tests performed.

# 6.5.2 Scan

As previously presented in the single cell analysis, the laser threshold energy of the circuit is 0.61 nJ. As expected, the scan made it possible to confirm that all cells of the same row, when individually impacted, have the same sensitivity. Considering a laser energy of 0.61 nJ, not only is the analyzed cell impacted, but all neighboring cells belonging to the same row (row 8) are also impacted at the same energy, considering SEUs.

In the scan performed in the $8 \times 8$ plan, shown in Figure 6.13, it was observed that for laser energy from 0.61 nJ to 0.75 nJ, the two rows at the bottom of the detection plan are impacted. For values greater than 0.75 nJ and smaller than 0.92 nJ, row 6 is also impacted. Increasing the laser energy up to 1.12 nJ, rows 4 and 5 are also impacted. Rows 2 and 3 are also impacted for values between 1.12 nJ and 1.34 nJ, and finally, the entire $8 \times 8$ detection plan is impacted for laser energy values greater than or equal to 1.34 nJ. This behavior agrees with the results obtained previously, showing the reduction of cell sensitivity according to the position in the detection plan. The greater the distance of a row to the detection logic, the lower the sensitivity of the cells, reaching a difference of about 120% in threshold energy between the two ends of the evaluated detection plan (0.61 nJ for row 8 against 1.34 nJ for row 1).
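The row-by-row behavior reported for the scan can be condensed into a small lookup, sketched below. The per-row threshold values are taken from the energy intervals given in the text, with each interval edge treated as an inclusive lower bound, which is a simplifying assumption; the names `ROW_THRESHOLD_NJ`, `impacted_rows`, and `threshold_difference_percent` are hypothetical.

```python
# Approximate lowest laser energy (nJ) at which each row of the 8x8 plan
# was reported as impacted. Row 8 is the row closest to the detection logic.
# Assumption: the interval edges quoted in the text are inclusive lower bounds.
ROW_THRESHOLD_NJ = {
    8: 0.61, 7: 0.61,
    6: 0.75,
    5: 0.92, 4: 0.92,
    3: 1.12, 2: 1.12,
    1: 1.34,
}


def impacted_rows(energy_nj: float) -> list[int]:
    """Rows whose cells trigger the alarm at the given laser energy."""
    return sorted(row for row, e_th in ROW_THRESHOLD_NJ.items() if energy_nj >= e_th)


def threshold_difference_percent(far_row: int = 1, near_row: int = 8) -> float:
    """Relative increase in threshold energy between two rows of the plan."""
    near = ROW_THRESHOLD_NJ[near_row]
    return 100.0 * (ROW_THRESHOLD_NJ[far_row] - near) / near


print(impacted_rows(0.61))
print(impacted_rows(1.00))
print(f"{threshold_difference_percent():.1f}%")
```

The last line reproduces the difference of about 120% between the two ends of the plan.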

The behavior observed through the scan performed in the $8 \times 8$ detection plan can be better detailed through the graph of cross-section versus laser pulse energy. The laser cross-section is defined as

$$\sigma_{laser}\left(\frac{cm^2}{dev}\right) = \frac{N_{SEU}}{N_{pulse}} \times S_{dev}(cm^2), \tag{6.1}$$

where $N_{SEU}$ is the total number of SEUs recorded at a given laser energy, $N_{pulse}$ is the total number of laser pulses inserted on the DUT, and $S_{dev}$ is the area of the DUT (DARRACQ et al., 2002). As the experiments were performed in the 8×8 detection plan, $S_{dev} = S_{8 \times 8}$.
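As a worked illustration of Equation 6.1, the sketch below evaluates the cross-section for one scan of the plan. It rests on two assumptions that are not stated in this form in the text: $S_{8 \times 8}$ is taken as the 10,153 $\mu$m<sup>2</sup> layout area given in Section 6.1, and $N_{SEU}$ is counted as the number of shots that trigger the alarm in a 64-shot scan. The resulting numbers are illustrative and are not the data points of the experimental curve.

```python
UM2_TO_CM2 = 1e-8                 # 1 um^2 expressed in cm^2
S_8X8_CM2 = 10_153 * UM2_TO_CM2   # assumed device area: 8x8 plan layout area


def laser_cross_section(n_seu: int, n_pulse: int, s_dev_cm2: float) -> float:
    """Equation 6.1: sigma = (N_SEU / N_pulse) * S_dev, in cm^2 per device."""
    if n_pulse <= 0:
        raise ValueError("n_pulse must be positive")
    if not 0 <= n_seu <= n_pulse:
        raise ValueError("n_seu must be between 0 and n_pulse")
    return n_seu / n_pulse * s_dev_cm2


# One scan inserts one shot per cell (64 shots). At the threshold energy only
# the two bottom rows (16 cells) respond; at saturation all 64 cells respond.
sigma_threshold = laser_cross_section(16, 64, S_8X8_CM2)
sigma_saturation = laser_cross_section(64, 64, S_8X8_CM2)

print(f"{sigma_threshold:.3e} cm^2/dev")
print(f"{sigma_saturation:.3e} cm^2/dev")
```

Under these assumptions, the saturation value equals the device area itself, since every pulse produces an event.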

Figure 6.14 presents the laser cross-section for the $8 \times 8$ detection plan. The blue squares present the experimental laser beam data, and a Weibull distribution was used to fit the results generating the cross-section curve. From the laser threshold energy ($E_{th}$), the graph presents a rapid rise until reaching the saturation point ($\sigma_{SAT}$), at which the total area of the 8×8 detection plan is impacted. This behavior is already expected in the cross-section reliability analysis, representing the progressive increase of the impacted circuit area according to slight energy variations.

Figure 6.13: Scan performed in the $8 \times 8$ plan for different laser energy values.

Source: From the author.
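The closed form of the fitted curve is not given in this chapter. As an illustration only, the sketch below uses the four-parameter Weibull function commonly adopted for cross-section curves, $\sigma(E) = \sigma_{SAT}\,[1 - e^{-((E - E_{th})/W)^{s}}]$ for $E > E_{th}$ and zero otherwise. The threshold energy (0.61 nJ) comes from the measurements above; the width `W_NJ` and shape `S` values are hypothetical placeholders rather than the fitted parameters of this work, and the saturation value is normalized to 1.

```python
import math

E_TH_NJ = 0.61    # measured laser threshold energy (nJ)
SIGMA_SAT = 1.0   # saturation cross-section, normalized (assumption)
W_NJ = 0.4        # hypothetical width parameter (nJ), not a fitted value
S = 1.5           # hypothetical shape parameter, not a fitted value


def weibull_cross_section(energy_nj: float,
                          sigma_sat: float = SIGMA_SAT,
                          e_th: float = E_TH_NJ,
                          width: float = W_NJ,
                          shape: float = S) -> float:
    """Weibull cross-section model: zero up to e_th, then rising to sigma_sat."""
    if energy_nj <= e_th:
        return 0.0
    exponent = ((energy_nj - e_th) / width) ** shape
    return sigma_sat * (1.0 - math.exp(-exponent))


# Evaluate the model at the laser energies discussed in the scan test.
for energy in (0.32, 0.61, 0.75, 0.92, 1.12, 1.34, 2.1):
    print(f"{energy:.2f} nJ -> {weibull_cross_section(energy):.3f}")
```

With any positive width and shape, the model reproduces the qualitative behavior described above: no events below $E_{th}$, a rapid rise just above it, and saturation at $\sigma_{SAT}$.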

## 6.6 Applicability of the Method Proposed in Advanced Technology Nodes

It is essential to highlight that, due to the factors already presented above, the mature 350 nm technology was used to demonstrate and validate the proposed method in silicon. The idea presented in this work is not only functional in mature technology nodes but also highly relevant for advanced Very Large-Scale Integration (VLSI) nodes, since the method does not have an event detection limitation, as other techniques available in the literature do. That is, the new challenges arising from the increase in MCUs in modern nodes are beneficial to the proposed method, because as the number of events in a memory plan increases, the probability of detecting an event with the method presented in this work also increases. Also, with the increase in MCUs, it is possible to reduce the percentage of detection cells, thus reducing the area penalty while still maintaining a high detection rate.

Figure 6.14: Experimental and Weibull fitted cross-section of the $8 \times 8$ plan.

Source: From the author.

A critical point of the presented method is related to the size of the memory plan, regardless of the technology used. Considering the adoption of the method proposed in this work in a commercial-size SRAM, as we will see in the next chapter, one could expect the presented behavior to lead to significant sensitivity reductions in the whole plan, decreasing the radiation robustness of the method. However, this challenge can be mitigated by inserting buffers connected to the detection lines at the bottom of the plan to drive their capacitive load. Only a variation in the alarm signal delay would then be observed, keeping all cells in the plan with the same sensitivity.

### **7 INTERLEAVED DATA/DETECTION SRAM**

The new method proposed in this thesis was validated on silicon considering a detectors-only plan, as presented in Chapter 6. The detection method is now applied in a new memory circuit with characteristics closer to current commercial memories. Besides confirming the correct operation of the method in an interleaved SRAM designed with data and detection cells, the primary goal is to verify the behavior of the method in a more advanced technology and the detection delay in a commercial-sized memory plan. The main contributions presented in this chapter are published in (BRENDLER et al., 2023a).

### 7.1 The Design

A 256×256 interleaved data/detection SRAM was designed in the 28 nm FD-SOI technology from ST Microelectronics. Among the different models available in this Process Design Kit (PDK), this work considers the Regular Threshold Voltage (RVT) model with a nominal supply voltage of 1 V. Of the 65,536 cells available in the SRAM core, it was defined that 50% would be used only for detection. This reduces the storage capacity of the memory from 64 kb to 32 kb. In the current circuit, the data cells of a conventional SRAM are also considered in addition to the detection cells, so all peripheral circuits that make up an SRAM must also be designed. Before presenting the design of the complete SRAM, the layouts of each circuit that composes the memory are briefly presented. The transistor sizing of the data cells, the detection cells, and the main peripheral circuits of the SRAM is presented in Sections 7.1.1, 7.1.2, and 7.1.3, respectively. The decoders and the detection logic use logic gates sized through logical effort, based on an inverter with dimensions $W_{PMOS} = 220$ nm and $W_{NMOS} = 80$ nm. This sizing follows the ratio $W_{PMOS}/W_{NMOS} = 2.75$ (to obtain balanced rise and fall propagation times), obtained through the characterization of PMOS/NMOS RVT transistors in the 28 nm FD-SOI technology.

#### 7.1.1 Data Cell

Figure 7.1a presents the electric schematic of the 6T-SRAM bit cell with the transistor sizing adopted. The main characteristics of this cell were already presented in Section 2.1.1. The transistor sizing was chosen to obtain stable read/write operations through a balanced Static Noise Margin (SNM). The ratios $CR = (W/L)_{M1,M2}/(W/L)_{M5,M6} = 1.2$ and $PR = (W/L)_{M3,M4}/(W/L)_{M5,M6} = 0.8$ are in accordance with the definitions present in the literature and previously presented in Section 2.1.1 (RABAEY; CHANDRAKASAN; NIKOLIC, 2002; SINGH; MOHANTY; PRADHAN, 2012). The 6T-SRAM layout, shown in Figure 7.1b, was designed using the rectangular-diffusion topology (ALORDA et al., 2014). This topology is commonly used for high-density SRAMs, reducing the bit cell area and process variability, besides improving the SNM (MEDEROS, 2017).

Figure 7.1: 6T-SRAM Bit Cell: (a) Electric Schematic and (b) Layout.





#### 7.1.2 Detection Cell

As previously described in Section 5.1, in the proposed detection cell, the NMOS access transistors of the 6T-SRAM bit cell are replaced by pairs of PMOS/NMOS transistors in charge of detecting and refreshing the detection cell after a state change. Figure 7.2a and Figure 7.2b present the electric schematic and the layout of the detection cell, respectively. After validating the method and, consequently, the detection cells used in the previous analysis, the sizing of the detection cell in the SRAM design maintained the same standards but adapted to the characteristics of the FD-SOI technology adopted in this analysis. Three different values of W are used for the eight transistors of the detection



Figure 7.2: Detection Cell: (a) Electric Schematic and (b) Layout.



cell. Electrical simulations confirmed the trade-off observed in the previous analysis regarding transistors P0/N0. As the width of these transistors increases, their current capacity increases, and the alarm signal delay decreases. However, wider transistors also add capacitance to the DLs, which increases the alarm signal delay. Therefore, an optimal width W = 240 nm was defined for the pair of transistors P0/N0. It is important to remember that, to interleave the data and detection cells in the memory array, both cells have the same area, which is delimited by the sizing used in the detection cell transistors.

#### 7.1.3 Main SRAM Peripherals

The architectures of the SRAM peripheral circuits were already detailed in Section 2.1. Figure 7.3, Figure 7.4, and Figure 7.5 present the electric schematic and layout of the main SRAM peripherals: Pre-Charge, Sense Amplifier, and Write Driver, respectively. As the data part of the SRAM designed and presented in this work has no constraints such as those of a high-performance or low-power design, the architecture of these three peripheral circuits follows the circuits most used in the SRAM designs presented in the literature (RABAEY; CHANDRAKASAN; NIKOLIC, 2002; SINGH; MOHANTY; PRADHAN, 2012). The layouts of the peripheral circuits are designed to reduce the total area of the SRAM and facilitate the connection with the other blocks. The sizing of the three main peripherals was based on the sizing already used for the same architectures in the 28 nm FD-SOI technology from ST Microelectronics (MEDEROS, 2017). The Pre-Charge circuit layout has the same width as the data and detection cells in order to fit perfectly on top of each column of the memory plan. The Sense Amplifier and Write Driver circuits are designed to optimize area use.


Figure 7.3: Pre-Charge circuit: (a) Electric Schematic and (b) Layout.



Figure 7.4: 7T Latch-Type Sense Amplifier: (a) Electric Schematic and (b) Layout.



Source: From the author.





Source: From the author.

#### 7.1.4 8×256 Row Decoder

In this work, the static AND/NAND-type structure was chosen mainly due to its lower power consumption and its architecture, which fits more easily with the memory core layout (RABAEY; CHANDRAKASAN; NIKOLIC, 2002; SINGH; MOHANTY; PRADHAN, 2012). The decoder was designed using the pre-decoding technique, which decodes smaller groups of address bits first and uses a single gate for each of the shared terms of the decoder's logic function. The $8 \times 256$ decoder is composed of two sub-blocks of $4 \times 16$ decoders, each of which is composed of two sub-blocks of $2 \times 4$ decoders. Only 2-input AND gates and inverters are used in the decoder design. Due to the size of the layout, Figure 7.6 shows only a part of the $8 \times 256$ decoder layout, in which it is already possible to observe the use of sub-blocks in the design.
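The hierarchical pre-decoding described above can be illustrated with a behavioural model. The following Python sketch is only a functional illustration, assuming active-high one-hot outputs and an MSB-first split of the address; the function names are hypothetical and are not taken from the designed circuit.

```python
def decode_2to4(a1, a0):
    """2x4 decoder built from inverters and 2-input ANDs (one-hot output)."""
    n1, n0 = 1 - a1, 1 - a0  # inverters
    return [n1 & n0, n1 & a0, a1 & n0, a1 & a0]


def decode_4to16(bits):
    """4x16 decoder: two 2x4 sub-blocks combined by 2-input AND gates."""
    high = decode_2to4(bits[0], bits[1])
    low = decode_2to4(bits[2], bits[3])
    return [h & l for h in high for l in low]


def decode_8to256(address):
    """8x256 decoder: two 4x16 sub-blocks combined by 2-input AND gates."""
    bits = [(address >> i) & 1 for i in range(7, -1, -1)]  # MSB first
    high = decode_4to16(bits[:4])
    low = decode_4to16(bits[4:])
    # Each pre-decoded term is computed once and shared by 16 word lines.
    return [h & l for h in high for l in low]


word_lines = decode_8to256(130)
print(word_lines.index(1), sum(word_lines))  # 130 1
```

Each pre-decoded output is generated once and reused by many final AND gates, which is the sharing of terms mentioned in the text.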



Figure 7.6: A part of the  $8 \times 256$  Row Decoder layout.

Source: From the author.

#### 7.1.5 5×32 Column Decoder

In the designed 32 kbit SRAM, there are 256 cells per row (50% data, 50% detection), which were divided into 32 interleaved 8-bit words, as shown in Figure 7.7a. As already presented in Section 2.1.4, the bit interleaving technique is used to increase the robustness of the memory, reducing the probability that a single particle impacts bits belonging to the same memory word and, thus, facilitating the detection and correction of errors. This technique was also used in the design of the SRAM presented in this work, aiming at a future extension of the proposed method, regarding both the detection logic (being able to provide the region of the array in which the fault occurred) and the use of the method along with other existing techniques, such as ECC (which requires bit interleaving to improve its error detection/correction capacity). It is important to remember that 50% of the cells in the memory plan are exclusive for detection; that is, it is not possible to store data in these cells. Therefore, the cells were organized so that, in even rows, data can only be stored in cells located in even columns and, in odd rows, only in cells located in odd columns.

Figure 7.7: (a) Memory organization: 32 interleaved 8-bit words and (b) Selection of the 8-bit Word 0 through the $5 \times 32$ Column Decoder.
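The even/odd organization described above corresponds to a checkerboard pattern. A minimal Python sketch of this placement rule is given below, assuming zero-based row and column indices; the helper name `is_data_cell` is hypothetical.

```python
ROWS, COLS = 256, 256


def is_data_cell(row, col):
    """Data cells sit where row and column parity match; the rest detect."""
    return row % 2 == col % 2


data_cells = sum(is_data_cell(r, c) for r in range(ROWS) for c in range(COLS))
data_cells_row0 = sum(is_data_cell(0, c) for c in range(COLS))

# Total data bits, data bits in Row 0, and 8-bit words stored in Row 0.
print(data_cells, data_cells_row0, data_cells_row0 // 8)  # 32768 128 16
```

Under this rule, every data cell has detection cells as its horizontal and vertical neighbors, and each row holds 128 data bits, that is, 16 of the 8-bit words.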

To select 1 of the 32 words when a word line is selected, a $5 \times 32$ column decoder is designed. One word is selected through the five selection bits; more specifically, eight memory cells, each with two bit lines (BL/BLB), need to be accessed. Each of the eight pairs of bit lines is connected to the Sense Amplifier and Write Driver circuits to perform the read/write operations of the 8-bit word, as presented in Figure 7.7b. The column decoder combines two architectures: PTL MUX with a $2 \times 4$ decoder and a Tree decoder structure (RABAEY; CHANDRAKASAN; NIKOLIC, 2002). A part of the $5 \times 32$ column decoder layout is presented in Figure 7.8.



Figure 7.8: A part of the  $5 \times 32$  Column Decoder layout.

Source: From the author.

#### 7.1.6 Detection Logic Circuit

The detection logic is designed as a tree structure composed of NAND2/NOR2 logic gates and an SR Latch circuit. Due to its size, Figure 7.9 shows just part of the detection logic circuit layout. For every two columns of the SRAM core, the DLs are connected to a NAND2 gate placed on the first level of this circuit. The outputs of these same gates are used both to send the signal back to the cell plan through the RLs and to propagate it to the other NOR2 and NAND2 gates in the circuit, producing a single alarm signal at the SR Latch output.
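The reduction performed by such a tree can be sketched behaviourally. The Python model below is an illustration only and rests on assumptions not stated in the text: one DL per column, DLs idling at logic 1 and pulled to 0 by an upset detection cell, strictly alternating NAND2/NOR2 levels, and a latch with a hypothetical reset input. The RL feedback path is not modelled.

```python
def nand2(a, b):
    return 1 - (a & b)


def nor2(a, b):
    return 1 - (a | b)


def detection_tree(detection_lines):
    """Return 1 if any DL is low, using alternating NAND2/NOR2 levels."""
    n = len(detection_lines)
    if n == 0 or n & (n - 1):
        raise ValueError("number of DLs must be a power of two")
    level = list(detection_lines)
    use_nand = True  # first level uses NAND2 gates, as in the text
    while len(level) > 1:
        gate = nand2 if use_nand else nor2
        level = [gate(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        use_nand = not use_nand
    # After a NAND level an event is active-high, after a NOR level active-low.
    event_is_active_high = not use_nand
    return level[0] if event_is_active_high else 1 - level[0]


class AlarmLatch:
    """Behavioural SR latch: the alarm stays set until explicitly reset."""

    def __init__(self):
        self.signal = 0

    def update(self, set_, reset=0):
        if reset:
            self.signal = 0
        elif set_:
            self.signal = 1
        return self.signal


lines = [1] * 256
lines[200] = 0  # assumed effect of one upset detection cell in column 200
latch = AlarmLatch()
print(latch.update(detection_tree(lines)))  # 1
```

By De Morgan's laws, alternating NAND2 and NOR2 levels implement a wide OR of the event condition with inverting gates only, which is why the polarity has to be tracked level by level in the model.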

#### 7.1.7 256×256 Interleaved Data/Detection SRAM

With the design of the data and detection cells, in addition to the design of all SRAM peripheral circuits, the next step is to perform the placing and routing of all these



Figure 7.9: A part of the Detection Logic layout.



sub-blocks. Regarding the SRAM core layout, the data and detection cell layouts were instantiated and placed according to the chosen size $(256 \times 256)$ and percentage of detection cells (50%). Due to PDK design rules, the SRAM core layout was divided into two sub-blocks of $128 \times 256$ cells each to better distribute the supply voltage across the circuit through the guard rings.

Figure 7.10a presents the complete SRAM block diagram, highlighting the relationship between the core and the peripheral circuits. The difference between this radiation-hardened SRAM that implements the new method and a conventional SRAM is highlighted in orange: the detection cells and the detection logic circuit. The layout of the SRAM core, with a total area of 0.061 mm<sup>2</sup>, is shown in Figure 7.10b, highlighting how the data and detection cell layouts presented in Sections 7.1.1 and 7.1.2 were interleaved to compose the SRAM array.

### 7.2 Test Methodology

For this complete SRAM with interleaved detection cells, tests based on post-layout simulation are performed. Despite being physically interleaved, the data and detection cells have separate logical connections and can be tested separately. Before verifying the correct operation of the detection logic and detection cells, it is important to verify the operation of the SRAM itself, considering only the data cells. The performance of the designed SRAM is not a constraint in this work. However, all the experiments were performed considering a maximum frequency of 1 GHz.

Figure 7.10: (a) 256×256 Interleaved Data/Detection SRAM block diagram and (b) the SRAM core layout.

Source: From the author.

#### 7.2.1 SRAM Data Cells Test

Several types of tests at different levels can be performed to characterize an SRAM-type memory. In this work, the SRAM is tested in its complete form, with all its peripherals, simulating its communication with the processor. The initial goal is to characterize the memory through consecutive read and write operations, allowing its correct operation to be observed. Afterward, a more comprehensive test, widely used to characterize commercial SRAMs and known as the March Test (GOOR, 1993), is carried out, specifically the March-C test. March tests offer the benefit of achieving high fault coverage while maintaining a test time typically proportional to the memory size, making them a feasible option from an industrial standpoint.
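To illustrate how a March test walks through the memory, the sketch below implements the well-known March C- sequence on a simple bit-addressable model. It is a behavioural illustration under stated assumptions: the exact March variant and the addressing order used in the thesis simulations are not specified here, and the `StuckAtMemory` fault model is hypothetical.

```python
def march_c_minus(memory):
    """Run March C- on a list-like bit memory; return (passed, operations)."""
    n = len(memory)
    up, down = range(n), range(n - 1, -1, -1)
    # Each element: (address order, value expected on read, value to write).
    elements = [
        (up, None, 0),    # any order: w0
        (up, 0, 1),       # ascending:  r0, w1
        (up, 1, 0),       # ascending:  r1, w0
        (down, 0, 1),     # descending: r0, w1
        (down, 1, 0),     # descending: r1, w0
        (up, 0, None),    # any order: r0
    ]
    operations = 0
    for order, expected, value in elements:
        for address in order:
            if expected is not None:
                operations += 1
                if memory[address] != expected:
                    return False, operations
            if value is not None:
                operations += 1
                memory[address] = value
    return True, operations


class StuckAtMemory(list):
    """Hypothetical fault model: one cell ignores writes (stuck-at fault)."""

    def __init__(self, size, stuck_address, stuck_value):
        super().__init__([0] * size)
        self.stuck_address = stuck_address
        self.stuck_value = stuck_value
        list.__setitem__(self, stuck_address, stuck_value)

    def __setitem__(self, address, value):
        if address == self.stuck_address:
            value = self.stuck_value
        super().__setitem__(address, value)


print(march_c_minus([0] * 128))                   # (True, 1280)
print(march_c_minus(StuckAtMemory(128, 37, 1))[0])  # False
```

A fault-free run performs 10 operations per bit, which reflects the linear relation between test time and memory size mentioned above.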

#### 7.2.2 SRAM Detection Cells Test

The main distinguishing feature of the proposed method, which modifies the design of a traditional SRAM, is the addition of cells used exclusively for detection, interleaved in the SRAM core. In order to verify the correct operation of the method, it is fundamental to test the detection cells, now in a memory plan of commercial dimensions and designed in a modern technology, the 28 nm FD-SOI.

Some detection cells in different positions of the memory plan were selected to be impacted to simulate the SEU and MCU effects. The SEU/MCU fault injection is modeled by the widely adopted Messenger's equation, shown in Eq. 7.1 (MESSENGER, 1982), in which $Q_{coll}$ is the collected charge, $\tau_{\alpha}$ (1.64 × 10<sup>-10</sup> s) is the charge collection time constant, $\tau_{\beta}$ (5 × 10<sup>-11</sup> s) is the time constant for establishing the ion track, *L* (28 nm) is the charge collection depth, and *LET* is the Linear Energy Transfer of the particle. The values used in this work are the typical values used for simulations and experiments presented in (CARRENO; CHOI; IYER, 1990). However, the charge collection depth was modified to 28 nm to better characterize the charge collection process in recent technologies, such as the 28 nm FD-SOI. This effect is reproduced in the Cadence<sup>®</sup> Spectre Circuit Simulator as a current source inserted in the internal nodes of the detection cells.

$$I(t) = \frac{Q_{coll}}{\tau_{\alpha} - \tau_{\beta}} \left( e^{-\frac{t}{\tau_{\alpha}}} - e^{-\frac{t}{\tau_{\beta}}} \right), \qquad Q_{coll} = 10.8 \times L \times LET \tag{7.1}$$
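The double-exponential pulse of Eq. 7.1 can be evaluated numerically to check its main properties. The Python sketch below is an illustration under assumptions: the factor 10.8 is taken in its usual sense (charge in fC for *L* in µm and LET in MeV·cm²/mg), the LET value of 10 is an arbitrary example, and all function names are hypothetical. It does not reproduce the Spectre setup used in the thesis.

```python
import math

TAU_ALPHA_S = 1.64e-10  # charge collection time constant (from the text)
TAU_BETA_S = 5e-11      # ion track establishment time constant (from the text)
DEPTH_UM = 0.028        # charge collection depth: 28 nm expressed in um


def collected_charge_c(let_mev_cm2_mg, depth_um=DEPTH_UM):
    """Q_coll = 10.8 x L x LET, assumed to yield fC; returned in coulombs."""
    return 10.8 * depth_um * let_mev_cm2_mg * 1e-15


def pulse_current_a(t_s, q_coll_c, tau_a=TAU_ALPHA_S, tau_b=TAU_BETA_S):
    """Double-exponential current of Eq. 7.1, zero before the strike."""
    if t_s < 0:
        return 0.0
    shape = math.exp(-t_s / tau_a) - math.exp(-t_s / tau_b)
    return q_coll_c / (tau_a - tau_b) * shape


def peak_time_s(tau_a=TAU_ALPHA_S, tau_b=TAU_BETA_S):
    """Instant of maximum current, obtained by setting dI/dt = 0."""
    return tau_a * tau_b / (tau_a - tau_b) * math.log(tau_a / tau_b)


def integrated_charge_c(q_coll_c, t_end_s=5e-9, steps=5000):
    """Trapezoidal integral of I(t); should recover Q_coll."""
    dt = t_end_s / steps
    samples = [pulse_current_a(i * dt, q_coll_c) for i in range(steps + 1)]
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


q = collected_charge_c(10.0)  # example LET, an assumption for illustration
print(q, integrated_charge_c(q), peak_time_s())
```

Because the integral of the bracketed term from zero to infinity equals $\tau_{\alpha} - \tau_{\beta}$, the total injected charge equals $Q_{coll}$, which the numerical integration confirms.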

The primary objective of these experiments is to verify that, when impacted by a certain amount of deposited energy, the detection cells send the alarm signal (SIGNAL) to the CPU and automatically refresh themselves (PULSE). It is also essential to observe any difference in the detection delay as a function of the position of the impacted cell, and in comparison with the smaller detection plan presented in Chapter 6.

### 7.3 Interleaved Data/Detection SRAM Post-Layout Simulations

Before verifying the correct operation of the new method for detecting multiple and single events, it is essential to validate the SRAM data cells, which are interleaved with the detection cells. Figure 7.11 shows the correct memory operation through consecutive writings and readings (0s and 1s) in the bit cells.

From top to bottom, the first set of signals represents the memory addresses selected to perform the operations: word 0 of the first row and word 1 of the second row of the memory plan. The second set of signals, Chip Select (CS), Write Enable (WE), and



Figure 7.11: The 32 kb SRAM validation through consecutive write/read operations.

Source: From the author.

Output Enable (OE), determines what type of operation will be performed on the memory (write or read). Afterward, it is highlighted which data is being sent to memory through an external source. Finally, the 8-bit I/Os for each operation are presented in two formats. It is important to point out that the external data source purposely sends the opposite values to those previously written during the read operation. This allows verifying precisely that the data stored in the memory cell is being read correctly and that the value sent by the source is not being misread.

As previously mentioned, March-type tests allow for validating the memory in more detail, covering many possible faults that an SRAM can present. In Figure 7.12, the March C- test is applied to the 16 words of memory row 0. As many operations are performed in this type of test, only a snippet of the test is presented, highlighting the expected I/Os and the execution of the test without any errors.

After inserting single and multiple current pulses, simulating the SEU and MCU effects, in some detection cells located in different positions of the memory plan, it was verified that there is no difference in sensitivity between impacted cells in the same row of the array, even when they are in different columns. The impact on multiple cells (e.g., two detection cells and two data cells that are diagonal neighbors) also does not significantly influence the sensitivity and, consequently, the detection delay of the proposed method. The delay of the alarm signal is determined by the impacted detection cell placed furthest down the memory plan, that is, closest to the detection logic, as presented next.



Figure 7.12: March-C Test in 128 bits of the SRAM: 16 words of Row 0.

Source: From the author.

Figure 7.13 presents the circuit's behavior after the simulated impact on two detection cells placed in the same column but at the boundaries of the memory plan, Row 0 and Row 127. Figure 7.13a shows the behavior of the detection logic after a current pulse is inserted in the Q node of the detection cell placed in the last row of the first column of the memory array. As this cell is placed on the row closest to the detection logic, the delay between the insertion of the pulse (in orange) at t = 2 ns and the activation of SIGNAL (in red) is almost negligible, being equal to 289 ps. This behavior aligns with what was seen in the silicon experiments previously presented in this work, but now considering post-layout simulations. In contrast, Figure 7.13b shows the circuit's behavior after the impact on a detection cell located as far away as possible from the detection logic: in the first row of the array. It can be seen that, for the same amount of deposited energy, the circuit also sends the alarm signal regardless of the position of the impacted cell. However, the time for this signal to be sent to the processor is much longer. In this case, the detection delay is 143.3 ns, which demonstrates the impact of commercial array dimensions on the proposed method.

The results obtained so far have validated the idea of a new method for detecting radiation-induced effects in memory circuits. Following the proof of concept obtained through silicon experiments, the successful design of a complete SRAM that incorporates the proposed method demonstrates its usability. Moreover, it has confirmed the anticipated challenges in implementing the method in SRAMs of commercial dimensions. In the complete memory plan designed, featuring a $256 \times 256$ cells configuration, the difference in detection delay in the comparison between the impact of a particle on the first and

Figure 7.13: The behavior of the detection method considering post-layout simulations of a particle impact in: (a) The bottom row (Row 127) and (b) The top row (Row 0).





last rows of this array is 143 ns. This discrepancy holds particular significance given the operational frequency of the designed memory, set at 1 GHz. The substantial difference in delay suggests the potential occurrence of multiple misreadings before receiving the alarm signal.
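The significance of this delay at 1 GHz can be made explicit with a short calculation. The sketch below uses the two delays reported above and assumes, for illustration only, that one memory access can be issued per clock cycle; the function name is hypothetical.

```python
import math

CLOCK_HZ = 1e9           # operating frequency considered in the experiments
DELAY_NEAR_S = 289e-12   # impacted cell in the row closest to the logic
DELAY_FAR_S = 143.3e-9   # impacted cell in the row farthest from the logic


def cycles_elapsed(delay_s, clock_hz=CLOCK_HZ):
    """Whole clock cycles completed before the alarm signal is raised."""
    return math.floor(delay_s * clock_hz)


difference_ns = (DELAY_FAR_S - DELAY_NEAR_S) * 1e9
print(round(difference_ns, 3))       # 143.011
print(cycles_elapsed(DELAY_NEAR_S))  # 0
print(cycles_elapsed(DELAY_FAR_S))   # 143
```

Under the one-access-per-cycle assumption, up to 143 accesses could complete before the alarm for an impact on the farthest row, against none for the nearest row.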

### 7.4 Comparison with State-of-the-Art

The challenges posed by the main state-of-the-art techniques to deal with radiation-induced upsets were presented in Chapter 4. The contributions and results of the proposed new method were described throughout the thesis. Therefore, it is now essential to position the new method within the state-of-the-art. Table 7.1 briefly compares the previously presented techniques used in the state-of-the-art papers and the method proposed in this thesis. In addition to the area penalty added by each technique, three other parameters related to MCU protection are analyzed:

- **Detection:** capacity of the technique to detect/alert about an MCU;
- **Correction:** capacity of the technique to correct the detected MCU;
- **# of bits and shape:** the limitation of the technique concerning the number of bits and the MCU format that can be detected/corrected.

It is essential to highlight that the term "correction" is used when the technique (typically EDAC techniques) offers the possibility of rewriting the data that has been corrupted in memory, correcting it. This feature is different from the term "masking", already presented in Section 3.5 and present in RHBD techniques, for example.

| Work | Technique | Area Penalty | MCU Detection | MCU Correction | MCU # of bits and shape |
|---|---|---|---|---|---|
| TAN et al. (2021) | Redundancy-based (GE-TMR + GS) | 100% - 200% | ✗ | ✗ | Limited<sup>[a]</sup> (Logical Masking) |
| LI et al. (2020) | Redundancy-based (3 × Voter-TMR) | 220% | ✗ | ✗ | Limited<sup>[a]</sup> (Logical Masking) |
| LI et al. (2021) | RHBD (RH-14T) | ~233% | ✗ | ✗ | None<sup>[b]</sup> (Electrical Masking) |
| HAN et al. (2021) | RHBD (DA-12T) | 200% | ✗ | ✗ | None<sup>[b]</sup> (Electrical Masking) |
| CH et al. (2021) | RHBD (RH-14T) | ~220% | ✗ | ✗ | None<sup>[b]</sup> (Electrical Masking) |
| PRASAD et al. (2022) | RHBD (RH-13T) | ≈222% | ✗ | ✗ | None<sup>[b]</sup> (Electrical Masking) |
| VLAGKOULIS et al. (2022) | EDAC (ECC + Parity Code) | up to 100% | ✓ | ✓ | Limited (for detection/correction) |
| SILVA et al. (2020) | EDAC (ECC - eMRSC) | > 100% | ✓ | ✓ | Limited (for detection/correction) |
| WANG et al. (2021) | SRAM-based Radiation Monitor | - | ✗ | ✗ | N/A<sup>[c]</sup> |
| ANDJELKOVIC et al. (2022) | BBICS-based Radiation Monitor | Circuit-Based | ✓ | ✗ | N/A<sup>[c]</sup> |
| GAVA et al. (2023) | Software-based (Register Allocation) | Application-Based | ✗ | ✗ | None<sup>[d]</sup> (RC Masking) |
| THIS THESIS | Interleaved Data/Detection SRAM | up to 150%<sup>[e]</sup> | ✓ | ✗ | Unlimited<sup>[f]</sup> (for detection) |

Table 7.1: Comparison between the state-of-the-art and this thesis.

<sup>a</sup> Up to two of the three replicas can be impacted. Voters will logically mask the fault.

<sup>b</sup> Limited amount of charge collected supported until failure.

<sup>c</sup> Only the monitoring of the environment where the circuit will be exposed.

<sup>d</sup> RC = Resource-Constrained. Execution of the application on a limited number of registers to reduce the probability of failure.

<sup>e</sup> The method is customizable. It is possible to vary the amount of detection cells added (robustness × area penalty).

<sup>f</sup> It is possible to detect unlimited n-bit MCUs in many shapes.

Regarding the area penalty, despite the significant increase added by the method proposed in this thesis, similar and even more significant penalties are also perceived in the other techniques. Redundancy-based techniques, software-based techniques, and RHBD cells attempt to mask MCUs through different types of masking but do not allow detection or correction of MCUs. Among all the analyzed methods, only EDAC techniques allow the direct detection and correction of MCUs.

A critical point in common among all the highlighted techniques is the "concern" about increasing the MCU rate. Techniques that use masking as a basis are always limited according to the number of MCUs or the charge deposited by the particle after impacting the circuit. As the MCU rate increases, the probability of all modules of a TMR being impacted, in addition to the voters, increases significantly, making the technique unable to mask the events. In harsh environments, the charge deposited by the particle can be high, being greater than the critical charge of the designed RHBD cell, causing several cells in the memory array to be impacted. Despite not using masking as a methodology, EDAC techniques are also limited by the increased MCU rate, detecting and correcting a certain number of impacted cells according to the resources used and the MCU shape (distance between impacted bits).

Despite not providing the direct possibility of correction, the key difference of the proposed method is that it is not limited by an increase in the MCU rate. The percentage of detection cells chosen for the SRAM array defines the radiation hardening level of the memory, and increasing the MCU rate increases the detection capability of the method, as the probability of a detection cell being impacted also increases. As it is customizable, the proposed method allows varying the ratio between the radiation hardening level and the area penalty according to the objective. A tool was developed to facilitate this task and is presented in the next chapter. It is important to emphasize that creating a new technique does not eliminate the use of another but rather provides a new option, in addition to the possibility of combining different techniques to increase the robustness of the circuit and reduce the penalties imposed by the techniques.
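The trend described above can be illustrated with a deliberately simple probabilistic model. The sketch below assumes that each upset cell is, independently, a detection cell with probability equal to the fraction of detection cells in the array. This independence is an assumption made here for illustration; real MCU clusters hit adjacent cells in an interleaved pattern, so actual detection rates depend on the cluster shape and cell placement.

```python
def detection_probability(detection_fraction, upset_cells):
    """P(at least one upset cell is a detection cell), independent model."""
    if not 0.0 <= detection_fraction <= 1.0:
        raise ValueError("detection_fraction must be between 0 and 1")
    if upset_cells < 0:
        raise ValueError("upset_cells must be non-negative")
    return 1.0 - (1.0 - detection_fraction) ** upset_cells


for fraction in (0.125, 0.5):
    row = [round(detection_probability(fraction, n), 4) for n in (1, 2, 4, 8)]
    print(fraction, row)
```

In this model the detection probability grows monotonically with the number of upset cells for any non-zero fraction of detection cells, which is consistent with the argument that larger MCUs are easier to detect and that the fraction of detection cells can be traded against area.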

### **8 RADIATION-HARDENED SRAM LAYOUT GENERATION TOOL**

Layout automation is a well-established area, but it remains of great importance, especially in the design of increasingly larger and denser circuits. Because it is customizable, the presented method allows varying the number of added detection cells, making it possible to trade off robustness against area/power overheads. The design of a conventional SRAM is not a trivial task; designing an SRAM that implements the proposed method with data and detection cells is even more challenging. The main goal of the developed tool is to automatically generate the layout of the core of a radiation-hardened SRAM, facilitating the application of the new method and providing multiple sizes and protection configurations.

### 8.1 Tool Features and Implementation

Considering the IC design flow, before the layout generation stage, the electrical schematic of the circuit to be designed is created. This schematic is usually made graphically, using the transistor symbols provided by the PDK and other sub-blocks of cells. Depending on the complexity of the circuit, the circuit is often described using a hardware description language, such as Verilog, rather than designing graphically. It is essential to highlight that besides the automatic layout generation, the tool also provides the option of automatically generating the schematic of the generated layout, a function of utmost importance due to the difficulty in logically relating many data and detection cells.

When designing larger and more complex circuits, not only the layout but also the electrical schematic design of that circuit becomes more challenging. Regarding the proposed method, the most challenging point is logically relating both the traditional SRAM data and detection cells. Although laborious, in a small circuit (maximum of a few dozen cells), it is possible to instantiate each cell and manually make the connection between them. However, considering the circuit designed and presented in Chapter 7, for instance, creating the schematic in this way becomes unfeasible. The tool automatically generates the schematic to overcome these challenges by facilitating and significantly reducing the SRAM design time. This functionality will be detailed in the next section.

Two types of data inputs are available to determine the size of the SRAM that will be generated: number of rows and columns or memory size (in kb). In the first option, the user is free to insert the desired number of rows and columns, the tool being responsible for validating the data according to the rules of an SRAM array. In the second option, six different predefined memory sizes (from 1 kb to 1 Mb) are available to the user.

The distinguishing feature of the presented method is the insertion of cells used exclusively to detect radiation-induced upsets. Thus, the developed tool offers five different levels of radiation robustness for generating the SRAM layout. Level 0 does not add any detection cells to the memory plan and creates a traditional unhardened SRAM array. From level 1 to level 4, different percentages of detection cells (ranging from 12.5% to 50%) are interleaved in the memory plan, always looking for the best array coverage. The tool does not add the detection logic and the peripheral circuits; it is necessary to add them manually after the core layout generation. In order to facilitate the integration with one of the most used tools in integrated circuit design, the tool was implemented in the Cadence<sup>®</sup> SKILL language. Algorithm 1 presents a summarized view of the code developed for the tool's implementation.

It is essential to highlight that the tool replicates small pre-designed macros to generate the different SRAM core layout options. For the design of these macros, only the data and detection cell layouts that will be used in the SRAM are needed. The already presented Figure 7.1 and Figure 7.2 show the electrical schematic and layout of the data and detection cells designed in the 28 nm FD-SOI technology from ST Microelectronics used in this first version of the tool. Among the different models available in this PDK, the RVT model is considered. The architecture chosen for the data cell design was the traditional 6T-SRAM. Using Design Rule Check (DRC) clean macros, the tool will also generate the final SRAM layout without DRC errors, regardless of the technological node used. In addition to the layout, if the user enables automatic generation of the schematic, it will also be possible to perform physical verification, such as Layout Versus Schematic (LVS).

It is important to remember that the macro cells used in this tool version have already been tested through post-layout simulations for an array of size $256 \times 256$, as presented in the previous chapter. Although entering the number of rows and columns directly allows larger arrays, the maximum size (1 Mb) offered by the memory-size option also marks the recommended limit for arrays built from the macros already simulated. For the same operating frequency, the alarm signal activation delay increases proportionally to the number of array rows. Therefore, it is not recommended to generate an array layout with more than 1024 rows, which corresponds to a size of 1 Mb for a square array.

#### Algorithm 1: Automatic SRAM layout generation tool

```
set_bindkey;                      ⇒ To start the tool.
display_form;

Input Data:                       ⇒ Receive data from user.
Bool  ← bool;                     ⇒ To select the input data method.
Bool2 ← bool2;                    ⇒ To enable/disable the schematic generation option.
Row   ← row;                      ⇒ Number of rows in the SRAM core.
Col   ← col;                      ⇒ Number of columns in the SRAM core.
Size  ← size;                     ⇒ Memory size method.
Macro ← macro;                    ⇒ Radiation hardening level.

⇒ Check data input method.
if bool is 0 then
    if size is "1 kb or 4 kb or ... 1 Mb" then
        row ← 32 or 64 or ... 1024;
        col ← 32 or 64 or ... 1024;
    end
end

⇒ Input data validation.
if log2(row) & log2(col) are INTEGERS then
    cv ← open_cell_view();
    if macro is "0% or 12.5% or ... 50%" then
        create_layout(cv macro row col);
        ⇒ Create the schematic view according to the user selection.
        if bool2 is 1 then
            create_schematic(cv macro row col);
        end
    end
    ⇒ Prepare layout information.
    num_cells ← row × col;
    num_detection ← macro × num_cells;
    data_capacity ← (num_cells − num_detection) / 1024;
    display_succesfull_message;
    display_layout_information;
else
    display_error_message;
end
```
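The validation and layout-information steps of Algorithm 1 can be illustrated outside the SKILL environment. The following Python sketch is not the tool's code: the function and dictionary names are hypothetical, and the intermediate row/column values (128, 256, 512) are an assumption derived from the six memory sizes of Table 8.1 and square arrays.

```python
# Square-array dimension for each predefined memory size.
# Assumption: every predefined size maps to a square array (rows == cols).
PREDEFINED_SIZES = {
    "1 kb": 32, "4 kb": 64, "16 kb": 128,
    "64 kb": 256, "256 kb": 512, "1 Mb": 1024,
}

# Fraction of detection cells for each radiation hardening level (0 to 4).
DETECTION_FRACTION = {0: 0.0, 1: 0.125, 2: 0.25, 3: 0.375, 4: 0.5}


def is_power_of_two(n):
    """True when log2(n) is an integer, as required by the validation step."""
    return n > 0 and (n & (n - 1)) == 0


def layout_information(rows, cols, level):
    """Hypothetical helper mirroring the 'Prepare layout information' step."""
    if not (is_power_of_two(rows) and is_power_of_two(cols)):
        raise ValueError("rows and cols must be powers of two")
    if level not in DETECTION_FRACTION:
        raise ValueError("level must be an integer from 0 to 4")
    num_cells = rows * cols
    num_detection = int(DETECTION_FRACTION[level] * num_cells)
    # Data storage capacity in kb (1 kb = 1024 bits, one bit per data cell).
    data_capacity_kb = (num_cells - num_detection) / 1024
    return {
        "num_cells": num_cells,
        "num_detection": num_detection,
        "data_capacity_kb": data_capacity_kb,
    }


# Example: the 16 x 16 array at hardening level 1 used in Section 8.2.
print(layout_information(16, 16, 1))
```

Calling the helper with the dimensions in `PREDEFINED_SIZES` for each level gives the same data storage capacities as Table 8.1.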

### **8.2** Tool Execution and Results

After creating a cell view layout in Cadence<sup>®</sup> Virtuoso, the tool is started by pressing a defined shortcut key. Figure 8.1 shows the new window (the main window of the tool) that appears over the cell view window, with all the functions available for generating the SRAM layout: SRAM core dimensions, radiation hardening level, and generation of the schematic view.

Figure 8.1: Automatic SRAM layout generation tool main window, highlighting all the available features.





Figure 8.2 presents an example of the tool's execution. In step 1, the data entry method is chosen (keeping the checkbox on), the desired number of rows and columns ($16 \times 16$) is entered, and the radiation hardening level is set (1) via the slider. In step 2, the layout is generated, and a success message appears on the screen. In the image, it is possible to observe the detection cells positioned between the SRAM data cells in the smallest proportion (12.5%). Finally, step 3 presents the generated layout information: total number of cells, number of detection cells, and data storage capacity.

Figure 8.2: An example of running the tool: (1) defining the input data, (2) generating the SRAM layout, (3) providing layout information.

Source: From the author.

Regarding the automatic schematic generation, the symbols of both types of cells (data/detection) are instantiated as many times as necessary, and labels are added to the wires of each cell connection using the vector expressions in multiple-bit wire names functionality. This establishes the correct logical connections without the need to connect each cell individually. A detailed explanation of how this functionality works is beyond the scope of this thesis but can be found in (CADENCE, 2023, chap. 2). To facilitate understanding, Figure 8.3 presents an example of schematic and layout generation with $4 \times 4$ cells and protection level 2. The data cells are separated into two groups, one for the even rows and one for the odd rows. In Figure 8.3a, the first data cell (I0) is instantiated eight times (<0:7>) and distributed across all columns of the even rows of the array. The second data cell (I1) is instantiated four times (<0:3>) and distributed only in the odd columns of the odd rows. Each presented detection cell is instantiated two times (<0:1>), each pair sharing the same DL and both pairs sharing the same RL.
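The instance counts in this $4 \times 4$ example can be checked with a short script. This is a minimal Python sketch of the level-2 placement as described in the text, assuming rows and columns are indexed from 0 (so "even rows" are rows 0 and 2, and "odd columns" are columns 1 and 3). The function name and the `DET` label are hypothetical, and the sketch does not cover how the tool places cells at other levels.

```python
from collections import Counter


def level2_cell_map(rows=4, cols=4):
    """Grid of cell labels for the level-2 example (0-based indices assumed).

    'I0'  : data cell in an even row (all columns)
    'I1'  : data cell in an odd column of an odd row
    'DET' : detection cell (remaining positions of odd rows)
    """
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            if r % 2 == 0:
                row.append("I0")
            elif c % 2 == 1:
                row.append("I1")
            else:
                row.append("DET")
        grid.append(row)
    return grid


grid = level2_cell_map()
counts = Counter(cell for row in grid for cell in row)
for row in grid:
    print(" ".join(f"{cell:>3}" for cell in row))
print(dict(counts))
```

Under this indexing assumption, the script counts eight I0 cells, four I1 cells, and four detection cells, that is, 25% of the 16 positions, matching the instance ranges <0:7>, <0:3>, and two pairs of <0:1>.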

The generated layout information helps the user understand the characteristics of the newly created layout and compare the trade-off between circuit protection level, storage capacity, and area penalty. From protection levels 1 to 4, the storage capacity decreases as the percentage of detection cells increases (from 12.5% to 50%). Table 8.1 compares the radiation hardening (Rad-Hard) level with the data storage capacity, considering the six different memory sizes available in the tool. The inverse relationship between Rad-Hard level and storage capacity was expected.

Figure 8.3: An example of a $4 \times 4$ data/detection SRAM (a) Schematic and (b) Layout generation, highlighting the vector expressions in multiple-bit wire names functionality.

(a) $4 \times 4$ data/detection SRAM (Rad-Hard Level = 2) - Schematic

(b) $4 \times 4$ data/detection SRAM (Rad-Hard Level = 2) - Layout

Source: From the author.

From this data, it is possible to determine the area overhead in the SRAM core layout (without considering the detection cell's two extra transistors), which, for protection levels 1 to 4, equals 14.3%, 33.33%, 60%, and 100%, respectively. Figure 8.4 shows this trend between the radiation hardening levels provided by the tool and the area overhead. Based on the results generated by the tool, and once the necessary protection level is defined according to the environment in which the circuit will operate, it is possible to automatically generate the SRAM core layout with the lowest possible area penalty.
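The reported overheads can be reproduced with a short calculation. The sketch below assumes that data and detection cells occupy the same area (consistent with ignoring the two extra transistors) and that the overhead is measured relative to the area occupied by the data cells alone. The symbol $p$ for the detection-cell fraction is introduced here for illustration.

$$
\text{Area overhead}(p) = \frac{p}{1 - p}, \qquad
\frac{0.125}{0.875} \approx 14.3\%, \quad
\frac{0.25}{0.75} \approx 33.33\%, \quad
\frac{0.375}{0.625} = 60\%, \quad
\frac{0.5}{0.5} = 100\%.
$$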

Table 8.1: SRAM Generated Layout Data (data storage capacity, in kb, per Rad-Hard level)

| Memory Size | Level 0 (0%) | Level 1 (12.5%) | Level 2 (25%) | Level 3 (37.5%) | Level 4 (50%) |
|-------------|--------------|-----------------|---------------|-----------------|---------------|
| 1 kb        | 1            | 0.875           | 0.75          | 0.625           | 0.5           |
| 4 kb        | 4            | 3.5             | 3             | 2.5             | 2             |
| 16 kb       | 16           | 14              | 12            | 10              | 8             |
| 64 kb       | 64           | 56              | 48            | 40              | 32            |
| 256 kb      | 256          | 224             | 192           | 160             | 128           |
| 1 Mb        | 1024         | 896             | 768           | 640             | 512           |

Source: From the author.

If the user intends to employ an alternative technology or to use different architectures for the data/detection cells, providing their own macro cell layouts is essential so the tool can operate seamlessly. This tool allows the user to reduce the layout design time of large memory arrays and to easily use different amounts of detection cells in the memory plan, verifying the trade-off between area penalty and robustness of the circuit to radiation.

Figure 8.4: The impact of the radiation hardening levels provided by the tool in the total area of the SRAM core.



Source: From the author.

## **9 CONCLUSIONS**

Radiation-tolerant circuits, particularly memory circuits, are of utmost importance for space applications. The literature presents numerous techniques at different levels to tackle radiation effects. However, given the notable increase in the MCU rate both for emerging technological nodes and for the environment in which the circuit will operate, the conventional techniques may not provide a satisfactory level of robustness to radiation effects, depending on the application needs. This thesis presented a new method to detect radiation-induced upsets in memory circuits, mainly considering MCUs. First, the method was validated by designing a prototype version as a proof-of-concept. Silicon measurements confirm the detection cell's correct operation: sending the alarm signal and retrieving its previously stored value. Then, the detection method was extended and applied in a commercial-size SRAM designed in the 28 nm FD-SOI technology. A radiation-hardened SRAM layout generation tool was also developed to simplify the application of the new method and to provide multiple sizes and protection configurations.

Differences in the circuit's susceptibility were observed according to the type of event to which the detection plan was exposed, the position of the cell in the plan, and the size of the evaluated detection plan. Regarding the prototype version and the type of event, the cells were found to be more sensitive to MCUs than to SEUs, presenting output pulse widths on average 5.56% greater. This behavior is consistent with the desired objectives because the sensitivity increases with the number of cells impacted by a single particle.

For the same inserted fault, a reduction in the sensitivity of the cells is observed when moving bottom-up through the detection plan, that is, farther from the detection circuit. This sensitivity reduction is related to the increase in the detection line capacitance and to how the first level of AND2 logic gates in the detection logic circuit drives this load. This behavior is also reflected in the comparison between the two plans evaluated in the prototype version. As the number of cells in each column increases, the impact of the detection line capacitance increases proportionally.

The reduction in the sensitivity of the detection cells when moving bottom-up through the memory array is more critical in the analysis of a commercial-size SRAM. As the number of cells in each column increases significantly in the designed memory, the impact on the detection delay is more worrisome. However, it is essential to highlight that the detection rate remains constant. The challenge is dealing with the increased delay in knowing whether an event has happened. This challenge was partially solved by inserting buffers connected to the detection lines at the bottom of this plan to drive their capacitive load. Depending on the performance requirements of the application, the method can still provide a high level of radiation robustness.

### 9.1 Applicability of the Method at the System Level

Although the method was defined and validated through the experiments presented throughout the manuscript, it may not be obvious how an SRAM that adopts the proposed method would operate at the system level. This section clarifies this point and suggests ideas for how the method could work at the system level.

In Chapter 7, the SRAM version presented has only one alarm signal as its output. This means that the connection to the processor would be established just like that of any conventional SRAM, with control signals, address bits, I/O data bits, and the alarm signal. In this case, the alarm signal would allow the processor to raise an interrupt and refresh the memory completely, for example, by cutting off the power supply of the SRAM via the CS signal. To achieve a partial memory refresh, it is essential to expand the detection logic by including new latches at earlier stages of the logic instead of relying on just a single latch at the final stage. This approach allows a designated area of the SRAM array, represented by a set of bits, to be reported to the processor for a partial memory refresh. Furthermore, a new control signal is required to reset the power supply solely for the chosen columns.
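The difference between a complete and a partial refresh can be pictured with a small behavioral model. The Python sketch below is purely illustrative: it assumes a hypothetical interface in which each additional latch reports one alarm bit per array region, and the function names are not part of the designed SRAM.

```python
def regions_to_refresh(region_alarms):
    """Indices of the array regions whose (hypothetical) latch holds an alarm."""
    return [i for i, fired in enumerate(region_alarms) if fired]


def handle_alarm(region_alarms):
    """Decide which refresh the processor would request.

    region_alarms holds one boolean per region latch. A single-element list
    models the Chapter 7 design, where one alarm covers the whole array.
    """
    flagged = regions_to_refresh(region_alarms)
    if not flagged:
        return ("none", [])
    if len(flagged) == len(region_alarms):
        return ("full", flagged)
    return ("partial", flagged)


print(handle_alarm([True]))                       # single alarm signal
print(handle_alarm([False, True, False, False]))  # four region latches
```

With a single latch, any alarm leads to a full refresh. With several latches, only the flagged regions would need the new control signal that resets their power supply.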

As will also be highlighted in Section 9.3.2, various techniques available in the literature can aid in implementing the proposed method at the system level. The ECC present in EDAC techniques would add correction capability to the method proposed in this work. In turn, the proposed method would make the detection capability unlimited. The combination of the proposed method with an EDAC technique can also help in refreshing memory data. ECC already offers this functionality through new writes to memory, without requiring the power supply of the cells to be reset.

The approach based on resource constraints (GAVA et al., 2023), which has been discussed in this work, can be a suitable option to implement the proposed method. By incorporating detection cells only in a limited portion of the SRAM array, and depending on the application, the critical part of the circuit could run exclusively in this region, thereby enhancing the overall robustness of the circuit and reducing the area overhead.

### 9.2 Competitive Advantage of the Proposed Method

Regarding the area penalty, although the method proposed in this thesis adds a significant increase, similar or even larger penalties are observed in other techniques. The proposed method's power consumption overhead is significantly lower than that of other existing techniques. The new challenges arising from the increase in the MCU rate in modern nodes favor the new method validated in the thesis because, as the number of events in a memory plan increases, the probability of detecting an event also increases. This fact highlights the originality of the method, which goes the opposite way of the techniques already present in the literature and achieves a high detection rate even in the harshest environments.

Although it does not offer direct correction capability, the significant advantage of the proposed method is that its detection capability is not limited as the MCU rate increases. The chosen percentage of detection cells in the SRAM array defines the memory radiation hardening level, and a higher MCU rate increases the detection capability of the method, as the probability of a detection cell being impacted also increases. Thanks to its customizable nature, the proposed method allows a flexible adjustment between the radiation hardening level and the associated area penalty based on specific objectives. It is essential to emphasize that introducing a new technique does not eliminate the use of another; instead, it presents an additional option. Furthermore, it opens avenues for combining various techniques to enhance circuit robustness and mitigate the penalties imposed by these methods. The solution is of great value for designing radiation-hardened memories for critical space applications.
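As a rough illustration of this trend, and not the detection-rate model evaluated in this thesis, consider a simplified assumption in which a single event upsets $k$ cells and each upset cell is independently a detection cell with probability $p$, the chosen detection-cell fraction. Both symbols and the independence assumption are introduced here for illustration only; the model ignores the interleaved placement of the cells and the spatial shape of real MCUs.

$$
P_{\text{detect}}(k) = 1 - (1 - p)^{k}
$$

For $p = 0.125$ (level 1), this gives $12.5\%$ for $k = 1$ and about $41.4\%$ for $k = 4$, which shows how the probability of impacting at least one detection cell grows with the number of cells affected by one event.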

### 9.3 Future Works and Perspectives

The results presented at the end of the thesis consolidate the presentation of a new method to deal with radiation effects in memory circuits. However, this does not mean the end of the work. From the circuits and results presented throughout the thesis, in addition to new technologies already being developed, new perspectives may emerge to improve the presented method.

#### 9.3.1 Rad-Hard Test

With the SRAM layout finalized and the circuit ready for manufacturing, the next goal is to study the best options for having it fabricated. On the manufactured circuit, the idea is to conduct radiation tests with heavy ions, considering the complete memory (data and detection cells) in two ways: static and dynamic testing.

In the static test, the objective would be to write a data pattern in memory (e.g., checkerboard) and then conduct a fault injection campaign in different array positions. In this test, the goal would be to verify, in addition to the detection of failures through the alarm signal, the impact of a single particle on more than one cell, which characterizes an MCU. On the other hand, the dynamic test would allow an even more detailed view of the circuit's behavior, considering the chip's bombardment while consecutive reads and writes are performed in the memory. This test would determine how many incorrect data reads occur in the memory between the particle's impact and the triggering of the alarm signal.
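The data-comparison part of the static test can be sketched in software. The following Python example is a simplified illustration, not a test procedure defined in this thesis: the function names are hypothetical, and the two flipped cells stand in for the effect of one particle strike.

```python
def checkerboard(rows, cols):
    """Checkerboard data pattern: cell (r, c) stores (r + c) % 2."""
    return [[(r + c) % 2 for c in range(cols)] for r in range(rows)]


def upset_cells(expected, readback):
    """Coordinates of cells whose read-back value differs from the pattern."""
    return [
        (r, c)
        for r, row in enumerate(expected)
        for c, bit in enumerate(row)
        if readback[r][c] != bit
    ]


expected = checkerboard(4, 4)
readback = [row[:] for row in expected]   # copy of the written pattern

# Hypothetical effect of a single event: two neighboring cells are flipped.
for r, c in [(1, 1), (1, 2)]:
    readback[r][c] ^= 1

flipped = upset_cells(expected, readback)
kind = "MCU" if len(flipped) > 1 else "SEU" if flipped else "no upset"
print(flipped, kind)
```

In a real campaign, the read-back data would come from the memory under irradiation, and the comparison would be made after each exposure.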

Beyond the time needed to finalize the circuit design, receiving the manufactured circuit usually takes between 6 and 12 months. Carrying out tests with heavy ions also takes considerable time, considering the stages of proposal submission, preparation of experiments, execution, and data acquisition. Finally, it is crucial to emphasize the significantly high cost associated with these tests, which underscores the need to explore partnerships that can facilitate their implementation. Due to these factors, the idea is to send the circuit for manufacturing after the end of the thesis so that radiation tests with heavy ions can be performed between the end of 2024 and the beginning of 2025.

#### 9.3.2 A combination of widely used techniques and the proposed method

As previously mentioned, a common critical aspect across all the prominent techniques discussed in this thesis pertains to the escalating MCU rate. Techniques founded on masking are inherently constrained by the count of MCUs or the charge deposited by particles upon circuit impact. In harsh environments, particle-induced charge can reach substantial levels, surpassing the critical charge of the designed RHBD cell, resulting in numerous cells within the memory array being affected. Even though EDAC techniques do not utilize masking as their methodology, they too face limitations due to the heightened MCU rate. They detect and correct a specific number of impacted cells based on the allocated resources and the shape of the MCU (distance between impacted bits).

In this context, the advantages of the new method proposed in this thesis could be combined with the advantages offered by other methods already present in the literature. In addition to further increasing the robustness of the memory circuit to multiple events, a mitigation of the overheads introduced by all techniques could be achieved. For example, we could use RHBD cells instead of the traditional 6T-SRAM bit cells and base the architecture of the detection cells on the same design to increase the critical charge of the entire SRAM array. Furthermore, we could use the new proposed method to increase the detection rate of an EDAC technique. The ECC present in EDAC techniques would also provide correction capability, which the proposed method does not offer.

#### 9.3.3 AI-driven SRAM

A possible idea would be related to the versatility of the cells used in the SRAM array. Instead of using different cell architectures to store data and perform detection, the objective would be to create a single versatile cell that could be dynamically modified according to the system's needs. Considering the promising evolution in Artificial Intelligence (AI), the idea would be the creation of an AI-driven SRAM to protect itself depending on a mission profile that can change over its life. That is, an SRAM already present in a satellite, for example, could vary in its level of radiation robustness and storage capacity according to the environment in which that satellite is located. This behavior would provide the best possible cost-benefit ratio between storage capacity and level of protection. It is an idea that is still very far from our current reality, but we can take advantage of new technologies to try to evolve in this direction.

# REFERENCES

AHLBIN, J. et al. The effect of layout topology on single-event transient pulse quenching in a 65 nm bulk cmos process. **IEEE Transactions on Nuclear Science**, IEEE, v. 57, n. 6, p. 3380–3385, 2010.

AHLBIN, J. R. et al. Single-event transient pulse quenching in advanced cmos logic circuits. **IEEE Transactions on Nuclear Science**, IEEE, v. 56, n. 6, p. 3050–3056, 2009.

ALLEN, J. A. V.; FRANK, L. A. Radiation around the earth to a radial distance of 107,400 km. **Nature**, State Univ. of Iowa, Iowa City, v. 183, 1959.

ALORDA, B. et al. Adaptive static and dynamic noise margin improvement in minimum-sized 6t-sram cells. **Microelectronics Reliability**, Elsevier, v. 54, n. 11, p. 2613–2620, 2014.

AMUSAN, O. A. et al. Charge collection and charge sharing in a 130 nm cmos technology. **IEEE Transactions on nuclear science**, IEEE, v. 53, n. 6, p. 3253–3258, 2006.

ANDERSON, T.; LEE, P. Fault Tolerance: Theory and Practice. Englewood Cliffs, NJ: Prentice-Hall, 1981.

ANDJELKOVIC, M. et al. Ps-bbics: Pulse stretching bulk built-in current sensor for on-chip measurement of single event transients. **Microelectronics Reliability**, v. 138, p. 114726, 2022. ISSN 0026-2714. 33rd European Symposium on Reliability of Electron Devices, Failure Physics and Analysis.

ASSIS, T. R. d. Analysis of transistor sizing and folding effectiveness to mitigate soft errors. Dissertation (Master) — PPGC/Universidade Federal do Rio Grande do Sul, 2009.

ATKINSON, N. M. et al. Layout technique for single-event transient mitigation via pulse quenching. **IEEE Transactions on Nuclear Science**, IEEE, v. 58, n. 3, p. 885–890, 2011.

AVIZIENIS, A. The four-universe information system model for the study of fault tolerance. In: **Proceedings of 12th International Symposium on Fault-Tolerant Computing**. Los Angeles: [s.n.], 1982. p. 6–13.

AZAMBUJA, J. R.; KASTENSMIDT, F.; BECKER, J. **Hybrid Fault Tolerance Techniques to Detect Transient Faults in Embedded Processors**. Switzerland: Springer, 2014.

BALASUBRAMANIAN, A. **Measurement and analysis of single event induced crosstalk in nanoscale cmos technologies**. Thesis (PhD) — Faculty of the Graduate School of Vanderbilt University, 2008.

BALEN, T. R. Efeitos da radiação em dispositivos analógicos programáveis (FPAAs) e técnicas de proteção. Thesis (PhD) — PPGEE/Universidade Federal do Rio Grande do Sul, 2010.

BARTRA, W. C.; VLADIMIRESCU, A.; REIS, R. Fdsoi and bulk cmos sram cell resilience to radiation effects. **Microelectronics Reliability**, Elsevier, v. 64, p. 152–157, 2016.

BARTRA, W. E. C. Modelamento do single-Event effects em circuitos de memória FDSOI. Thesis (PhD) — PGMICRO/Universidade Federal do Rio Grande do Sul, 2016.

BAUMANN, R. The impact of technology scaling on soft error rate performance and limits to the efficacy of error correction. In: IEEE. **Digest. International Electron Devices Meeting,**. San Francisco, CA, 2002. p. 329–332.

BAUMANN, R. The impact of technology scaling on soft error rate performance and limits to the efficacy of error correction. In: **Digest. International Electron Devices Meeting,** [S.l.: s.n.], 2002. p. 329–332.

BAUMANN, R. Soft errors in advanced computer systems. **IEEE Design & Test of Computers**, IEEE, v. 22, n. 3, p. 258–266, 2005.

BAUMANN, R. C. Radiation-induced soft errors in advanced semiconductor technologies. **IEEE Transactions on Device and materials reliability**, IEEE, v. 5, n. 3, p. 305–316, 2005.

BHAT, S. et al. **Energy models for network-on-chip components**. Master's Thesis — Eindhoven University of Technology, 2005.

BHUVA, B. et al. Multi-cell soft errors at advanced technology nodes. **IEEE Transactions on Nuclear Science**, IEEE, v. 62, n. 6, p. 2585–2591, 2015.

BINDER, D.; SMITH, E. C.; HOLMAN, A. B. Satellite anomalies from galactic cosmic rays. **IEEE Transactions on Nuclear Science**, v. 22, n. 6, p. 2675–2680, 1975.

BLAKE, J. B.; MANDEL, R. On-orbit observations of single event upset in harris hm-6508 1k rams. **IEEE Transactions on Nuclear Science**, v. 33, n. 6, p. 1616–1619, 1986.

BOUDENOT, J.-C. Radiation space environment. In: VELAZCO, R.; FOUILLAT, P.; REIS, R. (Ed.). Radiation Effects on Embedded Systems. Dordrecht: Springer, 2007. p. 1–9.

BRAMNIK, A.; SHERBAN, A.; SEIFERT, N. Timing vulnerability factors of sequential elements in modern microprocessors. In: IEEE. **IOLTS, 2013 IEEE 19th International**. Chania, 2013. p. 55–60.

BRENDLER, L. H. et al. An sram-based multiple event upsets detection method for space applications. In: IEEE. **2022 22th European Conference on Radiation and Its Effects on Components and Systems (RADECS)**. [S.l.], 2022.

BRENDLER, L. H. et al. A mcu-robust interleaved data/detection sram for space environments. In: **2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)**. [S.l.: s.n.], 2023. p. 1–6.

BRENDLER, L. H. et al. A proof-of-concept of a multiple-cell upsets detection method for srams in space applications. **IEEE Transactions on Circuits and Systems I: Regular Papers**, p. 1–11, 2023.

CADENCE, L. . S. **Virtuoso Schematic Editor User Guide**. 2023. Available from Internet: <https://support.cadence.com/apex/techpubDocViewerPage?path=comphelp/comphelpIC23.1/comphelpTOC.html>.

CANNON, M. J. et al. Improving the reliability of tmr with nontriplicated i/o on sram fpgas. **IEEE Transactions on Nuclear Science**, v. 67, n. 1, p. 312–320, 2020.

CARRENO, V. A.; CHOI, G.; IYER, R. Analog-digital simulation of transient-induced logic errors and upset susceptibility of an advanced control system. Washington, USA, 1990.

CH, N. R. et al. Single-event multiple effect tolerant rhbd14t sram cell design for space applications. **IEEE Transactions on Device and Materials Reliability**, v. 21, n. 1, p. 48–56, 2021.

CHATTERJEE, I. From mosfets to finfets-the soft error scaling trends. **RADNEXT tribune**, 2020.

CHATTERJEE, I. et al. Single-event charge collection and upset in 40-nm dual- and triple-well bulk cmos srams. **IEEE Transactions on Nuclear Science**, v. 58, n. 6, p. 2761–2767, 2011.

CHATTERJEE, I. et al. Impact of technology scaling on sram soft error rates. **IEEE Transactions on Nuclear Science**, v. 61, n. 6, p. 3512–3518, 2014.

CLEMENTE, J. A. et al. Impact of the bitcell topology on the multiple-cell upsets observed in vlsi nanoscale srams. **IEEE Transactions on Nuclear Science**, v. 68, n. 9, p. 2383–2391, 2021.

CUMMINGS, D. Enhancements in CMOS Device Simulation for Single-event Effects. Dissertation (Master) — University of Florida, 2010.

DARRACQ, F. et al. Backside seu laser testing for commercial off-the-shelf srams. **IEEE Transactions on Nuclear Science**, v. 49, n. 6, p. 2977–2983, 2002.

DAWOUD, S.; PEPLOW, R. **Digital system design-use of microcontroller**. [S.l.]: Taylor & Francis, 2010.

DEVAL, Y.; LAPUYADE, H.; RIVET, F. Design of cmos integrated circuits for radiation hardening and its application to space electronics. In: **2019 IEEE 13th International Conference on ASIC (ASICON)**. [S.l.: s.n.], 2019. p. 421–424.

DODD, P.; MASSENGILL, L. Basic mechanisms and modeling of single-event upset in digital microelectronics. **IEEE Transactions on Nuclear Science**, v. 50, n. 3, p. 583–602, 2003.

DODD, P. et al. Impact of technology trends on seu in cmos srams. **IEEE Transactions on Nuclear Science**, v. 43, n. 6, p. 2797–2804, 1996.

DODD, P. et al. Single-event upset and snapback in silicon-on-insulator devices and integrated circuits. **IEEE Transactions on Nuclear Science**, v. 47, n. 6, p. 2165–2174, 2000.

DODD, P. E. et al. Production and propagation of single-event transients in high-speed digital logic ics. **IEEE Transactions on Nuclear Science**, IEEE, v. 51, n. 6, p. 3278–3284, 2004.

DUZELLIER, S. et al. Low energy proton induced see in memories. **IEEE Transactions on Nuclear Science**, v. 44, n. 6, p. 2306–2310, 1997.

ECOFFET, R. In-flight anomalies on electronic devices. In: VELAZCO, R.; FOUILLAT, P.; REIS, R. (Ed.). Radiation Effects on Embedded Systems. Dordrecht: Springer, 2007. p. 31–68.

ENTRENA, L. et al. Set emulation considering electrical masking effects. **IEEE Transactions on Nuclear Science**, IEEE, v. 56, n. 4, p. 2021–2025, 2009.

ESA, D. of G. D. S. **Development of the South Atlantic Anomaly**. 2020. Available from Internet: <https://www.esa.int/Applications/Observing_the_Earth/Swarm/Swarm_probes_weakening_of_Earth_s_magnetic_field>.

ESA, E. S. A. **Types of orbits**. 2020. Available from Internet: <https://www.esa.int/Enabling_Support/Space_Transportation/Types_of_orbits>.

FAMÁ, M.; ESTELA, J. Space environments. In: **Radiation Effects on Integrated Circuits and Systems for Space Applications**. [S.l.]: Springer, 2019. p. 1–11.

FARAUD, E. et al. Investigation on the sel sensitive depth of an sram using linear and two-photon absorption laser testing. **IEEE Transactions on Nuclear Science**, v. 58, n. 6, p. 2637–2643, 2011.

FERLET-CAVROIS, V.; MASSENGILL, L. W.; GOUKER, P. Single event transients in digital cmos—a review. **IEEE Transactions on Nuclear Science**, IEEE, v. 60, n. 3, p. 1767–1790, 2013.

GAVA, J. et al. A lightweight mitigation technique for resource-constrained devices executing dnn inference models under neutron radiation. **IEEE Transactions on Nuclear Science**, v. 70, n. 8, p. 1625–1633, 2023.

GOOR, A. J. V. D. Using march tests to test srams. **IEEE Design & Test of Computers**, IEEE, v. 10, n. 1, p. 8–14, 1993.

GUENZER, C. S.; WOLICKI, E. A.; ALLAS, R. G. Single event upset of dynamic rams by neutrons and protons. **IEEE Transactions on Nuclear Science**, v. 26, n. 6, p. 5048–5052, 1979.

HAMER, A. The South Atlantic Anomaly Is the Bermuda Triangle of Space. 2017. Available from Internet: <https://curiosity.com/topics/the-south-atlantic-anomaly-is-the-bermuda-triangle-of-space-curiosity/>.

HAN, Y. et al. Radiation hardened 12t sram with crossbar-based peripheral circuit in 28nm cmos technology. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 68, n. 7, p. 2962–2975, 2021.

HARAN, A. et al. Single-event upset tolerance study of a low-voltage 13t radiation-hardened sram bitcell. **IEEE Transactions on Nuclear Science**, v. 67, n. 8, p. 1803–1812, 2020.

HEIDEL, D. F. et al. Single-event upsets and multiple-bit upsets on a 45 nm soi sram. **IEEE Transactions on Nuclear Science**, IEEE, v. 56, n. 6, p. 3499–3504, 2009.

HUBERT, G.; ARTOLA, L.; REGIS, D. Impact of scaling on the soft error sensitivity of bulk, fdsoi and finfet technologies due to atmospheric radiation. **Integration**, Elsevier, v. 50, p. 39–47, 2015.

IBE, E. et al. Impact of scaling on neutron-induced soft error in srams from a 250 nm to a 22 nm design rule. **IEEE Transactions on Electron Devices**, v. 57, n. 7, p. 1527–1538, 2010.

JIANG, J. et al. Quadruple cross-coupled latch-based 10t and 12t sram bit-cell designs for highly reliable terrestrial applications. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 66, n. 3, p. 967–977, 2019.

JOSHI, R.; KIM, K.; KANJ, R. Finfet sram design. In: \_\_\_\_\_. **Nanoelectronic Circuit Design**. New York, NY: Springer New York, 2011. p. 55–95. ISBN 978-1-4419-7609-3. Available from Internet: <https://doi.org/10.1007/978-1-4419-7609-3_3>.

KANG, S. M.; LEBLEBICI, Y. CMOS digital integrated circuits. [S.l.]: McGraw-Hill, 2003.

KASTENSMIDT, F. L.; CARRO, L.; REIS, R. A. da L. Fault-tolerance techniques for SRAM-based FPGAs. Boston, MA: Springer, 2006.

KOGA, R. et al. Single ion induced multiple-bit upset in idt 256k srams. In: **RADECS 93. Second European Conference on Radiation and its Effects on Components and Systems (Cat. No.93TH0616-3)**. [S.l.: s.n.], 1993. p. 485–489.

KOGA, R. et al. Single-word multiple-bit upsets in static random access devices. **IEEE Transactions on Nuclear Science**, v. 40, n. 6, p. 1941–1946, 1993.

KUENTZER, F. A.; KRSTIC, M. Soft error detection and correction architecture for asynchronous bundled data designs. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 67, n. 12, p. 4883–4894, 2020.

LAPRIE, J.-C. Dependable computing and fault-tolerance. **Digest of Papers FTCS-15**, p. 2–11, 1985.

LI, H. et al. Design of high-reliability memory cell to mitigate single event multiple node upsets. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 68, n. 10, p. 4170–4181, 2021.

LI, Y. et al. Double cell upsets mitigation through triple modular redundancy. **Microelectronics Journal**, v. 96, p. 104683, 2020. ISSN 0026-2692. Available from Internet: <https://www.sciencedirect.com/science/article/pii/S0026269218309595>.

LIOU, K.-N. **An introduction to atmospheric radiation**. San Diego, CA: Elsevier, 2002.

LIU, B. et al. Temperature dependency of charge sharing and mbu sensitivity in 130-nm cmos technology. **IEEE Transactions on Nuclear Science**, IEEE, v. 56, n. 4, p. 2473–2479, 2009.

LOVELESS, T. D. et al. Neutron- and proton-induced single event upsets for d- and dice-flip/flop designs at a 40 nm technology node. **IEEE Transactions on Nuclear Science**, v. 58, n. 3, p. 1008–1014, 2011.

MAIZ, J. et al. Characterization of multi-bit soft error events in advanced srams. In: **IEEE International Electron Devices Meeting 2003**. [S.l.: s.n.], 2003. p. 21.4.1–21.4.4.

MAQBOOL, S. System-level mitigation of sefis in data handling architectures, a solution for small satellites. 2005.

MARTIN, R. C. et al. The size effect of ion charge tracks on single event multiple-bit upset. **IEEE Transactions on Nuclear Science**, v. 34, n. 6, p. 1305–1309, 1987.

MASSENGILL, L. et al. Single-event charge enhancement in soi devices. **IEEE Electron Device Letters**, v. 11, n. 2, p. 98–99, 1990.

MAY, T. et al. Dynamic fault imaging of vlsi random logic devices. In: IEEE. **22nd International Reliability Physics Symposium**. Las Vegas, NV, 1984. p. 95–108.

MAY, T. C.; WOODS, M. H. A new physical mechanism for soft errors in dynamic memories. In: IEEE. **16th International Reliability Physics Symposium**. San Diego, CA, 1978. p. 33–40.

MCDONALD, F. B. Cosmic-ray modulation in the heliosphere: a phenomenological study. **Space Science Reviews**, Springer, v. 83, n. 1-2, p. 33–50, 1998.

MEDEROS, L. F. O. **Study and development of low power consumption SRAMs on 28 nm FD-SOI CMOS process**. Thesis (PhD) — Universidade Federal do Rio de Janeiro, 2017.

MEINHARDT, C. **Variabilidade em FinFETs**. Thesis (PhD) — PPGC/Universidade Federal do Rio Grande do Sul, 2014.

MELINGER, J. et al. Critical evaluation of the pulsed laser method for single event effects testing and fundamental studies. **IEEE Transactions on Nuclear Science**, v. 41, n. 6, p. 2574–2584, 1994.

MESSENGER, G. Collection of charge on junction nodes from ion tracks. **IEEE Transactions on Nuclear Science**, IEEE, v. 29, n. 6, p. 2024–2031, 1982.

MUNTEANU, D.; AUTRAN, J.-L. Modeling and simulation of single-event effects in digital devices and ics. **IEEE Transactions on Nuclear Science**, IEEE, v. 55, n. 4, p. 1854–1878, 2008.

MUSSEAU, O. et al. Laser probing of bipolar amplification in 0.25-μm mos/soi transistors. **IEEE Transactions on Nuclear Science**, v. 47, n. 6, p. 2196–2203, 2000.

MÉSZÁROS, P.; RAZZAQUE, S.; WANG, X. Y. **Cosmic ray physics**. 2015. Available from Internet: <http://www2.astro.psu.edu/users/nnp/cr.html>.

NARASIMHAM, B. et al. Scaling trends and bias dependence of the soft error rate of 16 nm and 7 nm finfet srams. In: **2018 IEEE International Reliability Physics Symposium (IRPS)**. [S.l.: s.n.], 2018. p. 4C.1–1–4C.1–4.

NASA, G. S. F. C. **Magnificent CME Erupts on the Sun**. 2012. Available from Internet: <https://www.flickr.com/photos/24662369@N07/7931831962>.

NASA, S. H. O. **Staring Into the Sun**. 2008. Available from Internet: <https://www.nasa.gov/multimedia/imagegallery/image_feature_588.html>.

NEALE, A.; SACHDEV, M. Neutron radiation induced soft error rates for an adjacent-ecc protected sram in 28 nm cmos. **IEEE Transactions on Nuclear Science**, v. 63, n. 3, p. 1912–1917, 2016.

NOAA. **National Geophysical Data Center**. 2015. Available from Internet: <https://www.ngdc.noaa.gov/ngdcinfo/onlineaccess.html>.

OLSON, B. D. et al. Simultaneous single event charge sharing and parasitic bipolar conduction in a highly-scaled sram design. **IEEE Transactions on Nuclear Science**, IEEE, v. 52, n. 6, p. 2132–2136, 2005.

ORSHANSKY, M.; NASSIF, S. R.; BONING, D. Introduction. **Design for Manufacturability and Statistical Design: A Constructive Approach**, Springer, p. 1–8, 2008.

PICKEL, J. C.; BLANDFORD, J. T. Cosmic ray induced errors in mos memory cells. **IEEE Transactions on Nuclear Science**, v. 25, n. 6, p. 1166–1171, 1978.

PRADHAN, D. K. et al. **Fault-tolerant computer system design**. USA: Prentice-Hall Englewood Cliffs, 1996.

PRASAD, G. et al. Double-node-upset aware sram bit-cell for aerospace applications. **Microelectronics Reliability**, v. 133, p. 114526, 2022. ISSN 0026-2714. Available from Internet: <https://www.sciencedirect.com/science/article/pii/S0026271422000506>.

RABAEY, J. M.; CHANDRAKASAN, A. P.; NIKOLIC, B. **Digital integrated circuits**. [S.l.]: Prentice hall Englewood Cliffs, 2002.

RADAELLI, D. et al. Investigation of multi-bit upsets in a 150 nm technology sram device. **IEEE Transactions on Nuclear Science**, v. 52, n. 6, p. 2433–2437, 2005.

RAGHURAM, C. N.; GUPTA, B.; KAUSHAL, G. Double node upset tolerant rhbd15t sram cell design for space applications. **IEEE Transactions on Device and Materials Reliability**, v. 20, n. 1, p. 181–190, 2020.

SEIFERT, N. et al. Soft error susceptibilities of 22 nm tri-gate devices. **IEEE Transactions on Nuclear Science**, IEEE, v. 59, n. 6, p. 2666–2673, 2012.

SEIFERT, N. et al. Radiation-induced soft error rates of advanced cmos bulk devices. In: **2006 IEEE International Reliability Physics Symposium Proceedings**. [S.l.: s.n.], 2006. p. 217–225.

SEXTON, F. W. Destructive single-event effects in semiconductor devices and ics. **IEEE Transactions on Nuclear Science**, IEEE, v. 50, n. 3, p. 603–621, 2003.

SIEGLE, F. et al. Mitigation of radiation effects in sram-based fpgas for space applications. **ACM Computing Surveys (CSUR)**, ACM New York, NY, USA, v. 47, n. 2, p. 1–34, 2015.

SIERAWSKI, B. D. et al. Muon-induced single event upsets in deep-submicron technology. **IEEE Transactions on Nuclear Science**, IEEE, v. 57, n. 6, p. 3273–3278, 2010.

SILVA, F. et al. Extended matrix region selection code: An ecc for adjacent multiple cell upset in memory arrays. **Microelectronics Reliability**, v. 106, p. 113582, 2020. ISSN 0026-2714. Available from Internet: <https://www.sciencedirect.com/science/article/pii/S0026271419302835>.

SIMIONOVSKI, A. **Sensor de corrente transiente para detecção do SET com célula de memória dinâmica**. Dissertation (Master) — PPGEE/Universidade Federal do Rio Grande do Sul, 2012.

SINGH, J.; MOHANTY, S. P.; PRADHAN, D. K. **Robust SRAM designs and analysis**. [S.l.]: Springer Science & Business Media, 2012.

STASSINOPOULOS, E.; RAYMOND, J. P. The space radiation environment for electronics. **Proceedings of the IEEE**, IEEE, v. 76, n. 11, p. 1423–1442, 1988.

TAN, C. et al. General efficient tmr for combinational circuit hardening against soft errors and improved multi-objective optimization framework. **IEEE Transactions on Circuits and Systems I: Regular Papers**, v. 68, n. 7, p. 3044–3057, 2021.

TAUR, Y. et al. Cmos scaling into the nanometer regime. **Proceedings of the IEEE**, IEEE, v. 85, n. 4, p. 486–504, 1997.

TELIKEPALLI, A. Power vs. performance: The 90 nm inflection point. Xilinx White Paper, v. 223, 2005.

TNA, S. E. M. **Testing at the Speed of Light: The State of U.S. Electronic Parts Space Radiation Testing Infrastructure**. Washington, DC: The National Academies Press, 2018.

VELAZCO, R.; FOUILLAT, P.; REIS, R. **Radiation effects on embedded systems**. Dordrecht: Springer Science & Business Media, 2007.

VLAGKOULIS, V. et al. Configuration memory scrubbing of sram-based fpgas using a mixed 2-d coding technique. **IEEE Transactions on Nuclear Science**, v. 69, n. 4, p. 871–882, 2022.

WALLMARK, J. T.; MARCUS, S. M. Minimum size and maximum packing density of nonredundant semiconductor devices. **Proceedings of the IRE**, v. 50, n. 3, p. 286–298, 1962.

WALT, M. **Introduction to geomagnetically trapped radiation**. [S.l.]: Cambridge University Press, 2005.

WANG, F.; AGRAWAL, V. D. Single event upset: An embedded tutorial. In: **21st International Conference on VLSI Design (VLSID 2008)**. [S.l.: s.n.], 2008. p. 429–434.

WANG, J. et al. Study of seu sensitivity of sram-based radiation monitors in 65-nm cmos. **IEEE Transactions on Nuclear Science**, v. 68, n. 5, p. 913–920, 2021.

WEAVER, H. et al. Ram cell recovery mechanisms following high-energy ion strikes. **IEEE Electron Device Letters**, v. 8, n. 1, p. 7–9, 1987.

WESTE, N. H.; HARRIS, D. **CMOS VLSI design: a circuits and systems perspective**. [S.l.]: Pearson Education India, 2015.

WYATT, R. C. et al. Soft errors induced by energetic protons. **IEEE Transactions on Nuclear Science**, v. 26, n. 6, p. 4905–4910, 1979.

ZHANG, Z. et al. Extrapolation method of on-orbit soft error rates of edac sram devices from accelerator-based tests. **IEEE Transactions on Nuclear Science**, v. 65, n. 11, p. 2802–2807, 2018.

ZIEGLER, J. F. Terrestrial cosmic rays. **IBM Journal of Research and Development**, IBM, v. 40, n. 1, p. 19–39, 1996.

ZIEGLER, J. F. et al. Ibm experiments in soft fails in computer electronics (1978–1994). **IBM Journal of Research and Development**, v. 40, n. 1, p. 3–18, 1996.

ZIMPECK, A.; MEINHARDT, C.; BUTZEN, P. Análise do comportamento de portas lógicas cmos com falhas stuck-on em nanotecnologia. v. 1, 02 2014.

ZIMPECK, A. L. et al. Impact of different transistor arrangements on gate variability. **Microelectronics Reliability**, Elsevier, v. 88, p. 111–115, 2018.

### ANNEX A — LIST OF PUBLICATIONS

The following papers were published by the PhD student during the PhD:

#### **PUBLISHED JOURNAL PAPERS:**

1. **L. H. Brendler**, H. Lapuyade, Y. Deval, F. Darracq, F. Fauquet, R. Reis, F. Rivet, "A Proof-of-Concept of a Multiple-Cell Upsets Detection Method for SRAMs in Space Applications," in IEEE Transactions on Circuits and Systems I: Regular Papers. doi: 10.1109/TCSI.2023.3310876.

#### **PUBLISHED CONFERENCE PAPERS:**

1. **L. H. Brendler**, H. Lapuyade, Y. Deval, R. Reis and F. Rivet, "An SRAM-based Multiple Event Upsets Detection Method for Space Applications," 2022 22nd European Conference on Radiation and Its Effects on Components and Systems (RADECS), Venice, Italy, 2022. (*in press*)

2. **L. H. Brendler**, H. Lapuyade, Y. Deval, R. Reis and F. Rivet, "A MCU-robust Interleaved Data/Detection SRAM for Space Environments," 2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Foz do Iguacu, Brazil, 2023, pp. 1-6. doi: 10.1109/ISVLSI59464.2023.10238542.

3. **L. H. Brendler**, H. Lapuyade, Y. Deval, R. Reis and F. Rivet, "A Tool for Automatic Radiation-Hardened SRAM Layout Generation," 2023 IEEE 30th International Conference on Electronics, Circuits and Systems (ICECS), Istanbul, Turkey, 2023. doi: 10.1109/ICECS58634.2023.10382845.

#### **WORKSHOP PRESENTATIONS:**

1. **L. H. Brendler**, H. Lapuyade, Y. Deval, R. Reis and F. Rivet, "An SRAM-based Multiple Event Upsets Detection Method for Space Applications," 12th IEEE CASS Rio Grande do Sul Workshop (CASSW), Porto Alegre, Brazil, 2022. (*Best Poster Award*)

The following paper was published by the PhD student during the PhD and is not directly related to the topic of this thesis:

1. **L. H. Brendler**, A. L. Zimpeck, F. L. Kastensmidt, C. Meinhardt and R. Reis, "Voltage Scaling Influence on the Soft Error Susceptibility of a FinFET-based Circuit," 2021 IEEE 12th Latin America Symposium on Circuits and System (LASCAS), Arequipa, Peru, 2021, pp. 1-4. doi: 10.1109/LASCAS51355.2021.9459127.

The PhD student also published the following papers in project collaboration:

1. B. B. Sandoval, L. H. Brendler, A. L. Zimpeck, F. L. Kastensmidt, R. Reis and C. Meinhardt, "Exploring Gate Mapping and Transistor Sizing to Improve Radiation Robustness: A C17 Benchmark Case-study," 2021 IEEE 22nd Latin American Test Symposium (LATS), Punta del Este, Uruguay, 2021, pp. 1-6. doi: 10.1109/LATS53581.2021.9651798.

2. B. B. Sandoval, L. H. Brendler, F. L. Kastensmidt, R. Reis, A. L. Zimpeck, R. B. Schvittz, C. Meinhardt, "Impact on Radiation Robustness of Gate Mapping in FinFET Circuits under Work-function Fluctuation," 2023 IEEE International Symposium on Circuits and Systems (ISCAS), Monterey, CA, USA, 2023, pp. 1-5. doi: 10.1109/ISCAS46773.2023.10181528.

3. C. M. Marques, L. H. Brendler, Frédéric Wrobel, Alexandra L. Zimpeck, Walter E. C. Bartra, Paulo F. Butzen, Cristina Meinhardt, "A Detailed Electrical Analysis of SEE on 28 nm FDSOI SRAM Architectures," 2023 36th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI), Rio de Janeiro, Brazil, 2023, pp. 1-6. doi: 10.1109/SBCCI60457.2023.10261665.