Sterpone and C. Nowadays, due to the shrinking technology process, integrated circuits have also become sensitive to other kinds of radiation particles, such as neutrons, which exist at the earth's surface and affect ground-level safety-critical applications such as automotive or medical systems. Analyzing and hardening digital devices against soft errors raises the final cost, because fault injection campaigns and radiation tests are time-consuming, and reduces system performance, because redundancy-based mitigation solutions must be inserted.
The main industrial problem is locating the critical elements in the circuit so that optimal mitigation techniques can be applied. This tutorial presents and discusses the solutions currently available for assessing and implementing fault tolerance in digital circuits, not only when the unique design description is provided but also at the component level, especially when Commercial-off-the-shelf (COTS) devices are selected.
Costenaro and L. Electrical, logical and temporal masking prevent the majority of SETs from becoming functional failures. Current work on SET analysis starts from a gate-level circuit representation; however, in an industrial design cycle, by the time a gate-level netlist is available it is too late to make design changes. The SET sensitivity of the cell library and the masking characteristics of standard combinatorial design blocks are pre-characterized and stored in compact models.
The SET sensitivity of a complex circuit is then calculated by decomposing it into blocks and combining the compact SET models.
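As a rough illustration of combining compact SET models, the sketch below treats a circuit as a simple linear chain of blocks. It is not the authors' method: the `Block` type, its `set_rate` and `pass_prob` fields, the chain topology and the example numbers are all assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Block:
    """Hypothetical compact SET model of one pre-characterized block."""
    name: str
    set_rate: float   # SETs generated at the block output per unit time (assumed)
    pass_prob: float  # probability that an incoming SET survives masking (assumed)


def chain_set_rate(blocks: Sequence[Block]) -> float:
    """Rate of SETs that reach the end of a linear chain of blocks."""
    total = 0.0
    downstream = 1.0  # probability of surviving every block after the current one
    for block in reversed(blocks):
        total += block.set_rate * downstream
        downstream *= block.pass_prob
    return total


# Illustrative numbers only, not measured data.
chain = [
    Block("adder", set_rate=10.0, pass_prob=0.5),
    Block("mux", set_rate=4.0, pass_prob=0.25),
    Block("buffer", set_rate=2.0, pass_prob=1.0),
]
print(chain_set_rate(chain))
```

A real circuit is a graph rather than a chain, so a practical tool would presumably propagate rates along every path; the chain keeps the arithmetic easy to follow.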
Hao Xie, Li Chen, A. Evans, Shi-Jie Wen and R. In this paper, we present an approach for mitigating the effects of faults in combinatorial logic through the selective addition of redundant logic. This approach can be applied to a generic digital circuit, protects against multiple fault models and offers a trade-off between area and fault coverage. We propose a state-aware analysis methodology that improves the accuracy of Soft Error Rate data for individual sequential instances based on the circuit and application. Furthermore, we exploit the intrinsic imbalance between the SEU susceptibility of different flip-flop states to implement a low-cost SER improvement strategy.
We apply de-rating techniques to accurately evaluate their contribution to the overall flip-flop SEE sensitivity. Rohani, H. Kerkhoff, E. However, the pulse length in this model contributes critically to the accuracy and validity of the rectangular pulse model. The work presented in this paper develops two approaches for determining the pulse length of the rectangular pulse model used for Single Event Transient (SET) faults.
The first determination approach is derived from radiation testing together with transistor-level SET analysis tools. The results show that applying these two pulse-length determination approaches to the rectangular pulse model makes the fault injection results converge much faster (up to sixteen times) than other conventional approaches.
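To see why the pulse length matters in a rectangular pulse model, the following minimal sketch checks whether an injected pulse overlaps the latching window of a flip-flop. It does not reproduce either of the paper's determination approaches; the function names, the picosecond units and the setup/hold window model are assumptions chosen for illustration.

```python
def is_latched(start_ps: int, width_ps: int, period_ps: int,
               setup_ps: int, hold_ps: int) -> bool:
    """True if the pulse [start, start + width] overlaps a latching window.

    Assumptions: clock edges sit at integer multiples of period_ps, and the
    latching window of edge k is [k*period - setup, k*period + hold].
    """
    # Index of the first clock edge whose window ends at or after the pulse
    # start (integer ceiling of (start - hold) / period).
    k = -((hold_ps - start_ps) // period_ps)
    # The pulse is captured if that window opens before the pulse ends.
    return k * period_ps - setup_ps <= start_ps + width_ps


def latch_fraction(width_ps: int, period_ps: int,
                   setup_ps: int, hold_ps: int) -> float:
    """Fraction of injection times within one clock period that get latched."""
    hits = sum(is_latched(t, width_ps, period_ps, setup_ps, hold_ps)
               for t in range(period_ps))
    return hits / period_ps


# Sweep a few pulse widths for a hypothetical 1 ns clock.
for width in (100, 200, 400):
    print(width, latch_fraction(width, period_ps=1000, setup_ps=50, hold_ps=50))
```

In this simplified model the fraction of latched injections grows with the pulse width, so an inaccurate pulse length directly skews the temporal masking seen by a fault injection campaign.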
Pontarelli, M. Ottavi, A. Evans and S. The proposed approach is described in detail and its performance results are presented.
The advantages of the proposed method are that no modifications to the TCAM device are required, the checking is done on-line and the approach has low power and area overheads. Nicolaidis, R. Aitken, B. Aktan and O. As transistor geometries shrink, the number of physical failure mechanisms is increasing while at the same time the number of transistors per chip is still growing.
The rollout of new services is pushing compute demands both in handheld devices and in the data center, which is driving up complexity and the level of integration. People are becoming critically dependent on mobile services and expect high availability. Looking forward to the deployment of the Internet of Things (IoT), where processors and routers will be embedded in billions of end-points, we are only going to see an increased demand for reliable computing. In this session, we bring together three different industrial perspectives on reliability.
The first looks at the end-points, the second looks at the servers and the last looks at the economic drivers for reliability and the demand for new EDA tools for reliability analysis. In the first talk, Rob Aitken from ARM will discuss the reliability challenges in mobile applications. As mobile systems continue to increase in size and complexity, and user requirements are also becoming more stringent, it is important for designers of mobile systems to be aware of reliability issues, and to adapt their methodologies accordingly.
This talk discusses the issues involved, from latent defects, through soft errors, aging and wearout, and shows how to consider these as part of the design process, how to quantify their effects, and how to mitigate them through design changes.
In the second presentation, Burcin Aktan from Intel is going to discuss the evolution of the reliability features that are found in server applications. With so many processing units packed in data centers, the reliability requirements on an individual device are growing, especially with integrated memory controllers and very high bandwidth data pathways. Finally we will close with remarks on future directions and possible research areas.
In the final presentation, Olivier Lauzeral from iROC Technologies will discuss the importance of methodologies for the reliability analysis of complex SoCs. There is an inherent cost to adding reliability features in a complex IC, and designers need to be able to make informed decisions about how much hardware to allocate for mitigation (redundancy, error correction, repair).
A prerequisite for making such choices is clearly defined targets, and this requires an economic framework where the cost of failures is understood. Once the reliability targets for a system and individual devices are established, there is a need for EDA tools which allow designers to compute the failure rate and failure modes within the device. This analysis must include all failure mechanisms (radiation effects, lifetime effects, manufacturing defects) and take into account the relevant de-ratings between faults and observed errors.
This new EDA infrastructure is key for designers to make effective trade-offs and arrive at a cost-effective design. Nicolaidis, S. Wen and T. The identification of the optimal set of flip-flops to protect typically requires compute-intensive fault-injection campaigns. We present new techniques which group similar flip-flops into clusters to significantly reduce the number of fault injections. The number of required fault injections can be significantly lower than the total number of flip-flops and in one industrial design with over , flip-flops, by simulating only 2, fault injections, the technique identified a set of 4.
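The idea of clustering flip-flops to cut the number of fault injections can be sketched as follows. The similarity rule used here (bits of the same register, recognized by a trailing `[index]` in the name) is purely an assumption for illustration, as are the function names and the `inject` callback; the abstract does not say how the authors define similarity.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, List


def cluster_key(ff_name: str) -> str:
    """Assumed similarity rule: bits of the same register share a cluster."""
    return re.sub(r"\[\d+\]$", "", ff_name)


def cluster_flip_flops(ff_names: List[str]) -> Dict[str, List[str]]:
    """Group flip-flop instance names by their cluster key."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for name in ff_names:
        clusters[cluster_key(name)].append(name)
    return dict(clusters)


def estimate_failure_probs(ff_names: List[str],
                           inject: Callable[[str], float]) -> Dict[str, float]:
    """Inject into one representative per cluster and reuse its result."""
    result: Dict[str, float] = {}
    for members in cluster_flip_flops(ff_names).values():
        prob = inject(members[0])  # a single fault-injection run per cluster
        for name in members:
            result[name] = prob
    return result


# Hypothetical design and a stand-in for a real fault-injection simulation.
ffs = ["cpu/pc_reg[0]", "cpu/pc_reg[1]", "cpu/pc_reg[2]",
       "uart/tx_busy", "uart/rx_valid"]
calls: List[str] = []


def fake_inject(name: str) -> float:
    calls.append(name)
    return 0.1  # placeholder failure probability, not real data


probs = estimate_failure_probs(ffs, fake_inject)
print(len(calls), "injections for", len(probs), "flip-flops")
```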
Memory cells designed using eDRAM technology, in addition to being logic-compatible, are variation tolerant and immune to noise present at low supply voltages. However, there are two major causes of concern: the data retention capability, which is worsened by parameter variations and leads to frequent data refreshes with a large dynamic power overhead, and the transient reduction of stored charge, which increases soft-error (SE) susceptibility.
The retention time on-average is improved by 2. Valadimas, Y. Tsiatouhas, A. Arapoyanni and A. In this work, we present a flip-flop oriented soft error detection and correction technique. It exploits a transition detector at the output of the flip-flop for error detection along with an asynchronous local error correction scheme to provide soft error tolerance.
Alternatively, a low cost soft error detection scheme is introduced, which shares a transition detector among multiple flip-flops, while error recovery relies on architectural replay. Alexandrescu and E. The SER study aims at providing relevant information about the circuit behavior in the specified working environment, in terms of Functional Failures rates, criticality and so on. Ultimately, the error mitigation efforts are directed at improving the function of the circuit in the presence of SEE by either reducing the failure occurrence rate or the failure impact.
However, when dealing with SEEs affecting highly sophisticated electronic designs, functional issues are one of the most complex aspects to reliably characterize.
ISBN 13: 9789812389404
This paper aims at proposing and evaluating several fault characterization techniques, meant to approximate the functional failures induced by Single Event Upsets in complex circuits, very early in the design flow. The two main contributions of our efforts consist in a differential fault simulation approach based on standard simulation tools and a novel parallel, SEE-optimized, stand-alone simulation tool. Both methods accurately evaluate the immediate propagation of SEE-induced faults and predict the long-term behavior of the faulty circuit running a specified application.
The works described in this paper also benefit from various optimization techniques targeting lower simulation costs in terms of CPU and man-power while preserving the accuracy of the results. Ultimately, the results of each method compare positively with reference data obtained from an exhaustive fault simulation campaign.
This encouraging outcome suggests that we can reliably obtain highly informative functional error information while spending reasonable resources (CPU, man-power, time). Wong, B. Bhuva, A. However, RAS evaluation at the system-level requires precise mapping between component failure modes, their system failure signatures and system reliability requirements. In this paper, RAS analysis carried out on internet switches in a top-down hierarchical fashion is presented.
Results show availability of failure classification at a lower-level of design allows for better fault management and improved RAS metrics at the system-level. A hierarchical modeling format is proposed to standardize the reporting of component failure modes to improve the system level modeling of RAS. Vilchis, R. Venkatraman, E. The analysis integrates tightly with the design flow and provides static and dynamic de-rating algorithms. Wen, D. This language enables EDA tools to analyze reliability models and to compute the failure rates for complex systems. A formal language makes it possible for suppliers and consumers to exchange reliability information in a consistent fashion and to use this information to build accurate reliability models.
The RIIF language is a general purpose reliability modeling language and is not tied to a specific application domain or implementation technology. Embedded SRAM instances are critical contributors to the overall Soft Error Rate of the system, requiring a careful consideration of the reliability aspects and adequate sizing of the error mitigation capabilities. While error detecting and correcting codes are widely available and particularly effective against most types of Single Event Effects, Multiple Bit Upsets and progressive error accumulation may defeat the error correction capabilities of standard SECDED codes.
Accordingly, the paper presents an overall approach to the structural and functional SER analysis of the memory instances in addition to error mitigation efficiency estimation. Moreover, intrinsic, nominal, SER figures are not a realistic indicator of the memory behavior for a given application. We propose instead, an opportunity window metric, associated to the notion of data lifetime in the memory, as extracted from functional simulations. Lastly, based on the opportunity window figures, targeted and efficient fault simulation campaigns can be prepared to estimate high-level functional failures induced by Single Events.
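One plausible reading of an opportunity window tied to data lifetime is the share of simulated time during which a memory word holds data that will still be read. The sketch below computes that share from an access trace; the trace format, the function name and the rule that only intervals ending in a read count as vulnerable are assumptions, not the paper's definition.

```python
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, int]  # (time, "W" or "R", address); assumed trace format


def opportunity_windows(events: Iterable[Event],
                        total_time: int) -> Dict[int, float]:
    """Fraction of total_time during which each address holds live data.

    An upset only matters if the corrupted word is read afterwards, so only
    intervals that end with a read are counted as vulnerable.
    """
    vulnerable: Dict[int, int] = {}
    last_access: Dict[int, int] = {}
    for time, op, addr in sorted(events):
        vulnerable.setdefault(addr, 0)
        if op == "R" and addr in last_access:
            vulnerable[addr] += time - last_access[addr]
        # A write starts a new lifetime; a read extends the current one.
        last_access[addr] = time
    return {addr: ticks / total_time for addr, ticks in vulnerable.items()}


# Hypothetical trace taken from a functional simulation of 100 time units.
trace = [(0, "W", 0), (10, "R", 0), (30, "R", 0), (50, "W", 0), (60, "R", 0),
         (20, "W", 1)]
print(opportunity_windows(trace, total_time=100))
```

Address 1 is written but never read in this trace, so its window is zero, which is the kind of word a targeted fault simulation campaign could skip.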
The overall memory SER evaluation aims at assisting the designers to improve the performance of the design and to document the reliability figures of the system. The work environment may cause a myriad of distinctive transient pulses in various cell types that are used in widely different configurations. We present practical methods to help characterize the standard cell library using dedicated tools and results from radiation testing.
Synopsis: This book provides a detailed treatment of radiation effects in electronic devices, including effects at the material, device, and circuit levels. Review: "Ron Schrimpf and Dan Fleetwood are world renowned experts in radiation effects." Published by World Scientific.
Filtering prevents a signal from passing when one of the redundant signals is wrong (by comparing the values of the redundant signals), and voting circuits select the correct signal from the majority among several (three or more) redundant signals.
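The two redundancy schemes just described, voting and filtering, can be sketched in a few lines. The function names are illustrative, and the choice to hold the previous output when the filter blocks a signal is an assumption; the text only says the signal is prevented from passing.

```python
from collections import Counter
from typing import Sequence


def majority_vote(signals: Sequence[int]) -> int:
    """Return the value carried by the majority of the redundant signals."""
    if len(signals) < 3:
        raise ValueError("voting needs three or more redundant signals")
    value, count = Counter(signals).most_common(1)[0]
    if count <= len(signals) // 2:
        raise ValueError("no majority among the redundant signals")
    return value


def filter_signal(a: int, b: int, previous: int) -> int:
    """Pass the value only when both redundant copies agree.

    On disagreement the previous output is held (assumed behavior).
    """
    return a if a == b else previous


print(majority_vote([1, 0, 1]))          # one corrupted copy is outvoted
print(filter_signal(1, 0, previous=0))   # disagreement: output is held
```

Voting corrects a single wrong copy at the cost of at least three copies, while filtering needs only two copies but can only block the error, not correct it.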
However, the application of a corresponding error correction to logic circuits is very limited and application specific e. State-of-the-art layout techniques for soft-error-hard design mainly consist of simple spacing and sizing, and of adding additional contacts. A radiation-generated single-event soft error (SEE) occurs when the charge, generated in the semiconductor material by one or more e. This leads to current pulses on the circuit nets connected to these contact areas, which in turn cause voltage pulses in the circuit that can upset a sequential element (latch, flip-flop) or propagate through combinational logic and be latched in as errors at the next sequential element in the circuit.
This invention comprises a unique new layout method, which takes advantage of the overall circuit response to a single event effect, and, furthermore, comprises circuit cells, with layout, which are protected against soft errors.
The method uses an arrangement of critical contact areas in such a way that single event pulses in the circuit that are generated on multiple nodes act to oppose each other and hence cancel or greatly reduce the effect of the single event. In the case that a primary and a secondary circuit are used to maintain or process the signal in a circuit, additional rules, described in section 4, are used so that no possibility remains that an error is generated in both the primary and the secondary circuit, and hence the combination of primary and secondary circuit will be fully error free.
Table 1. The state of the nodes in a circuit that uses a primary (nodes n1, n2) and a secondary (nodes n3, n4) circuit for storage or processing of the state. A first preferred layout arrangement for the layout of the DICE latch cell. Any cyclic simultaneous permutation of the n and p nodes will be equivalent and part of the invention. The MOSFETs can be placed in separate active areas, or the adjacent n and p nodes can be placed in the same active area. The MOSFET sources can be placed in the line of the drains or in the direction vertical to the line of the drain nodes.
The well contacts can be placed on either side only, or can also surround the adjacent node pairs. The shapes numbered 60-65 show the masks in the layout of the integrated circuit which are used to form the contact areas. An integrated circuit is designed by creating a complete representation of the circuit in a computer in the form of digital information.
This representation consists of geometric shapes that make up the layout of the circuit, along with some information about how these shapes are organized and how they will be used in the manufacturing of the circuit. This digital representation of the circuit has a unique one-to-one physical transformation to the final integrated circuit.
The transformation is realized by passing the data of the tape-out to manufacturing e. The geometrical shapes in the circuit representation are organized in masks, where each mask comprises a set of geometric shapes and is applied to define the structures and shapes on the final integrated circuit that are created in a certain step in the manufacturing process. A final shape in the integrated circuit can be defined by one mask or by a combination of two or more masks, depending on how the masks are used in the manufacturing process.
In FIG. The active mask 60 is used in an etch step to define active areas where contacts and devices can be formed in the semiconductor substrate. After a deposition of additional material layers (insulator material and gate material), the poly mask 61 is used in an etch step to etch the additional material layers. The device contact areas (source and drain) are the active areas that are not covered by material after the etching using the poly mask 61. The contact areas are therefore defined by the difference of the active mask and the poly mask.
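The mask arithmetic described here can be illustrated with plain set operations. In this minimal sketch each mask is a set of unit grid cells; the `rect` helper and the example geometry are assumptions for illustration and do not correspond to any figure in the text.

```python
from typing import Set, Tuple

Cell = Tuple[int, int]


def rect(x0: int, y0: int, x1: int, y1: int) -> Set[Cell]:
    """Grid cells covered by the half-open rectangle [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


# Illustrative geometry: a horizontal active strip crossed by one poly line.
active = rect(0, 0, 6, 2)   # active mask (60)
poly = rect(2, -1, 4, 3)    # poly mask (61)

# Contact areas (source and drain): active regions not covered by poly.
contact_areas = active - poly
# Region where poly lies inside the active area.
gate_region = active & poly

print(len(contact_areas), "contact cells,", len(gate_region), "gate cells")
```

The poly line splits the active strip into two separate contact areas, one on each side of the gate, which matches the usual source/gate/drain arrangement of a MOSFET.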
The poly mask regions which are inside the active regions will form the device gate contacts. A gate contact is not a connection to the substrate, but to the gate material, and is always defined as the region between two source and drain device contact areas. The type of a contact area is defined by other masks which define where certain type of doping impurities such as boron, phosphorus, arsenic, etc. The n-well mask 62 defines regions where an n-type doping will be implanted into the substrate.
These regions are the so-called n-wells, and the contact regions inside the n-wells will become either device source and drain contact areas of p-type MOSFET devices or so-called well-ties (direct contacts to the n-well doping region). The regions outside of the n-well mask 62 will be doped with a p-type doping and form the so-called p-wells. The contact regions inside the p-wells will become either device source and drain contact areas of n-type MOSFET devices or so-called well-ties (direct contacts to the p-well doping region). The semiconductor devices in a MOS technology are fully defined by the basic masks 60, 61, 62, 63, and In some technologies additional masks may be used for some special manufacturing step related to the device formation.
However, the contact areas will be fully defined by the geometric shapes in a set of masks in the representation of the integrated circuit. Additional masks are then applied to connect the contact areas with metal lines according to the schematics of the circuit. The contact mask 65 is one of the masks used for this. It defines where a metal contact (a so-called via) is created, which connects a device contact area to a metal line in a first metal layer above the semiconductor substrate.
The connections to the gate contacts are defined by mask shapes in the same way as the device source and drain contact areas. Several metal layers are formed above the semiconductor substrate, each with a mask that defines the pattern of the metal lines, and with so-called via masks that define where one metal layer is connected to another metal layer. The creation of the various metal and via masks is referred to as the routing. It simply implements the connections between the devices, which are fully defined by the circuit schematics or netlist that describes how the devices are connected to each other.
The layout in FIG. The notation of the contact areas in FIG. These additional devices can be inserted in the layout in a straightforward manner according to the steps and rules in the layout methodology, such that the basic arrangement of all contact areas connected to nets 1-4 remains the same, and such that all contact areas remain along a line in the layout. Nodes 6a and 6b are connected. The dotted gate adjacent to node 6a may or may not be included (both variants are included in the claims), but p1 and 6a are physically separate.
The layout derives from the layout in FIG. The dotted gate adjacent to nodes 6a and 7a may or may not be included, but the adjacent drain areas are physically separate. The dotted gates adjacent to nodes 6a, 7a, 8a, 9a may or may not be included (both variants are included in the claims), but nodes 6a, 7a, 8a, 9a are physically separate from their adjacent MOSFET drains. For a single event affecting several nodes, the primary latch can only be upset when node 1 is HIGH, and the redundant latch can only be upset when node 1r is LOW.
Hence, any single event that affects both latches can only upset one of the two latches in the BISER configuration and therefore cannot generate an error. In a duplicated inverter where the redundant and primary nodes carry opposite states, error signals on both primary and redundant nodes can be generated if both ndrain0 and pdrain1 are affected (if D is high) or if both ndrain1 and pdrain0 are affected (D low). The nodes are placed such that, if a particle trace goes through two nodes that can cause an error transient on both the primary and the redundant output, the trace also passes through the other nodes and the pulse on one of the nets is suppressed.
For example, consider the trace in the figure: if node 0 is high, the charge collected on ndrain0 will pull node 0 low (error transient) and the charge collected on pdrain1 will pull node 1 high; however, the charge collected on ndrain1 will pull node 1 low, opposing the effect on pdrain1 and keeping node 1 low i.
If node 0 is low, the charge collected on ndrain1 will pull node 1 low (error transient); however, the charge collected at ndrain0 will keep node 0 low i. It should be pointed out that in the general case there will be some pulses on all nodes; however, a full-swing pulse (a transient that can propagate) can only be generated on one, and only one, of the duplicated nodes. The layout apparatus system comprises a computer server A. The computer server A is connected via a network, such as Ethernet, to terminal devices B for input of data and commands.
Inside the computer server, represented as digital information in computer memory, resides a representation of the integrated circuit. In particular, a representation of the geometrical shapes of the layout of the integrated circuit, which fully specify the integrated circuit, resides in memory in A.
Furthermore, the instructions to manipulate the layout shapes based on the rules presented herein reside in the computer memory and can be executed to carry out the manipulation of the layout shapes. The computer server A has an output device where the completed layout for the integrated circuit is output to a storage medium C (also known as tape-out).
The geometric shapes in the layout of the integrated circuit have a unique one-to-one physical transformation to the final integrated circuit. This transformation is realized by passing the data of the tape-out to manufacturing e. The method includes designing a circuit layout of an electronic integrated circuit, the circuit comprising component contact areas, voltage states, and nets, the method being embodied in a data processing apparatus having at least an arithmetic processor and memory.
It can also include designing a mask layout of the integrated circuit, the mask layout based on the circuit layout designed using this method, and storing the mask layout in the data processing apparatus memory. Further, the method of designing a circuit layout of an electronic integrated circuit can further comprise a non-transitory computer-readable medium storing a computer program for causing a computer to perform at least one of the steps in the method.
As shown in FIG. These components are connected to one another via a bus J, which is a signal line to transmit data, so as to be capable of sending and receiving data. Next, the data structure of content is described. It comprises representations of the integrated circuit as is standard in modern electronic design.
In particular it contains the geometric shapes that make up the layout of the circuit. These are organized in masks which will be used, typically in an automated process, to perform the physical transformation of the computer representation of the integrated circuit to the integrated circuit itself.
The geometric shapes in the layout form electronic devices and connections between electronic devices. In particular, a well-defined subset of these shapes forms the contact areas of each of these devices. By manipulating the coordinate positions of these shapes and their sizes (stored as digital information in RAM H or on the disk L), the position and shape of the contact areas of the integrated circuit can be manipulated. The instructions to manipulate the geometric shapes in the layout according to the rules described herein also reside in memory in the computer.
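A layout held in memory as shapes grouped by mask, with operations that change a shape's position or size, might look like the minimal sketch below. The `Rect` type, the mask names and the coordinates are hypothetical; real layout databases use richer formats, and nothing here is taken from the apparatus described above beyond the general idea.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned layout shape, in arbitrary grid units."""
    x: int
    y: int
    width: int
    height: int

    def moved(self, dx: int, dy: int) -> "Rect":
        """Return a copy shifted by (dx, dy)."""
        return replace(self, x=self.x + dx, y=self.y + dy)

    def resized(self, dw: int, dh: int) -> "Rect":
        """Return a copy with its width and height changed by (dw, dh)."""
        width, height = self.width + dw, self.height + dh
        if width <= 0 or height <= 0:
            raise ValueError("a shape must keep a positive width and height")
        return replace(self, width=width, height=height)


# Shapes organized by mask, as a stand-in for the in-memory layout.
layout: Dict[str, List[Rect]] = {
    "active": [Rect(0, 0, 6, 2)],
    "poly": [Rect(2, -1, 2, 4)],
}

# Shift the poly line one unit to the right; the contact areas on either
# side of the gate change shape as a consequence.
layout["poly"][0] = layout["poly"][0].moved(1, 0)
print(layout["poly"][0])
```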