This article focuses on enhancing fault tolerance and reliability in electronic systems by providing recommendations for diagnosing and repairing basic semiconductor devices. The study introduces a novel method for diagnosing radio-electronic equipment using a device that localizes faulty elements. The goals of the research include improving fault localization accuracy and reducing repair time. The proposed method is based on the principles of fault isolation and utilizes advanced diagnostic techniques. Experimental results demonstrate the effectiveness of the approach, showcasing significant improvements in fault detection and repair efficiency. The implications of this research suggest that implementing the proposed method can enhance the reliability and maintainability of electronic systems, leading to reduced downtime and improved overall system performance.
Improved fault localization accuracy: The article presents a novel method for diagnosing radio-electronic equipment that focuses on accurately localizing faulty elements, leading to enhanced fault detection and diagnosis precision.
Reduced repair time: The research aims to provide recommendations for repairing basic semiconductor devices, resulting in reduced repair time and improved system availability, ultimately minimizing downtime and improving overall system efficiency.
Enhanced system reliability: By implementing the proposed method and incorporating advanced diagnostic techniques, the article highlights the potential for improving the reliability of electronic systems, leading to increased system performance and reduced maintenance efforts.
Keywords: fault tolerance, reliability, electronic systems, diagnostics, repair.
Some systems are able to keep functioning in the presence of faults or failures. Such systems are called fault-tolerant. Creating reliable, fault-tolerant systems is one of the key tasks of science and technology, and various methods are used to increase fault tolerance and, in turn, reliability. One example is redundancy: ensuring the reliability of an object by using additional means and capabilities beyond the minimum necessary to perform the required functions. Redundancy can be introduced in several forms: structural, temporal, informational, algorithmic, functional, etc.
Passive fault tolerance is used where even short-term interruptions in system operation are unacceptable. Active fault tolerance takes time to detect and localize failures and to reconfigure the system, but it is more economical in terms of redundancy.
Such structures, which rely on a large number of voting circuits, have the disadvantage of reducing performance by 10-15%, but design engineers accept this and compensate for the time costs by other methods.
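The voting mentioned above can be illustrated with a minimal sketch. It assumes a simple majority voter over the outputs of redundant channels; the function name `majority_vote` and the three-channel example are illustrative assumptions, not a scheme described in this article.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of redundant channels.

    Raises ValueError if the list is empty or no value has a strict majority.
    """
    if not outputs:
        raise ValueError("no channel outputs to vote on")
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        raise ValueError("no strict majority among channel outputs")
    return value


# Three redundant channels (assumed); the second one has failed and outputs 0.
print(majority_vote([1, 0, 1]))
```

Every result passes through the voter before it is used, which is one way to picture where the additional time cost of such structures comes from.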
Organizing the repair of a radio-electronic device is in most cases a complex task. Fault detection, localization and elimination are carried out, as a rule, with control and diagnostic measuring instruments. After any repair or restoration work on radio-electronic equipment, a thorough preliminary check of the performance of its individual blocks or assemblies is necessary. In some cases, a step-by-step check of stages or nodes makes it possible to detect defects that were not previously identified and to verify that blocks were replaced correctly.
When diagnosing the main semiconductor devices, it is also necessary to check the passive elements that set the electrical operating modes of the active components. Often, the failure of a passive element is the reason a node built on active devices stops working. Before making a final decision about replacement, make sure that the board's printed conductors and passive elements are in good condition.
As a general recommendation for repair work, the causes that could have led to the defect or loss of performance should be analyzed comprehensively. Once the cause is identified, the sequence of events that led to the failure should be reconstructed; this makes it easier to predict which elements may have malfunctioned and to localize them. If elements need to be replaced, original components or the closest functional analogues should be used. When selecting them, the parameters most critical for operation in the specific conditions are considered first. These may include thermal conditions, as well as the maximum current or voltage values of the device used. A faulty node can be localized by the external signs of the defect, which in turn makes it possible to outline an action plan for identifying the malfunction.
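The selection of a functional analogue by its most critical parameters can be sketched as a simple rating comparison. This is only an illustration under assumptions: the `Ratings` fields, the function name, and all numeric values are hypothetical and are not taken from the article or from any datasheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ratings:
    """Limiting parameters of a component (hypothetical fields)."""
    max_voltage_v: float
    max_current_a: float
    max_junction_temp_c: float


def is_acceptable_analogue(original: Ratings, candidate: Ratings) -> bool:
    """True if the candidate meets or exceeds every limiting rating of the original."""
    return (
        candidate.max_voltage_v >= original.max_voltage_v
        and candidate.max_current_a >= original.max_current_a
        and candidate.max_junction_temp_c >= original.max_junction_temp_c
    )


# Illustrative values only.
original = Ratings(max_voltage_v=40.0, max_current_a=0.2, max_junction_temp_c=150.0)
candidate_a = Ratings(max_voltage_v=45.0, max_current_a=0.5, max_junction_temp_c=150.0)
candidate_b = Ratings(max_voltage_v=30.0, max_current_a=0.8, max_junction_temp_c=175.0)

print(is_acceptable_analogue(original, candidate_a))  # True
print(is_acceptable_analogue(original, candidate_b))  # False: voltage rating too low
```

In practice the list of critical parameters depends on the specific circuit, so a check of this kind would only be a first filter before consulting the component documentation.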
A block diagram of the device that implements the method of localizing a faulty element when predicting failures in control objects (OC) built on combinational elements is shown in Fig. 1.
Figure 1. Structural diagram of the device for localizing a faulty element when predicting OC failures on combinational elements.
The control circuit synchronizes the test generator, the power supply, and the storage and comparison circuits. After the device is reset to zero, the first set of the minimum predictive test is fed to the input of the OC under test. At this point, the control circuit applies the rated voltage (Un=1) from the power source and prepares the storage circuit to receive information.
At the end of the test, the first set of the minimum predictive test continues to act on the OC under test. At the same time, the control circuit applies the threshold supply voltage Up from the power source, the input of the storage circuit is disconnected, and the comparison circuit is connected. After the check, the test generator is turned off, and the comparison circuit compares the information recorded in the storage circuit at Un=1 with the information from the output of the OC under test at Up=0.
If they differ, a “logical 1” is written to the processing and registration circuit; otherwise a “logical 0” is written. Let us explain the implementation of the forecasting method using the example of the simplest OC, consisting of four series-connected elements (Fig. 2). A pulse from the pulse generator is applied to the OC input (Fig. 2 a). Since at the initial time t1 the supply voltage is equal to the nominal value, the pulse passes through all the elements of the OC (Fig. 2 b).
These elements switch from one state to another, and a pulse appears at the OC output after $4\tau_d$ relative to the input (Fig. 2 c), where $\tau_d$ is the delay time of one element at the nominal supply voltage. When the pulse appears at the OC output, the supply voltage drops to Up and the OC input is disconnected from the pulse generator. The duration of the input pulse is automatically set to $4\tau_d$. If all the elements are in good order, the pulse duration at the OC output (Fig. 2 c) is $\tau_2 = 4\tau'_d$, where $\tau'_d$ is the delay time of one element at the threshold supply voltage ($\tau'_d \gg \tau_d$).
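The timing of this four-element example can be written out step by step. The following is a sketch under stated assumptions: the symbols $\tau_d$ (delay of one element at the nominal supply voltage), $\tau'_d$ (delay of one element at the threshold supply voltage Up) and the edge labels are notation introduced here for illustration, and the input pulse is assumed to start at $t_1$.

```latex
\begin{align*}
t_{\text{out,rise}} &= t_1 + 4\tau_d
  && \text{leading edge crosses four elements at the nominal voltage} \\
t_{\text{in,fall}}  &= t_1 + 4\tau_d
  && \text{input disconnected, supply voltage drops to } U_p \\
t_{\text{out,fall}} &= t_{\text{in,fall}} + 4\tau'_d
  && \text{trailing edge crosses four elements at } U_p \\
\tau_2 &= t_{\text{out,fall}} - t_{\text{out,rise}} = 4\tau'_d
  && \text{output pulse duration for four good elements}
\end{align*}
```

Under the same assumptions, an element whose delay at the threshold voltage exceeds $\tau'_d$ by some amount $\Delta$ would lengthen the output pulse to $4\tau'_d + \Delta$, so a deviation of $\tau_2$ from $4\tau'_d$ could serve as the predictive sign.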
Figure 2. Illustration of the forecasting principle.
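The store-and-compare cycle of the device in Fig. 1 can be sketched in a few lines. This is a simplified software model under assumptions, not the device itself: the function names, the toy three-input OC, and the behaviour of its degraded element at the threshold voltage are all hypothetical.

```python
def compare_responses(test_sets, run_at_nominal, run_at_threshold):
    """For each test set, store the OC response at the nominal voltage,
    compare it with the response at the threshold voltage, and register
    1 if they differ or 0 if they match."""
    register = []
    for test in test_sets:
        stored = run_at_nominal(test)        # storage circuit, rated voltage
        observed = run_at_threshold(test)    # OC output, threshold voltage
        register.append(1 if stored != observed else 0)
    return register


def oc_nominal(bits):
    """Hypothetical combinational OC: y = (a AND b) OR c."""
    a, b, c = bits
    return (a & b) | c


def oc_threshold(bits):
    """Same OC at the threshold voltage, assuming its AND element is
    degraded and no longer switches to 1."""
    a, b, c = bits
    weak_and = 0
    return weak_and | c


tests = [(1, 1, 0), (0, 0, 1), (1, 0, 0)]
flags = compare_responses(tests, oc_nominal, oc_threshold)
print(flags)  # [1, 0, 0]
```

Only the test set that exercises the assumed degraded element produces a “logical 1”, which illustrates how the registered pattern can point toward the faulty element.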
Thus, diagnosing the main semiconductor devices also makes it possible to check the passive elements that set the operating modes of the active components. A defect caused by the failure of passive elements can therefore be localized and eliminated before it causes a node built on active devices to fail.
A step-by-step check of stages or nodes by the method considered here can serve as one of the most effective ways of diagnosing electronic equipment: it makes it possible to detect defects that were missed during general diagnostics and to verify that equipment blocks were replaced correctly.