Technical Activity Committee on Silent Data Corruption
Yervant Zorian, Synopsys, Yervant.Zorian@synopsys.com
Harish Dixit, Meta; Dimitris Gizopoulos, University of Athens
Recent publications from hyperscalers such as Meta and Google have highlighted Silent Data Corruption (SDC) as a prominent challenge that surfaces in modern data centers as functional errors after years of operation. What can we do at design time to combat the complexity and cost of this problem? How do we analyze the root cause? Do we build new designs to react to SDC after it happens, detecting and potentially correcting faults? Or is it better to predict degrading faults in advance, so that predictive maintenance can intervene before SDC occurs? Which metrics should be used to meet SDC requirements in terms of quality and RAS?
This TAC will highlight the different efforts underway, raise public awareness, and gather insights from experts representing different perspectives: hyperscalers, semiconductor suppliers, IP vendors, and the research community. These goals can be achieved through special session presentations at TTTC conferences, special issues of relevant publications, and standardization of metrics as necessary.