Student research: Yufu Zhang


Yufu Zhang, a Ph.D. student in the Department of Electrical and Computer Engineering advised by Assistant Professor Ankur Srivastava (ECE/ISR), conducts research that will improve heat distribution on computer chips, increasing their reliability and longevity.

With the ever-increasing density and operating frequency of today's nanometer integrated circuits, the operating temperature of a typical chip can easily rise to 150 degrees Celsius (302 degrees Fahrenheit). This leads to highly unreliable and error-prone chip behavior. The situation is made even worse by the variation and randomness introduced by the manufacturing process, as well as by the unpredictable computing workload scheduled on the chip. Together, these cause an uneven and hard-to-predict heat distribution on the chip.

To address such thermal problems, the first question that naturally arises is how we can accurately estimate the thermal profile of the chip using runtime information.

Srivastava and Zhang's strategy for thermal data mining is to use on-chip temperature sensors, which provide runtime temperature information. However, the location and number of such thermal sensors are highly constrained by area and power considerations. Moreover, sensors at such nanoscale dimensions are noisy and error-prone. The remaining problem is how to effectively estimate the temperature of the entire chip given only a few sensors.

Zhang first looked at the Poisson equation, which governs the relationship between the heat distribution and the power dissipation of the chip, to obtain an accurate model of the underlying thermal problem. From there he uses statistical strategies to explore the correlations between the power dissipation of different chip modules. The idea is that, because of these correlations, the temperatures at certain sensor locations can be used to obtain thermal information at other locations as well. Thus, when given only a few thermal sensor readings, Zhang can combine them with the power correlation information to effectively construct the thermal profile for the entire chip. In experiments, this methodology has demonstrated high accuracy compared with other estimation schemes.
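To make the idea concrete, the following is a minimal numerical sketch, not the researchers' actual implementation. It assumes a one-dimensional chip divided into a handful of cells, a discretized steady-state heat (Poisson-type) equation, Gaussian power statistics, and a standard linear minimum-mean-square-error estimator. The function names, conductance values, and power covariance are all hypothetical choices made for illustration.

```python
import numpy as np

def thermal_resistance_matrix(n, g_lateral=1.0, g_ambient=0.1):
    """Discretize a 1-D steady-state heat equation as G @ T = P.

    T is the temperature rise above ambient and P the power per cell.
    The conductance values are illustrative assumptions.
    Returns R = inv(G), so that T = R @ P.
    """
    G = np.zeros((n, n))
    for i in range(n):
        G[i, i] = g_ambient              # heat path from cell i to ambient
        for j in (i - 1, i + 1):         # heat paths to neighbouring cells
            if 0 <= j < n:
                G[i, i] += g_lateral
                G[i, j] -= g_lateral
    return np.linalg.inv(G)

def estimate_profile(R, mean_power, cov_power, sensor_idx, readings, noise_var):
    """Estimate the full temperature profile from a few noisy sensors.

    Power is modelled as Gaussian with the given mean and covariance
    (an assumption), so temperature is Gaussian too, and the estimate
    is the conditional mean given the sensor readings.
    """
    mean_temp = R @ mean_power
    cov_temp = R @ cov_power @ R.T       # power correlations carried into temperature
    k = len(sensor_idx)
    H = np.zeros((k, R.shape[0]))        # selects the cells that have sensors
    H[np.arange(k), sensor_idx] = 1.0
    S = H @ cov_temp @ H.T + noise_var * np.eye(k)
    innovation = np.asarray(readings) - H @ mean_temp
    return mean_temp + cov_temp @ H.T @ np.linalg.solve(S, innovation)

# Example: 8 cells, sensors on only two of them.
n = 8
cells = np.arange(n)
R = thermal_resistance_matrix(n)
mean_power = np.ones(n)
cov_power = 0.5 ** np.abs(cells[:, None] - cells[None, :])  # nearby cells correlate
true_power = mean_power + np.array([0.5, 0.4, 0.1, -0.2, -0.3, 0.0, 0.3, 0.6])
true_temp = R @ true_power
sensors = [1, 6]
estimate = estimate_profile(R, mean_power, cov_power, sensors,
                            true_temp[sensors], noise_var=0.01)
print(np.round(true_temp, 2))
print(np.round(estimate, 2))
```

In this sketch the sensor readings correct the expected temperature at every cell, including cells with no sensor, because the assumed power correlations link them together. The `noise_var` parameter controls how much the estimator trusts the noisy sensors relative to the statistical model.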

Published April 3, 2009