Authors: Han Hoang, Joshua Li
Mentors: Lindsey Kostas, Dhiman Sengupta
Chip design often relies on place-and-route (PnR) tools, which iteratively place millions of components (such as logic gates) on a chip and route the wires that connect them. While these traditional flows work, they can be time-consuming and labor-intensive, since each design iteration must carefully balance factors like performance, power consumption, and physical layout constraints.
A more efficient alternative is data-driven optimization, where machine learning models predict possible bottlenecks—such as congestion—early in the design cycle. With congestion insights in hand, designers can fine-tune component placement and wiring to reduce wasted resources, speeding up the entire process. A netlist helps in this task by modeling the circuit as a hypergraph: nodes represent components (like logic gates), and hyperedges capture their electrical connections.
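To make the hypergraph view concrete, here is a minimal sketch (illustrative only, not the actual DE-HNN data format) of how a toy netlist could be encoded as (cell, net) incidence pairs and a node-by-net incidence matrix:

```python
import torch

# Toy netlist (illustrative only): 4 cells and 2 nets.
# Net 0 connects cells {0, 1, 2}; net 1 connects cells {2, 3}.
node_ids = torch.tensor([0, 1, 2, 2, 3])   # cell index of each incidence
net_ids = torch.tensor([0, 0, 0, 1, 1])    # net index of each incidence

# Dense incidence matrix H (num_cells x num_nets): H[i, j] = 1 if cell i is on net j.
num_cells, num_nets = 4, 2
H = torch.zeros(num_cells, num_nets)
H[node_ids, net_ids] = 1.0
print(H)
```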
DE-HNN (Directional Equivariant Hypergraph Neural Network) [1] is a leading approach for learning from this netlist structure. By using hierarchical virtual nodes to capture both local and long-range interactions in the graph, DE-HNN excels at predicting congestion, or "demand." However, it requires substantial compute power and memory, which can limit its practical use.
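As a simplified illustration of the virtual-node idea (a sketch, not DE-HNN's actual construction), the snippet below adds one virtual node per graph partition plus a single top-level virtual node, and returns the extra edges connecting them; the partition assignment is assumed to be given:

```python
import torch

def add_hierarchical_virtual_nodes(num_nodes: int, part_assignment: torch.Tensor) -> torch.Tensor:
    """Return (2, E) edges linking real nodes to per-partition virtual nodes,
    and those virtual nodes to one top-level virtual node."""
    num_parts = int(part_assignment.max()) + 1
    part_vn = torch.arange(num_nodes, num_nodes + num_parts)   # level-1 virtual node ids
    top_vn = num_nodes + num_parts                              # level-2 virtual node id

    # Each real node connects to the virtual node of its partition.
    src = torch.arange(num_nodes)
    dst = part_vn[part_assignment]

    # Each partition-level virtual node connects to the top-level virtual node.
    src = torch.cat([src, part_vn])
    dst = torch.cat([dst, torch.full((num_parts,), top_vn)])
    return torch.stack([src, dst])

# Toy example: 6 nodes split into 2 partitions (the partitioning itself is assumed given).
parts = torch.tensor([0, 0, 0, 1, 1, 1])
print(add_hierarchical_virtual_nodes(6, parts))
```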
Our project focuses on optimizing DE-HNN’s training cost—reducing runtime or memory needs—while preserving, as much as possible, the model’s predictive accuracy on congestion. This balance between efficiency and performance is crucial for making advanced ML-based chip design viable in real-world production flows.
All experiments ran on UC San Diego’s DSMLP cloud system with an NVIDIA RTX A5000 GPU (24 GB VRAM), using PyTorch 2.2.2 and CUDA 12.2. This setup provided enough computational power and memory to handle large netlists and iterative optimization runs.
We worked with six netlists that vary widely in size, ranging from about 460k to 920k nodes, each with a correspondingly large number of nets and edges. The first five netlists were used for training, while the sixth served as our validation and test set.
Below is a summary of the main features used in our model:
Our model predicts demand (an indicator of congestion) by minimizing Mean Squared Error (MSE) on the training set. We monitor the validation loss to detect overfitting. Final results are reported on a dedicated test portion of netlist 6 to assess how well the model generalizes.
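A minimal sketch of this training setup is below; `model`, `train_loader`, `val_loader`, and the `demand` target field are placeholders for our DE-HNN model and netlist data, and the optimizer settings are assumptions rather than our exact configuration:

```python
import torch
import torch.nn.functional as F

# model, train_loader, and val_loader are placeholders for the DE-HNN model and
# the netlist data loaders; batch.demand is the node-level demand target.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # illustrative settings
num_epochs = 50                                              # illustrative cap

for epoch in range(num_epochs):
    model.train()
    for batch in train_loader:
        optimizer.zero_grad()
        loss = F.mse_loss(model(batch), batch.demand)   # training objective: MSE on demand
        loss.backward()
        optimizer.step()

    # Track validation MSE each epoch to watch for overfitting.
    model.eval()
    with torch.no_grad():
        val_loss = sum(F.mse_loss(model(b), b.demand).item() for b in val_loader) / len(val_loader)
    print(f"epoch {epoch}: validation MSE = {val_loss:.4f}")
```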
We adopted an iterative optimization approach to make our DE-HNN model more resource-efficient while preserving accuracy. Each stage in this process addresses a different factor that could inflate training time or cause overfitting.
Figure 1: Initial loss curves for the baseline model. The training loss keeps decreasing, while the validation loss stabilizes and eventually fluctuates, indicating potential overfitting.
The pipeline consists of three stages:
- Early Stopping (ES)
- Architecture Adjustments (AA)
- Dynamic Learning Rate (DLR)
Overall, each stage in this optimization pipeline—Early Stopping, Architecture Adjustments, and a Dynamic Learning Rate—targets a different facet of model complexity and convergence. By layering them together, we significantly cut down on training time and resource usage, while preserving DE-HNN’s strong performance in predicting IC congestion.
Our baseline DE-HNN uses:
This baseline achieves:
These results serve as a reference. Our optimizations aim to significantly reduce memory usage and runtime, ideally without sacrificing accuracy.
We tested three main optimization steps—Early Stopping, Architecture Adjustments, and a Dynamic Learning Rate (DLR)—to lower DE-HNN’s training time and memory usage while keeping its predictive performance as high as possible.
Early Stopping halts training once validation loss stops improving within a specified tolerance, preventing unnecessary epochs where the model may overfit.
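The logic is a simple patience check on the validation loss. The sketch below illustrates it; the patience and tolerance values are chosen for illustration rather than taken from our actual runs, and `train_one_epoch_and_validate` is a placeholder for one DE-HNN training epoch:

```python
# Patience/tolerance values here are illustrative, not our exact settings;
# train_one_epoch_and_validate is a placeholder for one DE-HNN training epoch.
best_val = float("inf")
epochs_without_improvement = 0
patience, tolerance = 5, 1e-4
max_epochs = 100

for epoch in range(max_epochs):
    val_loss = train_one_epoch_and_validate()
    if val_loss < best_val - tolerance:          # counts as a meaningful improvement
        best_val = val_loss
        epochs_without_improvement = 0
        # save_checkpoint(model)                 # keep the best weights so far
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Early stop at epoch {epoch}: no improvement for {patience} epochs")
            break
```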
Figure 2: Early Stopping applied to the baseline model.
In short, the baseline model was over-training well past epoch 15, and Early Stopping helped us achieve better validation performance in less time.
We systematically tested different DE-HNN configurations (2, 3, or 4 layers) paired with various embedding dimensions (8, 16, or 32). Early Stopping was applied to avoid training beyond the point of overfitting. Our goal was to find a balance between performance (Node/Net MSE) and resource savings (training time, memory usage).
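The sweep itself is a plain grid search over the two hyperparameters; a sketch is below, where `train_with_early_stopping` is a hypothetical helper standing in for our training pipeline:

```python
import itertools

# train_with_early_stopping is a hypothetical helper that trains one DE-HNN
# variant (with Early Stopping) and returns its validation metrics and cost.
layer_choices = [2, 3, 4]
dim_choices = [8, 16, 32]

results = {}
for num_layers, emb_dim in itertools.product(layer_choices, dim_choices):
    node_mse, net_mse, train_time, peak_mem = train_with_early_stopping(
        num_layers=num_layers, emb_dim=emb_dim
    )
    results[(num_layers, emb_dim)] = {
        "node_mse": node_mse, "net_mse": net_mse,
        "train_time": train_time, "peak_mem": peak_mem,
    }

# One possible selection rule: rank configurations by node-level MSE.
best_config = min(results, key=lambda cfg: results[cfg]["node_mse"])
```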
Below are heatmaps showing the average percentage reduction in training cost and in model performance:
Figure 3: Average % reduction in training cost.
Figure 4: Average % reduction in model performance.
Figure 5: Average % reduction across computational cost and model performance.
We selected the model with 4 layers and an embedding dimension of 8 as our grid-search optimum. The loss curves for this model are shown below:
Figure 6: Validation loss remains stable, indicating less overfitting.
In summary:
Lastly, for the Dynamic Learning Rate stage we introduced a Cyclical Learning Rate (CLR) [2] that oscillates between minimum and maximum bounds in each cycle, helping the model escape local minima and converge faster.
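In practice this can be set up with PyTorch's built-in `CyclicLR` scheduler; the bounds and step size below are illustrative rather than our exact settings, and `model` is a placeholder for the DE-HNN network:

```python
import torch

# The lr bounds and step size are illustrative, not our exact settings;
# model is a placeholder for the DE-HNN network.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer,
    base_lr=1e-4,           # lower bound of each cycle
    max_lr=1e-2,            # upper bound of each cycle
    step_size_up=200,       # optimizer steps in the increasing half of a cycle
    mode="triangular",
    cycle_momentum=False,   # required here because Adam has no momentum parameter
)

# Inside the training loop:
#   loss.backward(); optimizer.step(); scheduler.step()
```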
Figure 7: CLR further reduces the required epochs, though Net MSE sees a trade-off.
Despite the slight Net MSE compromise, the DLR significantly shortened training, making it a valuable option when faster convergence matters and node-level demand accuracy is the priority.
By stacking these optimizations, our final optimized model achieves the following results:
These results confirm that DE-HNN can be made significantly more practical by controlling overfitting (ES), reducing model complexity (AA), and speeding up convergence (DLR)—all while maintaining strong predictive accuracy for node demand.
Through a series of iterative optimizations, we transformed DE-HNN from a high-performing but computationally intensive model into a far more efficient and nearly equally accurate tool for predicting IC congestion:
These results highlight the trade-off between performance and computational resources in real-world applications. By systematically combining Early Stopping, Architecture Adjustments, and a Dynamic Learning Rate, we’ve shown that DE-HNN can be scaled down to meet resource constraints while retaining most of its predictive power. This lays the groundwork for faster, more cost-effective congestion prediction in IC design.
[1] Luo, Zhishang, Truong Son Hy, Puoya Tabaghi, Donghyeon Koh, Michael Defferrard, Elahe Rezaei, Ryan Carey, Rhett Davis, Rajeev Jain, and Yusu Wang. (2024). DE-HNN: An effective neural model for Circuit Netlist representation. arXiv preprint, arXiv:2404.00477.
[2] Smith, Leslie N. (2015). No More Pesky Learning Rate Guessing Games. arXiv preprint, arXiv:1506.01186.