CloudCast: Multi-Cloud Data Transfer
Optimize data broadcast across cloud regions with heterogeneous pricing and bandwidth. The goal is to minimize total egress cost while meeting transfer requirements.
Problem Description
Given:
- A source region with data to broadcast
- Multiple destination regions that need the data
- Network graph with bandwidth and cost per GB between regions
- Data partitioned into chunks
Initial Program
Evaluator
The evaluator validates routing correctness and simulates the transfer to compute cost:
Running the Example
What to optimize: Evolution should discover relay strategies (e.g., routing through intermediate regions with cheaper egress) instead of direct point-to-point transfers.
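To illustrate why relaying can beat direct transfers, here is a minimal sketch that compares the two on a toy graph. The region names, the prices in `COST_PER_GB`, and the `transfer_cost` helper are all hypothetical and are not taken from the benchmark's data or evaluator. The sketch also ignores bandwidth and chunking and assumes each edge carries the full payload once.

```python
# Hypothetical per-GB egress prices (USD) between regions; not benchmark data.
COST_PER_GB = {
    ("src", "a"): 0.09,
    ("src", "b"): 0.09,
    ("src", "relay"): 0.02,
    ("relay", "a"): 0.02,
    ("relay", "b"): 0.02,
}


def transfer_cost(edges, size_gb):
    """Total egress cost when every listed edge carries the full payload once."""
    return sum(COST_PER_GB[edge] for edge in edges) * size_gb


# Direct broadcast: the source pays expensive egress once per destination.
direct = [("src", "a"), ("src", "b")]
# Relayed broadcast: one cheap hop to a relay, which then fans out.
relayed = [("src", "relay"), ("relay", "a"), ("relay", "b")]

direct_cost = transfer_cost(direct, size_gb=100)
relay_cost = transfer_cost(relayed, size_gb=100)
print(f"direct: ${direct_cost:.2f}, relayed: ${relay_cost:.2f}")
```

With these made-up prices the relayed plan uses three edges instead of two yet costs less, because every hop it uses is cheap. A real solution would also have to respect the bandwidth limits and chunk assignments that the evaluator checks.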
Expert Parallelism Load Balancer (EPLB)
Balance load across GPUs in Mixture-of-Experts (MoE) model inference by deciding expert replication and placement.
Problem Description
Given:
- Load statistics for each logical expert across layers
- GPU cluster topology (nodes, GPUs per node)
- Number of expert groups
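As a rough illustration of what "replication and placement" can mean, the sketch below handles a single layer greedily. It gives spare slots to the hottest experts, then packs replicas onto the least-loaded GPU. The function name, its input format, and the `slots_per_gpu` parameter are assumptions made for this example. It ignores node topology and expert groups, and it is not the benchmark's initial program.

```python
import heapq


def replicate_and_place(loads, num_gpus, slots_per_gpu):
    """Greedy sketch: replicate hot experts, then pack replicas onto GPUs.

    loads: per-expert load for one layer (hypothetical input format).
    Returns (replicas, placement, gpu_load); placement[g] lists expert ids on GPU g.
    """
    total_slots = num_gpus * slots_per_gpu
    assert total_slots >= len(loads), "need at least one slot per expert"

    # Every expert gets one replica; each spare slot goes to the expert
    # with the highest load per replica at that moment.
    replicas = [1] * len(loads)
    for _ in range(total_slots - len(loads)):
        hottest = max(range(len(loads)), key=lambda e: loads[e] / replicas[e])
        replicas[hottest] += 1

    # One (load share, expert id) entry per replica, heaviest first.
    items = sorted(
        ((loads[e] / replicas[e], e)
         for e in range(len(loads))
         for _ in range(replicas[e])),
        reverse=True,
    )

    # Min-heap of (current load, gpu id); full GPUs are not pushed back.
    heap = [(0.0, g) for g in range(num_gpus)]
    placement = [[] for _ in range(num_gpus)]
    gpu_load = [0.0] * num_gpus
    for share, expert in items:
        load, g = heapq.heappop(heap)
        placement[g].append(expert)
        gpu_load[g] = load + share
        if len(placement[g]) < slots_per_gpu:
            heapq.heappush(heap, (gpu_load[g], g))
    return replicas, placement, gpu_load


replicas, placement, gpu_load = replicate_and_place(
    loads=[90, 30, 20, 10], num_gpus=2, slots_per_gpu=3
)
print(replicas, placement, gpu_load)
```

This sketch may put two replicas of the same expert on one GPU, which wastes a slot. Avoiding that, and accounting for cross-node traffic, are the kinds of refinements a better heuristic would add.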
Initial Program
Evaluator
Running the Example
All Systems Benchmarks
Model Placement (Prism)
Path: benchmarks/ADRS/prism/
Assign LLM models to a GPU cluster to minimize worst-case KV-cache pressure. Each GPU has 80GB memory. Lower pressure = more headroom for serving.
Metric: Minimize max(pressure_ratio) across all GPUs
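The following sketch shows one way a min-max placement objective like this can be attacked greedily. The page does not define `pressure_ratio`, so the formula in `pressure` (total request load divided by the memory left after model weights) is an assumption made purely for illustration. The model sizes and loads are made up as well. Only the 80 GB capacity comes from the description above.

```python
GPU_MEM_GB = 80  # per-GPU memory, from the benchmark description


def pressure(models):
    """Assumed pressure ratio: total load / memory left after weights.

    models: list of (weights_gb, load) tuples placed on one GPU.
    """
    free = GPU_MEM_GB - sum(size for size, _ in models)
    if free <= 0:
        return float("inf")  # weights alone do not fit
    return sum(load for _, load in models) / free


def greedy_place(models, num_gpus):
    """Place heavy-load models first, each on the GPU that ends up least pressured."""
    gpus = [[] for _ in range(num_gpus)]
    for model in sorted(models, key=lambda m: m[1], reverse=True):
        best = min(range(num_gpus), key=lambda g: pressure(gpus[g] + [model]))
        gpus[best].append(model)
    return gpus


# Hypothetical (weights_gb, load) pairs; not benchmark data.
models = [(40, 8.0), (20, 6.0), (20, 4.0), (10, 2.0)]
placement = greedy_place(models, num_gpus=2)
worst = max(pressure(gpu) for gpu in placement)
print(placement, round(worst, 3))
```

The value to minimize is `worst`, the maximum over GPUs, so a placement is only as good as its most pressured GPU. Check the evaluator in the benchmark directory for the actual pressure definition before relying on this formula.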
LLM-SQL Column Reordering
Path: benchmarks/ADRS/llm_sql/
Reorder table columns to maximize prefix-cache hit rates when serializing rows into LLM prompts. Consecutive rows sharing leading column values can reuse cached prefixes.
Metric: Maximize prefix cache hit rate
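To make the prefix-reuse idea concrete, the sketch below scores a column order by how many leading fields each row shares with the previous row, then tries a simple "low-cardinality columns first" heuristic. The field-level scoring in `prefix_hit_rate`, the heuristic, and the sample rows are assumptions for illustration. The real evaluator may measure hits differently, for example at the token level.

```python
def prefix_hit_rate(rows, order):
    """Fraction of fields that extend a prefix shared with the previous row."""
    hits = 0
    for prev, cur in zip(rows, rows[1:]):
        for col in order:
            if prev[col] != cur[col]:
                break  # the shared prefix ends at the first differing field
            hits += 1
    total = len(order) * (len(rows) - 1)
    return hits / total if total else 0.0


def low_cardinality_first(rows, columns):
    """Heuristic: columns with few distinct values go first (stable sort)."""
    return sorted(columns, key=lambda c: len({row[c] for row in rows}))


# Hypothetical table; not benchmark data.
rows = [
    {"id": 1, "country": "US", "plan": "pro"},
    {"id": 2, "country": "US", "plan": "pro"},
    {"id": 3, "country": "US", "plan": "free"},
    {"id": 4, "country": "DE", "plan": "free"},
]
original = ["id", "country", "plan"]
reordered = low_cardinality_first(rows, original)

print(reordered)
print(prefix_hit_rate(rows, original), prefix_hit_rate(rows, reordered))
```

Leading with the unique `id` column means no two consecutive rows share a prefix, while moving it last lets the repeated `country` and `plan` values be reused.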
Transaction Scheduling
Path: benchmarks/ADRS/txn_scheduling/
Schedule database transactions with read/write dependencies to minimize total makespan while respecting conflict constraints.
Metric: Minimize completion time
Key Concepts
System Constraints
Solutions must respect physical limits: bandwidth, memory, dependencies
Multi-Objective
Solutions often trade off cost, latency, throughput, and fairness
Simulation
Evaluators simulate system behavior rather than deploy real infrastructure
Real Workloads
Use traces from production systems for realistic evaluation
Installation
Systems benchmarks require additional dependencies:
Tips for Systems Benchmarks
Start with Baselines
The initial programs implement simple strategies (greedy, shortest path). Run them first to establish a reference score, then let evolution search for better heuristics.
Check Validation
Systems evaluators have strict validation. Review evaluator code to understand what makes a solution valid.
Next Steps
Math Examples
Explore math benchmarks
Algorithm Examples
See competitive programming
Create Custom
Build your own benchmark