SkyDiscover’s architecture lets you plug in custom search algorithms without rewriting the generate-evaluate loop. You only implement what changes—everything else is inherited.

Two Levels of Customization

There are two levels depending on what you need:

  • Database Only: customize add() and sample() for different parent selection or storage logic.
  • Database + Controller: override run_discovery() for cross-iteration behavior such as stagnation response or acceptance gating.

Level 1: Database Only

Subclass ProgramDatabase and implement two abstract methods. The default controller runs the loop unchanged.

Complete Implementation Example

Here’s the full implementation of TopK search (56 lines):
skydiscover/search/topk/database.py
import logging
from typing import List, Optional, Tuple

from skydiscover.config import DatabaseConfig
from skydiscover.search.base_database import Program, ProgramDatabase

logger = logging.getLogger(__name__)

class TopKDatabase(ProgramDatabase):
    """Database for top-k programs"""

    def __init__(self, name: str, config: DatabaseConfig):
        super().__init__(name, config)
        self.initial_program = None

    def add(self, program: Program, iteration: Optional[int] = None, **kwargs) -> str:
        """Add a program to the database."""
        # Store the initial program
        if iteration == 0 or program.iteration_found == 0:
            self.initial_program = program

        # Store the program
        self.programs[program.id] = program

        # Track last iteration
        if iteration is not None:
            self.last_iteration = max(self.last_iteration, iteration)

        # Save to disk if configured
        if self.config.db_path:
            self._save_program(program)

        # Update best program tracking (required)
        self._update_best_program(program)

        logger.debug(f"Added program {program.id} to top-k database")
        return program.id

    def sample(
        self, num_context_programs: Optional[int] = 4, **kwargs
    ) -> Tuple[Program, List[Program]]:
        """
        Sample a program and context programs for the next discovery step.

        Top-K sampling strategy:
        - Parent: Top 1 program (best program)
        - Context programs: Next K programs (ranks 2 to K+1)
        """
        if not self.programs:
            raise ValueError("Cannot sample: no programs in database")

        # Get top (K+1) programs: top 1 for parent, next K for context
        total_needed = num_context_programs + 1
        top_programs = self.get_top_programs(total_needed)

        if len(top_programs) < 2:
            parent = top_programs[0]
            context_programs = [top_programs[0]]
        else:
            parent = top_programs[0]
            context_programs = top_programs[1:min(len(top_programs), num_context_programs + 1)]

        return parent, context_programs

Key Points

In add(), a database must:
  1. Store the program: self.programs[program.id] = program
  2. Update iteration tracking: self.last_iteration = max(self.last_iteration, iteration)
  3. Persist to disk: self._save_program(program) (if config.db_path is set)
  4. Call self._update_best_program(program), which is required for tracking the global best
sample() returns (parent, context_programs) where:
  • parent is a single Program to mutate
  • context_programs is a List[Program] shown as examples to the LLM
Optionally, sample() can return dict-wrapped results to attach metadata:
  • ({"island_3": parent}, {"top_performers": context_programs})
ProgramDatabase provides these inherited attributes and helpers:
  • self.programs: dict[str, Program] storing all programs
  • self._update_best_program(program): updates the global best (call in add())
  • self._save_program(program): persists to disk
  • self.get_top_programs(n): returns the top N by score
  • self.get_best_program(): returns the highest-scoring program
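
To illustrate a parent-selection policy other than top-k, the sketch below implements tournament selection as a standalone function. MiniProgram and tournament_sample are hypothetical stand-ins rather than SkyDiscover names; in a real subclass the same logic would sit inside sample() and read from self.programs.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class MiniProgram:
    """Hypothetical stand-in for Program with only the fields used here."""
    id: str
    metrics: Dict[str, Any] = field(default_factory=dict)


def score(program: MiniProgram) -> float:
    # Programs without a combined_score sort last.
    return program.metrics.get("combined_score", float("-inf"))


def tournament_sample(
    programs: Dict[str, MiniProgram],
    num_context_programs: int = 4,
    tournament_size: int = 3,
    rng: Optional[random.Random] = None,
) -> Tuple[MiniProgram, List[MiniProgram]]:
    """Pick the parent as the best of a random subset; context is the top of the rest."""
    if not programs:
        raise ValueError("Cannot sample: no programs in database")
    rng = rng or random.Random()
    pool = list(programs.values())

    # Parent: best program among a small random subset of the pool.
    contestants = rng.sample(pool, min(tournament_size, len(pool)))
    parent = max(contestants, key=score)

    # Context: highest-scoring programs other than the parent.
    others = sorted((p for p in pool if p.id != parent.id), key=score, reverse=True)
    return parent, others[:num_context_programs]
```

A smaller tournament_size makes selection more exploratory, while a tournament over the whole pool reduces to always picking the best program, as TopK does.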

Program Dataclass

Every program has these fields:
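import time
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Note: the default_factory values below are assumed so that the listing is
# valid Python (fields after a defaulted field need defaults of their own).
# Check base_database.py for the actual defaults.
@dataclass
class Program:
    # Identification
    id: str                          # UUID
    solution: str                    # Source code or prompt text
    language: str = "python"

    # Performance
    metrics: Dict[str, Any] = field(default_factory=dict)    # Evaluation results (includes combined_score)

    # Lineage
    parent_id: Optional[str] = None  # ID of parent program
    iteration_found: int = 0         # Iteration that produced this

    # Metadata
    metadata: Dict[str, Any] = field(default_factory=dict)   # Arbitrary extra data
    artifacts: Dict[str, Any] = field(default_factory=dict)  # Evaluation artifacts
    timestamp: float = field(default_factory=time.time)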
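
To make the lineage fields concrete, here is a small sketch that derives a child record from a parent. ProgramSketch mirrors only a subset of the fields above, and make_child is a hypothetical helper; neither is part of SkyDiscover.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ProgramSketch:
    """Hypothetical, reduced mirror of the Program fields listed above."""
    id: str
    solution: str
    language: str = "python"
    metrics: Dict[str, Any] = field(default_factory=dict)
    parent_id: Optional[str] = None
    iteration_found: int = 0
    timestamp: float = field(default_factory=time.time)


def make_child(
    parent: ProgramSketch, solution: str, iteration: int, metrics: Dict[str, Any]
) -> ProgramSketch:
    """Create a child that records which program and iteration produced it."""
    return ProgramSketch(
        id=str(uuid.uuid4()),
        solution=solution,
        language=parent.language,
        metrics=dict(metrics),  # copy so later edits do not leak between programs
        parent_id=parent.id,
        iteration_found=iteration,
    )


seed = ProgramSketch(id=str(uuid.uuid4()), solution="def f(x): return x")
child = make_child(
    seed, "def f(x): return 2 * x", iteration=1, metrics={"combined_score": 0.7}
)
```

Following parent_id from any program back to the seed reconstructs the chain of mutations that led to it.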

Registration

Add your algorithm to skydiscover/search/route.py:
route.py
from skydiscover.search.my_algo.database import MyDatabase
from skydiscover.search.registry import register_database

register_database("my_algo", MyDatabase)
That’s all. Now --search my_algo works:
skydiscover-run initial.py evaluator.py --search my_algo -i 100
Simple algorithms at this level: topk/ (56 lines), best_of_n/ (85 lines), beam_search/ (527 lines)
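
The internals of the registry are not shown on this page. As a rough mental model only, a name-to-class registry can be as small as the sketch below. _REGISTRY, register, create, and DummyDatabase are hypothetical names and do not describe the actual skydiscover.search.registry implementation.

```python
from typing import Any, Dict

# Hypothetical registry: maps a --search name to the class that implements it.
_REGISTRY: Dict[str, type] = {}


def register(name: str, cls: type) -> None:
    """Associate a search name with a class, rejecting duplicates."""
    if name in _REGISTRY:
        raise ValueError(f"'{name}' is already registered")
    _REGISTRY[name] = cls


def create(name: str, *args: Any, **kwargs: Any) -> Any:
    """Instantiate the class registered under `name`."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        known = sorted(_REGISTRY)
        raise KeyError(f"Unknown search type '{name}'. Known: {known}") from None
    return cls(*args, **kwargs)


class DummyDatabase:
    """Stand-in for a ProgramDatabase subclass."""

    def __init__(self, name: str):
        self.name = name


register("my_algo", DummyDatabase)
db = create("my_algo", "my_algo")
```

The point of this pattern is that the command-line layer only ever deals with strings, so adding an algorithm never requires touching the code that parses --search.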

Level 2: Database + Controller

Use this when you need behavior that spans across iterations:
  • Tracking improvement history
  • Reacting to stagnation
  • Filtering results before they enter the population
  • Multi-island search with migration
  • Acceptance gating
The key point: you do not rewrite generate-evaluate logic. You call _run_iteration(), which runs the full sample → prompt → LLM → evaluate cycle, then decide what to do with the result.

Controller Template

from skydiscover.search.default_discovery_controller import (
    DiscoveryController,
    DiscoveryControllerInput
)

class MyController(DiscoveryController):

    def __init__(self, controller_input: DiscoveryControllerInput):
        super().__init__(controller_input)
        # Custom initialization here
        self.stagnation_count = 0

    async def run_discovery(
        self,
        start_iteration: int,
        max_iterations: int,
        checkpoint_callback=None,
        post_process_result=True,
        retry_times=3,
    ):
        """
        Custom discovery loop with cross-iteration logic.
        """
        for iteration in range(start_iteration, start_iteration + max_iterations):
            if self.shutdown_event.is_set():
                break

            # Run full generate-evaluate cycle
            result = await self._run_iteration(iteration, retry_times=retry_times)

            if result.error:
                continue

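            # Optional: filter before storing
            if self._should_accept(result):
                self._process_iteration_result(
                    result,
                    iteration,
                    checkpoint_callback
                )
                self.stagnation_count = 0  # accepted: progress was made
            else:
                self.stagnation_count += 1  # rejected: no improvement this iteration

            # Cross-iteration logic
            if self._is_stagnating():
                self._trigger_diversification()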

        return self.database.get_best_program()

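    def _should_accept(self, result):
        """Custom acceptance criterion: keep a child only if it beats its parent."""
        child = result.child_program_dict
        if child is None:
            return False
        parent = self.database.get(result.parent_id)
        if parent is None:
            return True  # nothing to compare against
        child_score = child["metrics"].get("combined_score", float("-inf"))
        parent_score = parent.metrics.get("combined_score", float("-inf"))
        return child_score > parent_score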

    def _is_stagnating(self):
        """Detect stagnation."""
        return self.stagnation_count > 10

    def _trigger_diversification(self):
        """Custom diversification strategy."""
        self.stagnation_count = 0
        # ... your logic here
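
One way to back _is_stagnating() with real bookkeeping is a small tracker that counts consecutive iterations without a new best score. StagnationTracker, patience, and min_delta are hypothetical names used for illustration, not part of SkyDiscover.

```python
from typing import Optional


class StagnationTracker:
    """Counts consecutive iterations that fail to set a new best score."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience    # stale iterations tolerated before flagging
        self.min_delta = min_delta  # minimum gain that counts as an improvement
        self.best_score: Optional[float] = None
        self.stale_iterations = 0

    def update(self, score: float) -> bool:
        """Record a score and return True once patience is exceeded."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.stale_iterations = 0
        else:
            self.stale_iterations += 1
        return self.stale_iterations > self.patience

    def reset(self) -> None:
        """Clear the stale counter, for example after diversification."""
        self.stale_iterations = 0
```

In a controller, update() would be called with the child's combined_score after each successful iteration, and reset() from the diversification hook.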

Controller Primitives

  • _run_iteration(iteration) (async method): runs the full sample → prompt → LLM → evaluate cycle for one iteration and returns a SerializableResult.
  • _process_iteration_result(result, iteration, cb) (method): stores the result in the database, logs metrics, and triggers the checkpoint callback.
  • database.get_best_program() (method): returns the best program seen so far.
  • shutdown_event.is_set() (method): returns True when graceful shutdown is requested.
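
The interaction between the async iteration call and the shutdown check can be tried in isolation with the sketch below. fake_iteration is a hypothetical stand-in for _run_iteration(), and the event is assumed to be an asyncio.Event; this page does not say which event type SkyDiscover uses, though both asyncio.Event and threading.Event expose is_set().

```python
import asyncio
from typing import List


async def fake_iteration(iteration: int) -> dict:
    """Hypothetical stand-in for _run_iteration(): yields control, then returns."""
    await asyncio.sleep(0)
    return {"iteration": iteration, "error": None}


async def run_loop(
    shutdown_event: asyncio.Event, max_iterations: int, stop_after: int
) -> List[int]:
    completed: List[int] = []
    for iteration in range(max_iterations):
        # Check before starting new work so an in-flight iteration always finishes.
        if shutdown_event.is_set():
            break
        result = await fake_iteration(iteration)
        completed.append(result["iteration"])
        # Simulate an external shutdown request arriving mid-run.
        if len(completed) == stop_after:
            shutdown_event.set()
    return completed


async def main() -> List[int]:
    # Create the event inside the running loop to stay compatible with older Pythons.
    return await run_loop(asyncio.Event(), max_iterations=100, stop_after=3)


completed = asyncio.run(main())
```

Because the event is only checked at the top of the loop, shutdown is graceful: the current iteration completes and its result is kept.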

SerializableResult Fields

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SerializableResult:
    error: Optional[str] = None                # Error message if failed
    child_program_dict: Optional[dict] = None  # Generated program as dict
    parent_id: Optional[str] = None            # Parent program ID
    # Context program IDs (dataclasses reject a bare [] default, hence default_factory)
    other_context_ids: List[str] = field(default_factory=list)
    iteration_time: float = 0.0                # Execution time
    prompt: Optional[dict] = None              # System and user prompts
    llm_response: Optional[str] = None         # Raw LLM output
    iteration: int = 0                         # Iteration number
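
Since most fields are optional, controller code benefits from reading results defensively. The sketch below shows one way to extract a score. ResultSketch mirrors a subset of the fields above, and child_score is a hypothetical helper, not a SkyDiscover function.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ResultSketch:
    """Hypothetical, reduced mirror of SerializableResult."""
    error: Optional[str] = None
    child_program_dict: Optional[dict] = None
    parent_id: Optional[str] = None
    other_context_ids: List[str] = field(default_factory=list)


def child_score(result: ResultSketch, key: str = "combined_score") -> Optional[float]:
    """Return the child's score, or None if the iteration failed or was not scored."""
    if result.error or not result.child_program_dict:
        return None
    metrics = result.child_program_dict.get("metrics") or {}
    value = metrics.get(key)
    return float(value) if isinstance(value, (int, float)) else None


ok = ResultSketch(child_program_dict={"metrics": {"combined_score": 0.8}}, parent_id="p1")
failed = ResultSketch(error="evaluation timed out")
unscored = ResultSketch(child_program_dict={"solution": "pass"})
```

Returning None rather than raising lets an acceptance check treat failed and unscored iterations the same way.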

Registration

Register both database and controller:
route.py
from skydiscover.search.my_algo.database import MyDatabase
from skydiscover.search.my_algo.controller import MyController
from skydiscover.search.registry import register_database, register_controller

register_database("my_algo", MyDatabase)
register_controller("my_algo", MyController)
Complex algorithms at this level: adaevolve/ (multi-island UCB search), gepa_native/ (acceptance gating + merge), evox/ (co-evolves the search algorithm itself)

Custom Configuration

If your algorithm has custom settings, add a dataclass in skydiscover/config.py:
config.py
from dataclasses import dataclass

@dataclass
class MyDatabaseConfig(DatabaseConfig):
    my_param: float = 1.0
    population_size: int = 100
Add it to the config type mapping:
config.py
_DB_CONFIG_BY_TYPE = {
    "topk": DatabaseConfig,
    "beam_search": DatabaseConfig,
    "my_algo": MyDatabaseConfig,  # <-- Add this
}
Now users can configure it in config.yaml:
config.yaml
search:
  type: my_algo
  database:
    my_param: 2.0
    population_size: 200
Access in your database:
def __init__(self, name: str, config: MyDatabaseConfig):
    super().__init__(name, config)
    self.my_param = config.my_param
    self.population_size = config.population_size
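
How the YAML section reaches your dataclass is not shown on this page. A plausible sketch, assuming the loader looks up the class by search type and passes the database mapping as keyword arguments, is given below. BaseDbConfig, MyDbConfig, CONFIG_BY_TYPE, and build_db_config are hypothetical stand-ins for DatabaseConfig, MyDatabaseConfig, _DB_CONFIG_BY_TYPE, and the real loader.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional, Type


@dataclass
class BaseDbConfig:
    """Hypothetical stand-in for DatabaseConfig."""
    db_path: Optional[str] = None


@dataclass
class MyDbConfig(BaseDbConfig):
    my_param: float = 1.0
    population_size: int = 100


CONFIG_BY_TYPE: Dict[str, Type[BaseDbConfig]] = {
    "topk": BaseDbConfig,
    "my_algo": MyDbConfig,
}


def build_db_config(search_section: Dict[str, Any]) -> BaseDbConfig:
    """Turn the parsed `search:` section into the matching config dataclass."""
    cfg_cls = CONFIG_BY_TYPE.get(search_section.get("type"), BaseDbConfig)
    raw = search_section.get("database") or {}

    # Reject misspelled keys early instead of silently ignoring them.
    known = {f.name for f in fields(cfg_cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(
            f"Unknown database options for {cfg_cls.__name__}: {sorted(unknown)}"
        )
    return cfg_cls(**raw)


config = build_db_config(
    {"type": "my_algo", "database": {"my_param": 2.0, "population_size": 200}}
)
```

Under this assumption, forgetting the mapping entry would leave my_algo with the base config class, so custom keys such as my_param would not be recognized.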

Complete Reference

ProgramDatabase API

| Method | Abstract? | Purpose |
|---|---|---|
| add(program, iteration) | ✅ Yes | Store a scored program |
| sample(num_context_programs) | ✅ Yes | Select parent and context |
| save(path, iteration) | No | Checkpoint to disk |
| load(path) | No | Restore from checkpoint |
| _update_best_program(program) | No | Track best program (call from add) |
| get_best_program() | No | Return highest-scoring program |
| get_top_programs(n) | No | Return top N by score |
| get(program_id) | No | Retrieve by ID |
| log_status() | No | Log database summary |
| get_statistics() | No | Return stats dict for prompt context |

DiscoveryController API

| Method | Abstract? | Purpose |
|---|---|---|
| run_discovery(...) | ✅ Yes | Main discovery loop |
| _run_iteration(iteration) | No | Single generate-evaluate cycle |
| _process_iteration_result(...) | No | Store and log result |
| request_shutdown() | No | Trigger graceful shutdown |

Complete Registration Example

route.py
# Level 1: Database only (uses default DiscoveryController)
register_database("topk", TopKDatabase)
register_database("best_of_n", BestOfNDatabase)
register_database("beam_search", BeamSearchDatabase)

# Level 2: Database + Controller
register_database("adaevolve", AdaEvolveDatabase)
register_controller("adaevolve", AdaEvolveController)

register_database("gepa_native", GEPANativeDatabase)
register_controller("gepa_native", GEPANativeController)

# Level 2: Controller + dynamic database
register_controller("evox", CoEvolutionController)
register_database("evox_meta", SearchStrategyDatabase)

Next Steps

Custom Benchmarks

Add your own optimization tasks

Context Builders

Customize prompt generation