DAG MCTS

NonArchimedeanMachineLearning.DAGMCTSConfig - Type
DAGMCTSConfig

Configuration parameters for the DAG-MCTS optimizer.

Fields

  • num_simulations::Int: Number of MCTS simulations to run per step
  • exploration_constant::Float64: UCT exploration constant c (usually √2 ≈ 1.41)
  • degree::Int: Degree for child polydisc generation (passed to children function)
  • value_transform::Function: Transform from loss to value (default: sigmoid_transform())
  • persist_table::Bool: Whether to persist the transposition table across optimization steps
  • selection_mode::SelectionMode: Strategy for selecting the next step (VisitCount, BestValue, or BestLoss)
  • track_parents::Bool: Whether to retain parent pointers for debugging and verification

Design Decision (recorded for future experimentation)

The persist_table option allows experimenting with:

  • persist_table=true: Reuse learned information across steps (may grow large)
  • persist_table=false: Fresh search each step (like standard MCTS)

TODO: Add support for the following features from standard MCTS:

- strict mode for single-branch descent

- max_children limit

source
NonArchimedeanMachineLearning.DAGMCTSConfig - Method
DAGMCTSConfig(; kwargs...)

Create a DAG-MCTS configuration with sensible defaults.

Keyword Arguments

  • num_simulations::Int=100: Number of simulations per step
  • exploration_constant::Float64=1.41: UCT exploration constant
  • degree::Int=1: Child generation degree
  • value_transform::Function=sigmoid_transform(): Loss to value transformation (see sigmoid_transform, tanh_transform, negation_transform)
  • persist_table::Bool=true: Whether to persist transposition table across steps
  • selection_mode::SelectionMode=VisitCount: Child selection strategy (VisitCount, BestValue, or BestLoss)
  • track_parents::Bool=false: Whether to track parent pointers (needed for debug verification; off by default for performance)
source
NonArchimedeanMachineLearning.DAGMCTSNode - Type
DAGMCTSNode{S,T,N}

A node in the DAG-MCTS search graph.

Unlike standard MCTS nodes, DAG nodes can have multiple parents since the same polydisc state can be reached via different action sequences.

Fields

  • polydisc::ValuationPolydisc{S,T,N}: The polydisc at this node
  • parents::Vector{DAGMCTSNode{S,T,N}}: All parent nodes (can be multiple in a DAG)
  • children::Vector{DAGMCTSNode{S,T,N}}: Child nodes indexed by action (1-based)
  • visits::Int: Total visit count N(s) aggregated from all paths
  • total_value::Float64: Sum of all values Q(s) backpropagated through this node
  • is_expanded::Bool: Whether this node's children have been generated
  • is_terminal::Bool: True if expansion produces no children (precision limit reached)
  • is_solved::Bool: True if terminal, or expanded with all children solved
  • proven_value::Float64: Exact value once solved (NaN if unsolved)
  • unsolved_children_count::Int: Number of children not yet marked solved

Design Decision (recorded for future experimentation)

We track all parents in a vector. This is more expensive than storing no parent information, but useful for analysis and debugging; the alternative would be to rely purely on the explicit path maintained during traversal.

source
NonArchimedeanMachineLearning.DAGMCTSState - Type
DAGMCTSState{S,T,N}

State maintained across DAG-MCTS optimization steps.

Fields

  • root::DAGMCTSNode{S,T,N}: The current root node of the search graph
  • transposition_table::Dict{HashedPolydisc{S,T,N}, DAGMCTSNode{S,T,N}}: Global table mapping hashed polydisc states to node instances
  • step_count::Int: Number of optimization steps taken
  • best_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Running tracker of the best node by average value
  • best_value::Float64: Best average value seen so far
  • best_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to best_node
  • best_root_action::Int: Action index of best_root_child
  • min_loss_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Node with minimum raw loss evaluation
  • min_loss::Float64: Minimum raw loss seen so far
  • min_loss_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to min_loss_node
source
NonArchimedeanMachineLearning.backpropagate! - Function
backpropagate!(path::Vector{<:DAGMCTSNode}, value::Float64)

Backpropagate a value through all nodes in the explicit path.

This is the "Stack Method" for DAG backpropagation - we iterate through the specific path taken during this traversal rather than following parent pointers (which would be ambiguous in a DAG).

Arguments

  • path: Vector of nodes from root to leaf representing this traversal's path
  • value: The transformed value to backpropagate
  • eval_node: The leaf node where the loss was evaluated
  • loss_value: The raw loss value at the leaf (before value_transform)

Note

This updates the global N(s) and Q(s) statistics in each shared node instance.
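A minimal sketch of the Stack Method, assuming a simplified stand-in node that carries only the visit and value statistics (the real DAGMCTSNode has more fields):

```julia
# Simplified stand-in for DAGMCTSNode (illustration only).
mutable struct Node
    visits::Int
    total_value::Float64
end

# Stack-method backpropagation: update every node on the explicit
# root-to-leaf path rather than following (ambiguous) parent pointers.
function backprop_sketch!(path::Vector{Node}, value::Float64)
    for node in path
        node.visits += 1           # N(s) += 1
        node.total_value += value  # Q(s) += v
    end
    return nothing
end
```

Because shared nodes appear on many paths, each traversal contributes once per path occurrence to the global statistics.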

source
NonArchimedeanMachineLearning.check_solved! - Method
check_solved!(node::DAGMCTSNode)

Check if a DAG node should be marked as solved. A node is solved when expanded AND all children are solved. (DAG-MCTS always generates all children, so is_expanded implies fully expanded.)

Sets proven_value to the max of children's proven_values (single-agent optimization).

Returns true if the node was newly marked as solved.
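The solved check can be sketched as follows, using a simplified node whose field names mirror the documented ones (this is not the package type, and the handling of a terminal node's proven value is an assumption):

```julia
# Simplified stand-in for DAGMCTSNode (illustration only).
mutable struct SNode
    is_expanded::Bool
    is_terminal::Bool
    is_solved::Bool
    proven_value::Float64
    children::Vector{SNode}
end
SNode(; expanded = false, terminal = false) =
    SNode(expanded, terminal, false, NaN, SNode[])

# Solved when terminal, or expanded with every child solved.
# Returns true only if the node was newly marked solved.
function check_solved_sketch!(node::SNode)
    node.is_solved && return false
    if node.is_terminal ||
       (node.is_expanded && all(c -> c.is_solved, node.children))
        node.is_solved = true
        if !isempty(node.children)
            # Single-agent optimization: propagate the best child value.
            node.proven_value = maximum(c.proven_value for c in node.children)
        end
        return true
    end
    return false
end
```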

source
NonArchimedeanMachineLearning.dag_mcts_descent - Method
dag_mcts_descent(loss::Loss, param::ValuationPolydisc{S,T,N}, state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Perform one step of DAG-MCTS optimization.

This function follows the same interface as other optimizers (greedy_descent, mcts_descent, etc.), making it compatible with OptimSetup.

Arguments

  • loss::Loss: The loss function structure
  • param::ValuationPolydisc{S,T,N}: Current parameter values
  • state::DAGMCTSState{S,T,N}: DAG-MCTS state (includes transposition table)
  • config::DAGMCTSConfig: Configuration parameters

Returns

Tuple{ValuationPolydisc{S,T,N}, DAGMCTSState{S,T,N}, Bool}: New parameters, updated state, and convergence status

source
NonArchimedeanMachineLearning.dag_mcts_descent_init - Method
dag_mcts_descent_init(param::ValuationPolydisc{S,T,N}, loss::Loss, config::DAGMCTSConfig=DAGMCTSConfig()) where {S,T,N}

Initialize an optimization setup for DAG-MCTS.

Arguments

  • param::ValuationPolydisc{S,T,N}: Initial parameter values
  • loss::Loss: The loss function structure
  • config::DAGMCTSConfig: DAG-MCTS configuration (uses defaults if not provided)

Returns

OptimSetup: Configured optimization setup for DAG-MCTS

Example

# Set up DAG-MCTS optimizer with persistent transposition table
config = DAGMCTSConfig(num_simulations=200, persist_table=true)
optim = dag_mcts_descent_init(param, loss, config)

# Run optimization
for i in 1:100
    step!(optim)
    if i % 10 == 0
        println("Step $i: Loss = ", eval_loss(optim))
        println("  Unique nodes: ", length(optim.state.transposition_table))
    end
end
source
NonArchimedeanMachineLearning.dag_mcts_search - Method
dag_mcts_search(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Run DAG-MCTS from a root node and return the best child.

Performs config.num_simulations iterations of DAG-MCTS. Returns the best child according to the configured selection mode.

Returns

Tuple of (best_polydisc, best_node, converged).

source
NonArchimedeanMachineLearning.dag_mcts_simulation! - Method
dag_mcts_simulation!(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Perform one complete DAG-MCTS simulation.

The four phases of MCTS adapted for DAG:

  1. Selection: Follow UCT to a leaf, maintaining explicit path stack
  2. Expansion: Expand leaf using transposition table for deduplication
  3. Evaluation: Compute value at the leaf (or a child)
  4. Backpropagation: Update statistics along the explicit path

Returns

Float64: The value obtained from this simulation
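The four phases above can be tied together in a single loop body. This is a pseudocode-level sketch using the helper functions documented on this page, not the package's actual implementation (extra bookkeeping arguments, such as those to backpropagate!, are elided):

```julia
# One DAG-MCTS simulation (sketch; assumes the helpers documented here).
function simulation_sketch!(root, table, loss, config)
    path = select_path(root, config.exploration_constant)  # 1. Selection
    leaf = last(path)
    if !leaf.is_expanded && !leaf.is_terminal
        expand_node!(leaf, table, config)                  # 2. Expansion
    end
    value = evaluate_node(leaf, loss, config)              # 3. Evaluation
    backpropagate!(path, value)                            # 4. Backpropagation
    return value
end
```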

source
NonArchimedeanMachineLearning.evaluate_node - Method
evaluate_node(node::DAGMCTSNode{S,T,N}, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Evaluate a node using the loss function and transform to value.

Returns

Float64: The transformed value (higher is better)

source
NonArchimedeanMachineLearning.expand_node! - Method
expand_node!(node::DAGMCTSNode{S,T,N}, table::Dict, config::DAGMCTSConfig) where {S,T,N}

Expand a node by generating its child polydiscs and linking them in the DAG.

Uses the transposition table to detect if any child polydisc already exists, linking to the existing node if so (the "Lookup & Link" step).

Arguments

  • node: The node to expand
  • table: The transposition table
  • config: Configuration parameters

Returns

Nothing (modifies node.children in place)

source
NonArchimedeanMachineLearning.get_dag_stats - Method
get_dag_stats(state::DAGMCTSState)

Get statistics about the DAG structure.

Returns

NamedTuple with:

  • unique_nodes: Number of unique nodes in transposition table
  • total_visits: Sum of all visit counts
  • multi_parent_nodes: Number of nodes with multiple parents (true transpositions)
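The multi-parent count is what distinguishes a true DAG from a tree. A sketch over a simplified table (hypothetical stand-in types, not the package's):

```julia
# Simplified stand-in: parent ids; length > 1 marks a transposition.
struct StatNode
    visits::Int
    parents::Vector{Int}
end

# Aggregate DAG statistics from a transposition-table-like Dict.
function dag_stats_sketch(table::Dict{String,StatNode})
    return (
        unique_nodes = length(table),
        total_visits = sum((n.visits for n in values(table)); init = 0),
        multi_parent_nodes = count(n -> length(n.parents) > 1, values(table)),
    )
end
```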
source
NonArchimedeanMachineLearning.get_or_create_node! - Method
get_or_create_node!(table::Dict, polydisc::ValuationPolydisc{S,T,N}, parent::Union{DAGMCTSNode{S,T,N}, Nothing}=nothing) where {S,T,N}

Look up a polydisc in the transposition table, creating a new node if not found.

This is the core "Lookup & Link" operation for DAG-MCTS:

  1. Wrap polydisc in HashedPolydisc (computes hash once)
  2. Check if hashed polydisc is in transposition_table via get! (single hash probe)
  3. If found: retrieve existing node instance; if not: create new node
  4. Link the parent if provided

Arguments

  • table: The transposition table (keyed by HashedPolydisc)
  • polydisc: The polydisc state to look up
  • parent: Optional parent node to link

Returns

DAGMCTSNode{S,T,N}: The node for this polydisc (existing or newly created)
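The heart of Lookup & Link can be sketched with Julia's get!, which performs the single-probe lookup-or-create in one call. The key and node types here are simplified stand-ins, not HashedPolydisc or DAGMCTSNode:

```julia
# Simplified stand-in for a DAG node (illustration only).
mutable struct GNode
    key::String
    parents::Vector{GNode}
end

# Lookup & Link: one hash probe via get!, then link the parent if given.
function get_or_create_sketch!(table::Dict{String,GNode}, key::String,
                               parent::Union{GNode,Nothing} = nothing)
    node = get!(table, key) do
        GNode(key, GNode[])          # not found: create a fresh node
    end
    if parent !== nothing && !(parent in node.parents)
        push!(node.parents, parent)  # link: record the new parent edge
    end
    return node
end
```

Reaching the same key twice returns the same node instance, which is exactly how transpositions are deduplicated.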

source
NonArchimedeanMachineLearning.propagate_solved_up_dag! - Method
propagate_solved_up_dag!(node::DAGMCTSNode, path::Vector{<:DAGMCTSNode})

Propagate solved status upward in the DAG after node has been marked solved.

When parent pointers are available (track_parents=true), uses BFS to notify ALL parents across the DAG. When parent pointers are unavailable, propagates only along the explicit simulation path (pragmatic fallback — some parents may not be notified immediately, but will catch up in future simulations).

source
NonArchimedeanMachineLearning.rebuild_table_from_subtree! - Method
rebuild_table_from_subtree!(state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Rebuild the transposition table to contain only nodes reachable from state.root.

Performs a BFS traversal down from the root, collecting all reachable nodes into a fresh dictionary. The old table is replaced, allowing Julia's GC to reclaim any unreachable nodes. If config.track_parents is enabled, parent arrays of surviving nodes are filtered to remove references to pruned ancestors.

This is the core of the "DAG-first" architecture: the DAG structure is maintained by node.children pointers, and the transposition table is merely an ephemeral index rebuilt each step.
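The BFS rebuild can be sketched as follows, with a simplified node carrying only children pointers (the parent-filtering branch for track_parents is omitted):

```julia
# Simplified stand-in for a DAG node (illustration only).
mutable struct RNode
    key::String
    children::Vector{RNode}
end

# Collect every node reachable from root into a fresh table; anything
# the old table held that is not reached is simply dropped (and GC'd).
function rebuild_table_sketch(root::RNode)
    table = Dict{String,RNode}()
    queue = [root]
    while !isempty(queue)
        node = popfirst!(queue)
        haskey(table, node.key) && continue  # already visited (it's a DAG)
        table[node.key] = node
        append!(queue, node.children)
    end
    return table
end
```

The haskey guard is essential in a DAG: a node with multiple parents is reached more than once but must be recorded only once.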

source
NonArchimedeanMachineLearning.select_best_child_dag - Method
select_best_child_dag(root::DAGMCTSNode, table::Dict, config::DAGMCTSConfig)

Select the best child of root according to the configured selection mode.

Selection Modes

  • VisitCount: Returns child with highest visit count (standard MCTS, robust)
  • BestValue: Finds node with best average value in DAG, returns root's child leading to it (greedy)
  • BestLoss: Finds leaf with minimum raw loss evaluation, returns root's child leading to it

Returns

The selected child node.

source
NonArchimedeanMachineLearning.select_path - Method
select_path(root::DAGMCTSNode, exploration_constant::Float64)

Select a path from root to a leaf node using UCT.

Unlike standard MCTS, we maintain an explicit path stack since nodes don't have single parent pointers.

Arguments

  • root: The root node to start from
  • exploration_constant: UCT exploration constant

Returns

Vector{DAGMCTSNode}: The path from root to leaf (inclusive)
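The selection loop with its explicit path stack can be sketched as follows, using a simplified node and an inline UCT score (the package's own uct_score is documented below; this stand-in is for illustration):

```julia
# Simplified stand-in for a DAG node (illustration only).
mutable struct PNode
    visits::Int
    total_value::Float64
    children::Vector{PNode}
end
PNode() = PNode(0, 0.0, PNode[])

# UCT score; unvisited children get Inf so each is tried at least once.
uct(n::PNode, parent_visits::Int, c::Float64) =
    n.visits == 0 ? Inf :
    n.total_value / n.visits + c * sqrt(log(parent_visits) / n.visits)

# Descend by UCT, pushing each node onto an explicit path stack so
# backpropagation later knows exactly which route was taken.
function select_path_sketch(root::PNode, c::Float64)
    path = [root]
    node = root
    while !isempty(node.children)
        node = argmax(child -> uct(child, node.visits, c), node.children)
        push!(path, node)
    end
    return path  # root-to-leaf, inclusive
end
```

The returned vector is precisely the path argument that backpropagate! consumes.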

source
NonArchimedeanMachineLearning.trace_to_root_child - Method
trace_to_root_child(target::DAGMCTSNode, root::DAGMCTSNode, table::Dict)

Trace back from a node to find which direct child of root lies on a path to it.

Uses parent pointers to trace upward from the target to the root, which is O(depth) instead of O(nodes). In a DAG a node can have multiple parents, so the trace is a breadth-first search upward through all parents until it reaches a direct child of the root.

Returns

The direct child of root that can reach the target, or nothing if not found.

source
NonArchimedeanMachineLearning.uct_score - Method
uct_score(node::DAGMCTSNode, parent_visits::Int, exploration_constant::Float64)

Compute the UCT score for a node.

UCT(s) = Q(s)/N(s) + c * √(ln(N(parent)) / N(s))

where Q(s) and N(s) are the global statistics from the shared node instance, and N(parent) is the visit count of the parent in the current traversal.

Arguments

  • node: The child node being scored
  • parent_visits: Visit count of the parent node (N(parent))
  • exploration_constant: The exploration constant c

Returns

Float64: The UCT score (higher = should explore)
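A direct translation of the formula on raw statistics, with the conventional Inf for unvisited children (whether the package handles the N(s) = 0 case this way is an assumption):

```julia
# UCT(s) = Q(s)/N(s) + c * sqrt(log(N(parent)) / N(s))
function uct_score_sketch(total_value::Float64, visits::Int,
                          parent_visits::Int, c::Float64)
    visits == 0 && return Inf  # force at least one visit per child
    exploitation = total_value / visits                  # Q(s)/N(s)
    exploration  = c * sqrt(log(parent_visits) / visits)
    return exploitation + exploration
end
```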

source
NonArchimedeanMachineLearning.verify_transposition_table - Method
verify_transposition_table(state::DAGMCTSState)

Verify the integrity of the transposition table.

Checks that:

  1. All nodes in the table have polydiscs equal to their keys (via isequal)
  2. All children of expanded nodes are in the table
  3. Parent relationships are consistent

Returns

Bool: true if table is consistent, false otherwise

source