DAG-MCTS

DAGMCTSConfig supports two internal variants:

  • PathStatsDAGMCTS (default): visit counts and values are tracked for root-to-node paths, selection uses DUCB, and a transposition cache shares objective evaluations by polydisc.
  • NodeStatsDAGMCTS: visit counts and values are stored directly on shared DAG nodes.

Select the variant when constructing the optimizer:

path_stats_config = DAGMCTSConfig(variant = PathStatsDAGMCTS)
node_stats_config = DAGMCTSConfig(variant = NodeStatsDAGMCTS)
NonArchimedeanMachineLearning.DAGMCTSConfigType
DAGMCTSConfig

Configuration parameters for the DAG-MCTS optimizer.

Fields

  • num_simulations::Int: Number of MCTS simulations to run per step
  • exploration_constant::Float64: UCT exploration constant c (usually √2 ≈ 1.41)
  • degree::Int: Degree for child polydisc generation (passed to children function)
  • value_transform::Function: Transform from loss to value (default: sigmoid_transform())
  • persist_table::Bool: Whether to persist the transposition table across optimization steps
  • selection_mode::SelectionMode: Strategy for selecting the next step (VisitCount, BestValue, or BestLoss)
  • track_parents::Bool: Whether to retain parent pointers for debugging and verification
  • variant::DAGMCTSVariant: Internal DAG-MCTS algorithm. The default is PathStatsDAGMCTS.

Design Decision (recorded for future experimentation)

The persist_table option allows experimenting with:

  • persist_table=true: Reuse learned information across steps (may grow large)
  • persist_table=false: Fresh search each step
source
NonArchimedeanMachineLearning.DAGMCTSConfigMethod
DAGMCTSConfig(; kwargs...)

Create a DAG-MCTS configuration with default settings.

Keyword Arguments

  • num_simulations::Int=100: Number of simulations per step
  • exploration_constant::Float64=1.41: UCT exploration constant
  • degree::Int=1: Child generation degree
  • value_transform::Function=DEFAULT_VALUE_TRANSFORM: Loss to value transformation (see sigmoid_transform, tanh_transform, negation_transform)
  • persist_table: Whether to persist the transposition table across steps. Defaults to false for the path-stat variant and true for the node-stat variant.
  • selection_mode: Child selection strategy (VisitCount, BestValue, or BestLoss). Defaults to BestValue for the path-stat variant and VisitCount for the node-stat variant.
  • track_parents::Bool=false: Whether to track parent pointers (needed for debug verification; off by default for performance)
  • variant::DAGMCTSVariant=PathStatsDAGMCTS: Select the dag-mcts implementation.
source
NonArchimedeanMachineLearning.DAGMCTSNodeType
DAGMCTSNode{S,T,N}

A node in the node-statistics DAG-MCTS search graph.

Unlike standard MCTS nodes, DAG nodes can have multiple parents since the same polydisc state can be reached via different action sequences.

Fields

  • polydisc::ValuationPolydisc{S,T,N}: The polydisc at this node
  • parents::Vector{DAGMCTSNode{S,T,N}}: All parent nodes (can be multiple in a DAG)
  • children::Vector{DAGMCTSNode{S,T,N}}: Child nodes indexed by action (1-based)
  • visits::Int: Total visit count N(s) aggregated from all paths
  • total_value::Float64: Sum of all values Q(s) backpropagated through this node
  • is_expanded::Bool: Whether this node's children have been generated
  • is_terminal::Bool: True if expansion produces no children (precision limit reached)
  • is_solved::Bool: True if terminal, or expanded with all children solved
  • proven_value::Float64: Exact value once solved (NaN if unsolved)
  • unsolved_children_count::Int: Number of children not yet marked solved

Design Decision (recorded for future experimentation)

We track all parents in a vector rather than having no parent information. This is more expensive but useful for analysis and debugging. The alternative would be to track no parent info and rely purely on the explicit path during traversal.

source
NonArchimedeanMachineLearning.DAGMCTSPathNodeType
DAGMCTSPathNode{S,T,N}

A node in the path-style DAG-MCTS search graph.

This node stores only graph structure plus lightweight diagnostic aggregate statistics. DUCB selection and backpropagation use DAGMCTSPathStats keyed by root-to-node paths, not these node aggregates.

source
NonArchimedeanMachineLearning.DAGMCTSPathStateType
DAGMCTSPathState{S,T,N}

State for the path-style DAG-MCTS variant.

The graph is stored in transposition_table, objective evaluations are cached in evaluation_cache, and MCTS statistics are stored in path_stats keyed by root-to-node paths.

source
NonArchimedeanMachineLearning.DAGMCTSStateType
DAGMCTSState{S,T,N}

State maintained across DAG-MCTS optimization steps.

Fields

  • root::DAGMCTSNode{S,T,N}: The current root node of the search graph
  • transposition_table::Dict{HashedPolydisc{S,T,N}, DAGMCTSNode{S,T,N}}: Global table mapping hashed polydisc states to node instances
  • step_count::Int: Number of optimization steps taken
  • best_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Running tracker of the best node by average value
  • best_value::Float64: Best average value seen so far
  • best_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to best_node
  • best_root_action::Int: Action index of bestrootchild
  • min_loss_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Node with minimum raw loss evaluation
  • min_loss::Float64: Minimum raw loss seen so far
  • min_loss_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to minlossnode
source
NonArchimedeanMachineLearning.DAGMCTSVariantType
DAGMCTSVariant

Selects the internal DAG-MCTS algorithm.

Values

  • PathStatsDAGMCTS: Search statistics are stored on root-to-node paths and selection uses DUCB. Node evaluations are shared through a transposition cache.
  • NodeStatsDAGMCTS: Search statistics are stored directly on shared DAG nodes.
source
NonArchimedeanMachineLearning._dag_mcts_node_descent_initMethod
dag_mcts_descent_init(param::ValuationPolydisc{S,T,N}, loss::Loss, config::DAGMCTSConfig=DAGMCTSConfig()) where {S,T,N}

Initialize an optimization setup for DAG-MCTS.

Arguments

  • param::ValuationPolydisc{S,T,N}: Initial parameter values
  • loss::Loss: The loss function structure
  • config::DAGMCTSConfig: DAG-MCTS configuration (uses defaults if not provided)

Returns

OptimSetup: Configured optimization setup for DAG-MCTS

Example

# Set up DAG-MCTS optimizer with persistent transposition table
config = DAGMCTSConfig(num_simulations=200, persist_table=true)
optim = dag_mcts_descent_init(param, loss, config)

# Run optimization
for i in 1:100
    step!(optim)
    if i % 10 == 0
        println("Step \$i: Loss = ", eval_loss(optim))
        println("  Unique nodes: ", length(optim.state.transposition_table))
    end
end
source
NonArchimedeanMachineLearning.backpropagate!Function
backpropagate!(path::Vector{<:DAGMCTSNode}, value::Float64)

Backpropagate a value through all nodes in the explicit path.

This is the Path-wise approach for DAG backpropagation - we iterate through the specific path taken during this traversal rather than following parent pointers (this is ambiguous in a DAG!).

Arguments

  • path: Vector of nodes from root to leaf representing this traversal's path
  • value: The transformed value to backpropagate
  • eval_node: The leaf node where the loss was evaluated
  • loss_value: The raw loss value at the leaf (before value_transform)

Note

This updates the global N(s) and Q(s) statistics in each shared node instance.

source
NonArchimedeanMachineLearning.check_solved!Method
check_solved!(node::DAGMCTSNode)

Check if a DAG node should be marked as solved. A node is solved when expanded AND all children are solved. (DAG-MCTS always generates all children, so is_expanded implies fully expanded.)

Sets proven_value to the max of children's proven_values.

Returns true if the node was newly marked as solved.

source
NonArchimedeanMachineLearning.dag_mcts_descentMethod
dag_mcts_descent(loss::Loss, param::ValuationPolydisc{S,T,N}, state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Perform one step of DAG-MCTS optimization.

This function follows the same interface as other optimizers (greedydescent, mctsdescent, etc.), making it compatible with OptimSetup.

Arguments

  • loss::Loss: The loss function structure
  • param::ValuationPolydisc{S,T,N}: Current parameter values
  • state::DAGMCTSState{S,T,N}: DAG-MCTS state (includes transposition table)
  • config::DAGMCTSConfig: Configuration parameters

Returns

Tuple{ValuationPolydisc{S,T,N}, DAGMCTSState{S,T,N}, Bool}: New parameters, updated state, and convergence status

source
NonArchimedeanMachineLearning.dag_mcts_path_simulation!Method
dag_mcts_path_simulation!(root, table, loss, config, state)

Run one path-style DAG-MCTS simulation.

Selection uses DUCB on path extensions, evaluation is cached by polydisc in the transposition cache, and backpropagation updates every prefix of the sampled path.

source
NonArchimedeanMachineLearning.dag_mcts_searchMethod
dag_mcts_search(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Run DAG-MCTS from a root node and return the best child.

Performs config.num_simulations iterations of DAG-MCTS. Returns the best child according to the configured selection mode.

Returns

Tuple of (best_polydisc, best_node, converged).

source
NonArchimedeanMachineLearning.dag_mcts_simulation!Method
dag_mcts_simulation!(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Perform one complete DAG-MCTS simulation.

The four phases of MCTS adapted for DAG:

  1. Selection: Follow UCT to a leaf, maintaining explicit path stack
  2. Expansion: Expand leaf using transposition table for deduplication
  3. Evaluation: Compute value at the leaf (or a child)
  4. Backpropagation: Update statistics along the explicit path

Returns

Float64: The value obtained from this simulation

source
NonArchimedeanMachineLearning.ducb_scoreMethod
ducb_score(path_stats, path_key, child, exploration_constant)

Compute the path-statistics DAG upper confidence bound

\[DUCB(P, y) = Q(P \cdot y) + c \sqrt{\log N(P) / N(P \cdot y)}.\]

Unvisited path extensions are assigned Inf.

source
NonArchimedeanMachineLearning.evaluate_nodeMethod
evaluate_node(node::DAGMCTSNode{S,T,N}, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Evaluate a node using the loss function and transform to value.

Returns

Float64: The transformed value (higher is better)

source
NonArchimedeanMachineLearning.expand_node!Method
expand_node!(node::DAGMCTSNode{S,T,N}, table::Dict, config::DAGMCTSConfig) where {S,T,N}

Expand a node by generating its child polydiscs and linking them in the DAG.

Uses the transposition table to detect if any child polydisc already exists, linking to the existing node if so (the "Lookup & Link" step).

Arguments

  • node: The node to expand
  • table: The transposition table
  • config: Configuration parameters

Returns

Nothing (modifies node.children in place)

source
NonArchimedeanMachineLearning.get_dag_statsMethod
get_dag_stats(state::DAGMCTSState)

Get statistics about the DAG structure.

Returns

NamedTuple with:

  • unique_nodes: Number of unique nodes in transposition table
  • total_visits: Sum of all visit counts
  • multi_parent_nodes: Number of nodes with multiple parents (true transpositions)
source
NonArchimedeanMachineLearning.get_or_create_node!Method
get_or_create_node!(table::Dict, polydisc::ValuationPolydisc{S,T,N}, parent::Union{DAGMCTSNode{S,T,N}, Nothing}=nothing) where {S,T,N}

Look up a polydisc in the transposition table, creating a new node if not found.

This is the core "Lookup & Link" operation for DAG-MCTS:

  1. Wrap polydisc in HashedPolydisc (computes hash once)
  2. Check if hashed polydisc is in transposition_table via get! (single hash probe)
  3. If found: retrieve existing node instance; if not: create new node
  4. Link the parent if provided

Arguments

  • table: The transposition table (keyed by HashedPolydisc)
  • polydisc: The polydisc state to look up
  • parent: Optional parent node to link

Returns

DAGMCTSNode{S,T,N}: The node for this polydisc (existing or newly created)

source
NonArchimedeanMachineLearning.propagate_solved_up_dag!Method
propagate_solved_up_dag!(node::DAGMCTSNode, path::Vector{<:DAGMCTSNode})

Propagate solved status upward in the DAG after node has been marked solved.

When parent pointers are available (track_parents=true), uses BFS to notify ALL parents across the DAG. When parent pointers are unavailable, propagates only along the explicit simulation path (pragmatic fallback — some parents may not be notified immediately, but will catch up in future simulations).

source
NonArchimedeanMachineLearning.rebuild_table_from_subtree!Method
rebuild_table_from_subtree!(state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Rebuild the transposition table to contain only nodes reachable from state.root.

Performs a BFS traversal down from the root, collecting all reachable nodes into a fresh dictionary. The old table is replaced, allowing Julia's GC to reclaim any unreachable nodes. If config.track_parents is enabled, parent arrays of surviving nodes are filtered to remove references to pruned ancestors.

This is the core of the "DAG-first" architecture: the DAG structure is maintained by node.children pointers, and the transposition table is merely an ephemeral index rebuilt each step.

source
NonArchimedeanMachineLearning.select_best_child_dagMethod
select_best_child_dag(root::DAGMCTSNode, table::Dict, config::DAGMCTSConfig)

Select the best child of root according to the configured selection mode.

Selection Modes

  • VisitCount: Returns child with highest visit count (standard MCTS)
  • BestValue: Finds node with best average value in DAG, returns root's child leading to it (greedy)
  • BestLoss: Finds leaf with minimum raw loss evaluation, returns root's child leading to it

Returns

The selected child node.

source
NonArchimedeanMachineLearning.select_pathMethod
select_path(root::DAGMCTSNode, exploration_constant::Float64)

Select a path from root to a leaf node using UCT.

Unlike standard MCTS, we maintain an explicit path stack since nodes don't have single parent pointers.

Arguments

  • root: The root node to start from
  • exploration_constant: UCT exploration constant

Returns

Vector{DAGMCTSNode}: The path from root to leaf (inclusive)

source
NonArchimedeanMachineLearning.trace_to_root_childMethod
trace_to_root_child(target::DAGMCTSNode, root::DAGMCTSNode, table::Dict)

Trace back from a node to find which direct child of root lies on a path to it.

Uses parent pointers to trace upward from the target to root, which is O(depth) instead of O(nodes). In a DAG, nodes can have multiple parents, so at each step we pick any parent that is an ancestor of root (BFS upward).

Returns

The direct child of root that can reach node, or nothing if not found.

source
NonArchimedeanMachineLearning.uct_scoreMethod
uct_score(node::DAGMCTSNode, parent_visits::Int, exploration_constant::Float64)

Compute the UCT score for a node.

UCT(s) = Q(s)/N(s) + c * √(ln(N(parent)) / N(s))

where Q(s) and N(s) are the global statistics from the shared node instance, and N(parent) is the visit count of the parent in the current traversal.

Arguments

  • node: The child node being scored
  • parent_visits: Visit count of the parent node (N(parent))
  • exploration_constant: The exploration constant c

Returns

Float64: The UCT score (higher = should explore)

source
NonArchimedeanMachineLearning.verify_transposition_tableMethod
verify_transposition_table(state::DAGMCTSState)

Verify the integrity of the transposition table.

Checks that:

  1. All nodes in the table have polydiscs equal to their keys (via isequal)
  2. All children of expanded nodes are in the table
  3. Parent relationships are consistent

Returns

Bool: true if table is consistent, false otherwise

source