DAG-MCTS
NonArchimedeanMachineLearning.DAGMCTSConfig — Type
DAGMCTSConfig

Configuration parameters for the DAG-MCTS optimizer.
Fields
- num_simulations::Int: Number of MCTS simulations to run per step
- exploration_constant::Float64: UCT exploration constant c (usually √2 ≈ 1.41)
- degree::Int: Degree for child polydisc generation (passed to the children function)
- value_transform::Function: Transform from loss to value (default: sigmoid_transform())
- persist_table::Bool: Whether to persist the transposition table across optimization steps
- selection_mode::SelectionMode: Strategy for selecting the next step (VisitCount, BestValue, or BestLoss)
- track_parents::Bool: Whether to retain parent pointers for debugging and verification
Design Decision (recorded for future experimentation)
The persist_table option allows experimenting with:
- persist_table=true: Reuse learned information across steps (may grow large)
- persist_table=false: Fresh search each step (like standard MCTS)
TODO: Add support for the following features from standard MCTS:
- strict mode for single-branch descent
- max_children limit
NonArchimedeanMachineLearning.DAGMCTSConfig — Method
DAGMCTSConfig(; kwargs...)

Create a DAG-MCTS configuration with sensible defaults.
Keyword Arguments
- num_simulations::Int=100: Number of simulations per step
- exploration_constant::Float64=1.41: UCT exploration constant
- degree::Int=1: Child generation degree
- value_transform::Function=sigmoid_transform(): Loss-to-value transformation (see sigmoid_transform, tanh_transform, negation_transform)
- persist_table::Bool=true: Whether to persist the transposition table across steps
- selection_mode::SelectionMode=VisitCount: Child selection strategy (VisitCount, BestValue, or BestLoss)
- track_parents::Bool=false: Whether to track parent pointers (needed for debug verification; off by default for performance)
NonArchimedeanMachineLearning.DAGMCTSNode — Type
DAGMCTSNode{S,T,N}

A node in the DAG-MCTS search graph.
Unlike standard MCTS nodes, DAG nodes can have multiple parents since the same polydisc state can be reached via different action sequences.
Fields
- polydisc::ValuationPolydisc{S,T,N}: The polydisc at this node
- parents::Vector{DAGMCTSNode{S,T,N}}: All parent nodes (can be multiple in a DAG)
- children::Vector{DAGMCTSNode{S,T,N}}: Child nodes indexed by action (1-based)
- visits::Int: Total visit count N(s) aggregated from all paths
- total_value::Float64: Sum of all values Q(s) backpropagated through this node
- is_expanded::Bool: Whether this node's children have been generated
- is_terminal::Bool: True if expansion produces no children (precision limit reached)
- is_solved::Bool: True if terminal, or expanded with all children solved
- proven_value::Float64: Exact value once solved (NaN if unsolved)
- unsolved_children_count::Int: Number of children not yet marked solved
Design Decision (recorded for future experimentation)
We track all parents in a vector rather than storing no parent information at all. This is more expensive but useful for analysis and debugging. The alternative would be to store no parent information and rely purely on the explicit path maintained during traversal.
NonArchimedeanMachineLearning.DAGMCTSNode — Method
DAGMCTSNode(polydisc::ValuationPolydisc{S,T,N}) where {S,T,N}

Create a new DAG-MCTS node with the given polydisc and no parents.
NonArchimedeanMachineLearning.DAGMCTSState — Type
DAGMCTSState{S,T,N}

State maintained across DAG-MCTS optimization steps.
Fields
- root::DAGMCTSNode{S,T,N}: The current root node of the search graph
- transposition_table::Dict{HashedPolydisc{S,T,N}, DAGMCTSNode{S,T,N}}: Global table mapping hashed polydisc states to node instances
- step_count::Int: Number of optimization steps taken
- best_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Running tracker of the best node by average value
- best_value::Float64: Best average value seen so far
- best_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to best_node
- best_root_action::Int: Action index of best_root_child
- min_loss_node::Union{DAGMCTSNode{S,T,N}, Nothing}: Node with minimum raw loss evaluation
- min_loss::Float64: Minimum raw loss seen so far
- min_loss_root_child::Union{DAGMCTSNode{S,T,N}, Nothing}: Which direct child of root leads to min_loss_node
NonArchimedeanMachineLearning.average_value — Method
average_value(node::DAGMCTSNode)

Compute the average value Q(s)/N(s) of a node. Returns 0.0 if the node has not been visited.
NonArchimedeanMachineLearning.backpropagate! — Function
backpropagate!(path::Vector{<:DAGMCTSNode}, value::Float64)

Backpropagate a value through all nodes in the explicit path.
This is the "Stack Method" for DAG backpropagation - we iterate through the specific path taken during this traversal rather than following parent pointers (which would be ambiguous in a DAG).
Arguments
- path: Vector of nodes from root to leaf representing this traversal's path
- value: The transformed value to backpropagate
- eval_node: The leaf node where the loss was evaluated
- loss_value: The raw loss value at the leaf (before value_transform)
Note
This updates the global N(s) and Q(s) statistics in each shared node instance.
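The "Stack Method" can be sketched in a few lines. The struct below is a hypothetical simplification of DAGMCTSNode, keeping only the two statistics that backpropagation touches:

```julia
# Hypothetical simplified node: only the statistics backpropagation updates.
mutable struct PathNode
    visits::Int          # N(s)
    total_value::Float64 # Q(s)
end

# Walk the explicit path recorded during this traversal; parent pointers
# are never followed, since they would be ambiguous in a DAG.
function backprop_sketch!(path::Vector{PathNode}, value::Float64)
    for node in path
        node.visits += 1
        node.total_value += value
    end
end

root, mid, leaf = PathNode(0, 0.0), PathNode(0, 0.0), PathNode(0, 0.0)
backprop_sketch!([root, mid, leaf], 0.8)
```

Because shared node instances are updated in place, a node reached through two different action sequences accumulates visits and value from both paths.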
NonArchimedeanMachineLearning.check_solved! — Method
check_solved!(node::DAGMCTSNode)

Check whether a DAG node should be marked as solved. A node is solved when it is expanded AND all of its children are solved. (DAG-MCTS always generates all children, so is_expanded implies fully expanded.)
Sets proven_value to the max of children's proven_values (single-agent optimization).
Returns true if the node was newly marked as solved.
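The solved rule above can be illustrated with a minimal sketch; SolvedInfo is a hypothetical simplified record, not the real node type:

```julia
# A solved node carries an exact value; NaN marks "not yet solved".
struct SolvedInfo
    is_solved::Bool
    proven_value::Float64
end

# A node is solved once ALL children are solved; its proven value is the
# max over children (single-agent maximization).
function solve_rule(children::Vector{SolvedInfo})
    all(c -> c.is_solved, children) || return SolvedInfo(false, NaN)
    SolvedInfo(true, maximum(c.proven_value for c in children))
end
```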
NonArchimedeanMachineLearning.dag_mcts_descent — Method
dag_mcts_descent(loss::Loss, param::ValuationPolydisc{S,T,N}, state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Perform one step of DAG-MCTS optimization.
This function follows the same interface as the other optimizers (greedy_descent, mcts_descent, etc.), making it compatible with OptimSetup.
Arguments
- loss::Loss: The loss function structure
- param::ValuationPolydisc{S,T,N}: Current parameter values
- state::DAGMCTSState{S,T,N}: DAG-MCTS state (includes the transposition table)
- config::DAGMCTSConfig: Configuration parameters
Returns
Tuple{ValuationPolydisc{S,T,N}, DAGMCTSState{S,T,N}, Bool}: New parameters, updated state, and convergence status
NonArchimedeanMachineLearning.dag_mcts_descent_init — Method
dag_mcts_descent_init(param::ValuationPolydisc{S,T,N}, loss::Loss, config::DAGMCTSConfig=DAGMCTSConfig()) where {S,T,N}

Initialize an optimization setup for DAG-MCTS.
Arguments
- param::ValuationPolydisc{S,T,N}: Initial parameter values
- loss::Loss: The loss function structure
- config::DAGMCTSConfig: DAG-MCTS configuration (uses defaults if not provided)
Returns
OptimSetup: Configured optimization setup for DAG-MCTS
Example
# Set up DAG-MCTS optimizer with persistent transposition table
config = DAGMCTSConfig(num_simulations=200, persist_table=true)
optim = dag_mcts_descent_init(param, loss, config)

# Run optimization
for i in 1:100
    step!(optim)
    if i % 10 == 0
        println("Step $i: Loss = ", eval_loss(optim))
        println("  Unique nodes: ", length(optim.state.transposition_table))
    end
end

NonArchimedeanMachineLearning.dag_mcts_search — Method
dag_mcts_search(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Run DAG-MCTS from a root node and return the best child.
Performs config.num_simulations iterations of DAG-MCTS. Returns the best child according to the configured selection mode.
Returns
Tuple of (best_polydisc, best_node, converged).
NonArchimedeanMachineLearning.dag_mcts_simulation! — Method
dag_mcts_simulation!(root::DAGMCTSNode{S,T,N}, table::Dict, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Perform one complete DAG-MCTS simulation.
The four phases of MCTS adapted for DAG:
- Selection: Follow UCT to a leaf, maintaining explicit path stack
- Expansion: Expand leaf using transposition table for deduplication
- Evaluation: Compute value at the leaf (or a child)
- Backpropagation: Update statistics along the explicit path
Returns
Float64: The value obtained from this simulation
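The four phases compose as sketched below on a toy state space (integer states, children 2k and 2k+1, depth-limited). Everything here is a hypothetical stand-in for the real polydisc machinery, and the toy omits the transposition table:

```julia
mutable struct ToyNode
    state::Int
    visits::Int
    total_value::Float64
    children::Vector{ToyNode}
    expanded::Bool
end
ToyNode(s::Int) = ToyNode(s, 0, 0.0, ToyNode[], false)

toy_evaluate(state::Int) = 1.0 / state  # stand-in for loss + value_transform

function simulate!(root::ToyNode, c::Float64)
    # 1. Selection: UCT descent, recording an explicit path stack.
    path = ToyNode[root]
    node = root
    while node.expanded && !isempty(node.children)
        node = argmax(ch -> ch.visits == 0 ? Inf :
                            ch.total_value / ch.visits +
                            c * sqrt(log(node.visits) / ch.visits),
                      node.children)
        push!(path, node)
    end
    # 2. Expansion: generate all children (depth-limited for the toy).
    if !node.expanded
        node.expanded = true
        node.state < 8 && push!(node.children,
                                ToyNode(2node.state), ToyNode(2node.state + 1))
    end
    # 3. Evaluation at the leaf.
    value = toy_evaluate(node.state)
    # 4. Backpropagation along the explicit path only.
    for n in path
        n.visits += 1
        n.total_value += value
    end
    value
end

root = ToyNode(1)
for _ in 1:20
    simulate!(root, 1.41)
end
```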
NonArchimedeanMachineLearning.evaluate_node — Method
evaluate_node(node::DAGMCTSNode{S,T,N}, loss::Loss, config::DAGMCTSConfig) where {S,T,N}

Evaluate a node using the loss function and transform to value.
Returns
Float64: The transformed value (higher is better)
NonArchimedeanMachineLearning.expand_node! — Method
expand_node!(node::DAGMCTSNode{S,T,N}, table::Dict, config::DAGMCTSConfig) where {S,T,N}

Expand a node by generating its child polydiscs and linking them in the DAG.
Uses the transposition table to detect if any child polydisc already exists, linking to the existing node if so (the "Lookup & Link" step).
Arguments
- node: The node to expand
- table: The transposition table
- config: Configuration parameters
Returns
Nothing (modifies node.children in place)
NonArchimedeanMachineLearning.find_best_node_in_dag — Method
find_best_node_in_dag(root::DAGMCTSNode, table::Dict)

Recursively find the node with the best average value in the entire DAG.
Returns the node with highest average_value, considering only visited nodes.
NonArchimedeanMachineLearning.get_dag_stats — Method
get_dag_stats(state::DAGMCTSState)

Get statistics about the DAG structure.
Returns
NamedTuple with:
- unique_nodes: Number of unique nodes in the transposition table
- total_visits: Sum of all visit counts
- multi_parent_nodes: Number of nodes with multiple parents (true transpositions)
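Computing these statistics is a single pass over the table values; a sketch with a hypothetical minimal node type (integer keys stand in for HashedPolydisc):

```julia
mutable struct StatNode
    visits::Int
    parents::Vector{StatNode}
end

function dag_stats_sketch(table::Dict{Int,StatNode})
    nodes = collect(values(table))
    (unique_nodes = length(nodes),
     total_visits = sum(n -> n.visits, nodes; init = 0),
     # A node with more than one parent was reached via more than one
     # action sequence: a true transposition.
     multi_parent_nodes = count(n -> length(n.parents) > 1, nodes))
end
```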
NonArchimedeanMachineLearning.get_or_create_node! — Method
get_or_create_node!(table::Dict, polydisc::ValuationPolydisc{S,T,N}, parent::Union{DAGMCTSNode{S,T,N}, Nothing}=nothing) where {S,T,N}

Look up a polydisc in the transposition table, creating a new node if not found.
This is the core "Lookup & Link" operation for DAG-MCTS:
- Wrap the polydisc in HashedPolydisc (computes the hash once)
- Check whether the hashed polydisc is in the transposition table via get! (a single hash probe)
- If found: retrieve the existing node instance; if not found: create a new node
- Link the parent if provided
Arguments
- table: The transposition table (keyed by HashedPolydisc)
- polydisc: The polydisc state to look up
- parent: Optional parent node to link
Returns
DAGMCTSNode{S,T,N}: The node for this polydisc (existing or newly created)
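The "Lookup & Link" steps can be sketched with Julia's get!, whose do-block form performs exactly one hash probe. Integer keys stand in for HashedPolydisc, and LinkNode is a hypothetical simplified node type:

```julia
mutable struct LinkNode
    key::Int
    parents::Vector{LinkNode}
end

function get_or_create_sketch!(table::Dict{Int,LinkNode}, key::Int,
                               parent::Union{LinkNode,Nothing}=nothing)
    # get! returns the existing node, or inserts (and returns) a fresh one
    # built by the do-block: a single probe covers lookup and creation.
    node = get!(table, key) do
        LinkNode(key, LinkNode[])
    end
    # Link the parent; a second distinct parent marks a true transposition.
    if parent !== nothing && !any(p -> p === parent, node.parents)
        push!(node.parents, parent)
    end
    node
end
```

Looking up the same key twice returns the identical node instance, which is what lets statistics from different action sequences accumulate in one place.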
NonArchimedeanMachineLearning.print_dag_stats — Function
print_dag_stats(state::DAGMCTSState, max_depth::Int=3)

Print statistics about the DAG-MCTS structure for debugging.
NonArchimedeanMachineLearning.propagate_solved_up_dag! — Method
propagate_solved_up_dag!(node::DAGMCTSNode, path::Vector{<:DAGMCTSNode})

Propagate solved status upward in the DAG after a node has been marked solved.
When parent pointers are available (track_parents=true), uses BFS to notify ALL parents across the DAG. When parent pointers are unavailable, propagates only along the explicit simulation path (pragmatic fallback — some parents may not be notified immediately, but will catch up in future simulations).
NonArchimedeanMachineLearning.rebuild_table_from_subtree! — Method
rebuild_table_from_subtree!(state::DAGMCTSState{S,T,N}, config::DAGMCTSConfig) where {S,T,N}

Rebuild the transposition table to contain only nodes reachable from state.root.
Performs a BFS traversal down from the root, collecting all reachable nodes into a fresh dictionary. The old table is replaced, allowing Julia's GC to reclaim any unreachable nodes. If config.track_parents is enabled, parent arrays of surviving nodes are filtered to remove references to pruned ancestors.
This is the core of the "DAG-first" architecture: the DAG structure is maintained by node.children pointers, and the transposition table is merely an ephemeral index rebuilt each step.
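The BFS rebuild can be sketched as follows. GNode is a hypothetical simplified node type and integer keys stand in for HashedPolydisc; the visited check matters because a DAG node may be reachable along several paths:

```julia
mutable struct GNode
    key::Int
    children::Vector{GNode}
end

function rebuild_sketch(root::GNode)
    table = Dict{Int,GNode}()
    queue = GNode[root]
    while !isempty(queue)
        node = popfirst!(queue)
        haskey(table, node.key) && continue  # shared node already visited (DAG)
        table[node.key] = node
        append!(queue, node.children)
    end
    table
end
```

Nodes not reachable from the root are simply never inserted, so replacing the old table lets the GC reclaim them.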
NonArchimedeanMachineLearning.select_best_child_dag — Method
select_best_child_dag(root::DAGMCTSNode, table::Dict, config::DAGMCTSConfig)

Select the best child of root according to the configured selection mode.
Selection Modes
- VisitCount: Returns the child with the highest visit count (standard MCTS, robust)
- BestValue: Finds the node with the best average value in the DAG and returns the root's child leading to it (greedy)
- BestLoss: Finds the leaf with the minimum raw loss evaluation and returns the root's child leading to it
Returns
The selected child node.
NonArchimedeanMachineLearning.select_child — Method
select_child(node::DAGMCTSNode, exploration_constant::Float64)

Select the child with the highest UCT score.
Returns
Tuple of (action_index, child_node) for the best child
NonArchimedeanMachineLearning.select_path — Method
select_path(root::DAGMCTSNode, exploration_constant::Float64)

Select a path from root to a leaf node using UCT.
Unlike standard MCTS, we maintain an explicit path stack since nodes don't have single parent pointers.
Arguments
- root: The root node to start from
- exploration_constant: UCT exploration constant
Returns
Vector{DAGMCTSNode}: The path from root to leaf (inclusive)
NonArchimedeanMachineLearning.trace_to_root_child — Method
trace_to_root_child(target::DAGMCTSNode, root::DAGMCTSNode, table::Dict)

Trace back from a node to find which direct child of root lies on a path to it.
Uses parent pointers to trace upward from the target to root, which is O(depth) instead of O(nodes). In a DAG, nodes can have multiple parents, so the trace is a BFS upward over parent pointers until a direct child of root is reached.
Returns
The direct child of root that can reach the target, or nothing if not found.
NonArchimedeanMachineLearning.uct_score — Method
uct_score(node::DAGMCTSNode, parent_visits::Int, exploration_constant::Float64)

Compute the UCT score for a node.
UCT(s) = Q(s)/N(s) + c * √(ln(N(parent)) / N(s))
where Q(s) and N(s) are the global statistics from the shared node instance, and N(parent) is the visit count of the parent in the current traversal.
Arguments
- node: The child node being scored
- parent_visits: Visit count of the parent node (N(parent))
- exploration_constant: The exploration constant c
Returns
Float64: The UCT score (higher = should explore)
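A direct transcription of the formula on plain floats and ints. Returning Inf for unvisited children, so that they are always tried before any visited sibling, is a common MCTS convention assumed here rather than taken from the source:

```julia
# UCT(s) = Q(s)/N(s) + c * sqrt(ln(N(parent)) / N(s))
function uct_sketch(total_value::Float64, visits::Int,
                    parent_visits::Int, c::Float64)
    visits == 0 && return Inf  # assumed convention: unvisited first
    total_value / visits + c * sqrt(log(parent_visits) / visits)
end
```

Note that with parent_visits == 1 the exploration term vanishes (log(1) = 0), leaving pure exploitation.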
NonArchimedeanMachineLearning.verify_transposition_table — Method
verify_transposition_table(state::DAGMCTSState)

Verify the integrity of the transposition table.
Checks that:
- All nodes in the table have polydiscs equal to their keys (via isequal)
- All children of expanded nodes are in the table
- Parent relationships are consistent
Returns
Bool: true if table is consistent, false otherwise