
Agent Planning: ReAct vs CoT vs Tree of Thoughts (NCP-AAI)

Preporato Team · April 1, 2026 · 35 min read · NCP-AAI

Planning -- the ability to break down complex goals into executable steps -- is what separates advanced agentic AI systems from simple chatbots. The NVIDIA NCP-AAI certification heavily emphasizes planning strategies, as they determine an agent's capability to solve multi-step problems, reason about consequences, and optimize action sequences. This comprehensive guide covers every planning paradigm tested on the NCP-AAI exam: the three foundational reasoning strategies (Chain-of-Thought, ReAct, and Tree of Thoughts), five classical planning approaches (forward, backward, HTN, partial-order, and continual), advanced algorithms (A* and MCTS), NVIDIA NeMo Agent Toolkit planning modules, multi-agent planning patterns, and common exam traps you need to avoid.

Start Here

New to NCP-AAI? Start with our Complete NCP-AAI Certification Guide for exam overview, domains, and study paths. Then use our NCP-AAI Cheat Sheet for quick reference and How to Pass NCP-AAI for exam strategies.

Why Planning Matters for Agentic AI

Planning enables agents to:

  • Decompose complex tasks into manageable sub-tasks
  • Reason about action sequences before execution
  • Anticipate obstacles and plan contingencies
  • Optimize for goals (shortest path, lowest cost, highest success rate)
  • Handle ambiguous or underspecified requests
  • Allocate resources efficiently across parallel workstreams
  • Adapt dynamically when circumstances change

NCP-AAI Coverage:

  • Agent Design and Cognition domain (15%): Planning algorithms, reasoning patterns, goal management
  • Agent Development domain (15%): Implementing planning mechanisms, framework selection
  • Agent Architecture domain (15%): Choosing appropriate planning strategies, multi-agent coordination

The Planning Challenge

Without planning, agents exhibit three critical failure modes:

  • Myopic behavior: Short-sighted decisions without considering future consequences
  • Action thrashing: Inefficient trial-and-error without strategic thinking
  • Goal confusion: Losing track of the original objective in multi-step tasks

Example -- Flight Booking Without vs. With Planning:

Task: "Book a flight to Paris for next week"

Without Planning (Bad):
Agent: "What dates work for you?"
User: "Monday to Friday"
Agent: "Let me search flights..."
Agent: "Oh, I need your departure city. What city?"
User: "San Francisco"
Agent: "Searching... Oh, I need your budget. What's your budget?"
→ Inefficient, poor user experience (3 round trips)

With Planning (Good):
Agent: "To book your flight, I need:
  1. Departure city
  2. Travel dates
  3. Budget range
  4. Seating preference
Can you provide these details?"
→ Strategic, efficient information gathering (1 round trip)
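The "with planning" behavior above amounts to a slot-filling check: compute every missing parameter up front and ask for all of them in a single turn. A minimal sketch (the slot names and helper functions are illustrative, not tied to any real booking API):

```python
# Minimal slot-filling sketch: gather all missing parameters in one turn.
# The slot names below are illustrative, not from any real booking API.
REQUIRED_SLOTS = ["departure_city", "travel_dates", "budget_range", "seating_preference"]

def missing_slots(known: dict) -> list[str]:
    """Return every required slot the user has not yet provided."""
    return [slot for slot in REQUIRED_SLOTS if slot not in known]

def ask_for_missing(known: dict) -> str:
    """Build a single question covering all missing slots (one round trip)."""
    missing = missing_slots(known)
    if not missing:
        return "All details collected -- searching flights."
    items = "\n".join(f"  {i}. {slot.replace('_', ' ')}" for i, slot in enumerate(missing, 1))
    return f"To book your flight, I need:\n{items}\nCan you provide these details?"

print(ask_for_missing({"travel_dates": "Monday to Friday"}))
```

The same check before every user-facing turn is what turns three round trips into one.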

Core Planning Capabilities Tested on NCP-AAI

The exam tests your understanding of five core planning capabilities:

  • Task decomposition: Breaking complex requests into ordered subtasks
  • Multi-step orchestration: Sequencing actions with correct dependencies
  • Conditional branching: Adapting plans based on runtime conditions
  • Error recovery: Replanning when tasks fail (fallback strategies)
  • Goal optimization: Finding efficient paths to objectives under constraints


Chain-of-Thought (CoT) Prompting

Definition: Chain-of-Thought prompting elicits step-by-step reasoning from LLMs by showing examples or instructing the model to "think through" problems. Introduced by Wei et al. (2022) in "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," CoT demonstrated that prompting a 540B-parameter model (PaLM) with just eight chain-of-thought exemplars achieved state-of-the-art accuracy on the GSM8K benchmark of math word problems.

How CoT Works

CoT decomposes complex reasoning into explicit intermediate steps, making the model's thought process transparent and verifiable. Rather than jumping from question to answer, the model generates a reasoning trace that walks through each logical step.

Basic CoT (Few-Shot)

Provide examples of step-by-step reasoning to teach the model the format:

from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

few_shot_cot_template = """
Example 1:
Question: A bakery makes 48 cupcakes. If they pack 6 cupcakes per box, how many boxes do they need?
Reasoning:
- Total cupcakes: 48
- Cupcakes per box: 6
- Calculation: 48 / 6 = 8
Answer: 8 boxes

Example 2:
Question: If a car travels at 60 mph for 2.5 hours, how far does it travel?
Reasoning:
- Speed: 60 miles per hour
- Time: 2.5 hours
- Formula: Distance = Speed x Time
- Calculation: 60 x 2.5 = 150
Answer: 150 miles

Now solve this:
Question: {question}
Reasoning:
"""

llm = ChatOpenAI(model="gpt-4", temperature=0)
chain = PromptTemplate(input_variables=["question"], template=few_shot_cot_template) | llm

result = chain.invoke({
    "question": "If a store sells 15 items per hour and is open 8 hours per day, how many items are sold in a week?"
})

# Output:
# - Items per hour: 15
# - Hours per day: 8
# - Items per day: 15 x 8 = 120
# - Days per week: 7
# - Items per week: 120 x 7 = 840
# Answer: 840 items per week

Zero-Shot CoT

No examples needed -- just append "Let's think step by step" to the prompt. Kojima et al. (2022) showed in "Large Language Models are Zero-Shot Reasoners" that this simple addition dramatically improves reasoning performance without any exemplars:

zero_shot_cot_prompt = """
Question: {question}

Let's think step by step:
"""

# Results from Kojima et al. (2022) using text-davinci-002:
# MultiArith: 17.7% → 78.7% accuracy
# GSM8K:      10.4% → 40.7% accuracy
# Similar improvements observed with PaLM 540B

The versatility of this single prompt across diverse reasoning tasks -- arithmetic, symbolic, commonsense, and logical -- hints at untapped zero-shot cognitive capabilities in large language models.

Self-Consistency CoT

Generate multiple CoT reasoning paths and select the answer that appears most frequently (majority vote). This reduces the impact of any single flawed reasoning chain:

from collections import Counter

class SelfConsistencyCoT:
    def __init__(self, llm, num_samples=5):
        # Configure the LLM with temperature > 0 so the sampled
        # reasoning paths differ from one another
        self.llm = llm
        self.num_samples = num_samples

    def extract_answer(self, response):
        """Pull the final answer line out of a reasoning trace"""
        for line in reversed(response.splitlines()):
            if line.lower().startswith("answer:"):
                return line.split(":", 1)[1].strip()
        return response.strip().splitlines()[-1]

    def solve(self, question):
        answers = []
        for _ in range(self.num_samples):
            response = self.llm.predict(
                f"Question: {question}\nLet's think step by step:"
            )
            answers.append(self.extract_answer(response))

        # Majority vote across the sampled answers
        return Counter(answers).most_common(1)[0][0]

# Typically improves accuracy by 5-15% over single-pass CoT

CoT Strengths and Limitations

| Aspect | Strengths | Limitations |
|---|---|---|
| Use Cases | Math, logic puzzles, multi-step reasoning | Real-world actions, tool use |
| Transparency | Shows full reasoning process | Reasoning can be confidently wrong |
| Latency | Single LLM call (lowest cost) | Longer output = more tokens |
| Reliability | Deterministic reasoning path | Prone to compounding errors |
| Grounding | Internal knowledge only | Cannot access external information |

Key Concept

CoT is best for reasoning-heavy, action-light tasks (analysis, planning, explanation) but insufficient for action-heavy tasks (API calls, tool use, multi-step execution). On the NCP-AAI exam, if a scenario involves tool calls or external interactions, CoT alone is the wrong answer -- look for ReAct instead.

CoT Variants Summary

The NCP-AAI exam may present scenarios where you need to choose between CoT variants. Here is a quick reference:

| Variant | Description | When to Use | Cost |
|---|---|---|---|
| Few-Shot CoT | Provide 2-5 worked examples before the question | When you have good exemplars and need reliable formatting | 1 call (longer prompt) |
| Zero-Shot CoT | Append "Let's think step by step" with no examples | Quick reasoning boost without crafting examples | 1 call (short prompt) |
| Self-Consistency CoT | Generate k diverse reasoning paths, majority vote | When reliability matters more than cost | k calls (k=5-10 typical) |
| Manual CoT | Hand-craft optimal reasoning chains for specific domains | Domain-specific applications with known optimal reasoning | 1 call (curated prompt) |

Key Research Results to Remember:

  • Wei et al. (2022): Few-shot CoT with PaLM 540B achieved state-of-the-art on GSM8K math benchmarks, demonstrating that reasoning emerges in sufficiently large models with appropriate prompting.
  • Kojima et al. (2022): Zero-shot CoT ("Let's think step by step") improved MultiArith accuracy from 17.7% to 78.7% with InstructGPT, showing that no exemplars are needed to unlock reasoning capabilities.
  • Wang et al. (2022): Self-Consistency improved CoT accuracy by 5-15% across benchmarks by sampling multiple reasoning paths and taking the majority vote, addressing the fragility of single-chain reasoning.

When CoT Fails: Understanding Limitations

CoT has well-documented failure modes that the NCP-AAI exam may test:

  1. Faithful but wrong reasoning: The model can generate a logical-looking chain of reasoning that arrives at the wrong answer. The steps appear sound, but a subtle error early in the chain propagates forward.

  2. Overthinking simple problems: For straightforward factual lookups or pattern matching, CoT adds unnecessary tokens and latency without improving accuracy. Use DIRECT mode for these.

  3. No external grounding: CoT operates entirely on the model's internal knowledge. If the information is outdated, incomplete, or hallucinated, the entire reasoning chain is built on a faulty foundation. This is precisely why ReAct was developed.

  4. Length sensitivity: Very long reasoning chains (10+ steps) can lose coherence, with the model forgetting earlier constraints or introducing contradictions. For truly complex problems, hierarchical decomposition (HTN) may be more appropriate.
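Failure mode 2 suggests routing: send trivial lookups to a direct answer and reserve CoT for multi-step questions. The keyword heuristic below is a toy assumption for illustration; a production system would use a classifier or the model's own self-assessment.

```python
# Toy router: choose DIRECT for simple lookups, COT for multi-step reasoning.
# The keyword heuristic is an illustrative assumption, not a real classifier.
MULTI_STEP_MARKERS = ("how many", "calculate", "if ", "then", "per ", "total")

def choose_mode(question: str) -> str:
    q = question.lower()
    # Questions with arithmetic or conditional structure benefit from CoT
    if any(marker in q for marker in MULTI_STEP_MARKERS):
        return "COT"
    return "DIRECT"

print(choose_mode("What is the capital of France?"))      # DIRECT
print(choose_mode("How many items are sold in a week?"))  # COT
```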


ReAct: Reasoning + Acting

Definition: ReAct (Reasoning and Acting) interleaves reasoning traces with action execution, allowing agents to dynamically adjust plans based on environment feedback. Introduced by Yao et al. (2022) in "ReAct: Synergizing Reasoning and Acting in Language Models," the framework addresses a key limitation of CoT: while chain-of-thought uses only internal representations, ReAct grounds reasoning in real-world observations by interleaving thought with action.

Why ReAct Matters

The core insight of ReAct is that reasoning and acting are complementary:

  • Reasoning traces help the model induce, track, and update action plans, as well as handle exceptions
  • Actions allow the model to interface with external sources (knowledge bases, APIs, environments) to gather additional information

Yao et al. evaluated ReAct on four benchmarks -- HotPotQA, Fever, ALFWorld, and WebShop -- and found that ReAct outperforms vanilla action generation while being competitive with CoT. The best results came from combining ReAct with CoT, using both internal knowledge and externally obtained information.

The ReAct Loop

Thought: [Reasoning about what to do next]
Action: [Tool/function to execute]
Action Input: [Arguments for the tool]
Observation: [Result from executing the action]
... (repeat Thought/Action/Observation)
Thought: I now know the final answer
Final Answer: [Response to user]

ReAct Implementation

from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain_openai import ChatOpenAI

# Define tools
def search_web(query: str) -> str:
    """Search the web for information"""
    return f"Search results for '{query}': ..."

def calculator(expression: str) -> str:
    """Evaluate mathematical expressions (demo only -- eval is unsafe on untrusted input)"""
    try:
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

tools = [
    Tool(name="Search", func=search_web, description="Search the web for current information"),
    Tool(name="Calculator", func=calculator, description="Perform mathematical calculations")
]

# ReAct prompt template
react_prompt = """
Answer the following question as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: {agent_scratchpad}
"""

# Create agent (create_react_agent expects a PromptTemplate, not a raw string)
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_react_agent(llm, tools, PromptTemplate.from_template(react_prompt))
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=10)

# Execute
result = executor.invoke({"input": "What is the population of Tokyo multiplied by 2?"})

Execution Trace:

Thought: I need to find Tokyo's population, then multiply by 2
Action: Search
Action Input: "Tokyo population 2024"
Observation: Tokyo's population is approximately 14 million

Thought: Now I need to multiply 14 million by 2
Action: Calculator
Action Input: "14000000 * 2"
Observation: 28000000

Thought: I now know the final answer
Final Answer: The population of Tokyo (14 million) multiplied by 2 is 28 million.
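Under the hood, frameworks implement this loop as parse, execute, append. A stripped-down sketch of that control flow, with a scripted stand-in for the LLM (the tool behaviors and scripted responses are illustrative assumptions, not real API calls):

```python
import re

# Minimal ReAct control loop with a scripted stand-in for the LLM.
# Tool behaviors and the scripted responses are illustrative assumptions.
TOOLS = {
    "Search": lambda q: "Tokyo's population is approximately 14 million",
    "Calculator": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

SCRIPTED_LLM = iter([
    "Thought: I need Tokyo's population first\nAction: Search\nAction Input: Tokyo population",
    "Thought: Now multiply by 2\nAction: Calculator\nAction Input: 14000000 * 2",
    "Thought: I now know the final answer\nFinal Answer: 28 million",
])

def react_loop(question: str, max_iterations: int = 10) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iterations):
        step = next(SCRIPTED_LLM)          # a real agent would call the LLM here
        scratchpad += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        action = re.search(r"Action: (\w+)", step).group(1)
        action_input = re.search(r"Action Input: (.+)", step).group(1)
        observation = TOOLS[action](action_input)   # execute the chosen tool
        scratchpad += f"Observation: {observation}\n"
    return "Stopped: max iterations reached"

answer = react_loop("What is the population of Tokyo multiplied by 2?")
print(answer)
```

The scratchpad accumulates the full Thought/Action/Observation history, which is exactly what `{agent_scratchpad}` carries in the LangChain prompt above.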

ReAct Variants

1. ReAct with Self-Correction

The agent detects failures and adjusts its approach:

react_selfcorrect_prompt = """
... (standard ReAct format)

If an action fails or returns unexpected results, reconsider your approach:
Thought: That didn't work as expected. Let me try a different approach.
Action: [Alternative action]
...
"""

# Example execution trace:
# Thought: I'll search for "Tokyo population"
# Action: Search
# Action Input: "Tokyo population"
# Observation: Error: Too many results, be more specific
#
# Thought: That didn't work. Let me be more specific with the year.
# Action: Search
# Action Input: "Tokyo population 2024 census"
# Observation: Tokyo's population is approximately 14 million
# → Self-correction leads to success

2. ReAct with Reflection

The agent evaluates its own reasoning quality after task completion:

react_reflection_prompt = """
... (standard ReAct)

After completing the task, reflect:
Reflection: [Evaluate the quality of your reasoning and actions]
Improvements: [What could be done better next time]
"""

# Example:
# Final Answer: 28 million
#
# Reflection: My approach was effective -- I systematically gathered
# information and performed calculations. The search query could
# have been more precise initially.
#
# Improvements: Include the year in initial search queries to
# avoid ambiguity. Consider verifying with a second source.

3. ReAct + CoT Hybrid

Combine internal CoT reasoning with external ReAct actions for best results. In the original paper, Yao et al. found that combining ReAct with CoT outperformed either approach alone on HotPotQA and Fever benchmarks. The hybrid approach works by first attempting CoT internal reasoning, then switching to ReAct when the model detects that its internal knowledge is insufficient or outdated:

import re

class ReActCoTHybrid:
    def __init__(self, llm, react_agent):
        self.llm = llm
        self.react_agent = react_agent  # a ReAct executor like the one built above

    def extract_confidence(self, response: str) -> int:
        """Parse the self-reported 1-10 confidence score (default to low)"""
        match = re.search(r"confidence[^0-9]*(\d+)", response, re.IGNORECASE)
        return int(match.group(1)) if match else 1

    def solve(self, question):
        # Step 1: Attempt CoT internal reasoning
        cot_response = self.llm.predict(f"""
        Question: {question}

        First, reason about what you already know (internal knowledge).
        Rate your confidence in your knowledge on a scale of 1-10.

        Internal reasoning:
        """)

        confidence = self.extract_confidence(cot_response)

        if confidence >= 8:
            # High confidence: use CoT answer directly (saves tool calls)
            return self.llm.predict(f"""
            Based on this reasoning: {cot_response}
            Provide the final answer:
            """)
        else:
            # Low confidence: switch to ReAct for external verification
            return self.react_agent.run(f"""
            Question: {question}
            Initial reasoning (may be incomplete): {cot_response}
            Verify and complete this answer using available tools.
            """)

# This pattern:
# - Saves tool calls when internal knowledge is sufficient
# - Falls back to grounded actions when knowledge gaps exist
# - Combines the speed of CoT with the accuracy of ReAct

This hybrid pattern is particularly valuable in production because it reduces API costs. Many agent queries can be answered with internal knowledge alone (saving tool call latency and cost), while complex or factual queries automatically escalate to tool-augmented reasoning. On the NCP-AAI exam, questions about optimizing agent costs while maintaining accuracy often point to this hybrid approach.

ReAct Advantages for NCP-AAI

  1. Grounded in reality: Actions provide real feedback, preventing hallucinations
  2. Transparent reasoning: Thought traces are interpretable and debuggable
  3. Dynamic adaptation: Can adjust strategy based on observations
  4. Tool integration: Natural fit for function calling and API interactions
  5. Explicit and traceable: Unlike black-box planning, every decision is logged

NCP-AAI Exam Focus: ReAct is the default planning strategy for production agents due to its balance of reasoning and action. It is the most heavily tested planning framework on the exam.

Exam Trap

A common NCP-AAI mistake is assuming ReAct is always the best choice. While ReAct is the default for production agents, it has key limitations: linear planning (no alternative exploration), error accumulation (early mistakes compound), and high token costs from verbose output. Know when to combine ReAct with other strategies or when ToT or HTN is the better fit.

ReAct Limitations

| Challenge | Impact | Mitigation |
|---|---|---|
| Verbose output | High token costs | Use cheaper models for reasoning steps |
| Linear planning | Does not explore alternatives | Combine with ToT for branching |
| Error accumulation | Early mistakes compound | Add reflection/self-correction |
| Max iterations | Can timeout on complex tasks | Set appropriate limits, add fallbacks |
| Single path | Commits to first viable approach | Use ToT when comparison is needed |

Tree of Thoughts (ToT): Exploring Multiple Reasoning Paths

Definition: Tree of Thoughts generates multiple reasoning paths (branches), evaluates them, and selects the most promising direction -- enabling search-based planning. Introduced by Yao et al. (2023) in "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (NeurIPS 2023), ToT generalizes CoT by allowing LMs to explore coherent units of text ("thoughts") as intermediate problem-solving steps, with the ability to look ahead, evaluate, and backtrack.

Key Result

On the Game of 24 benchmark, GPT-4 with standard CoT prompting solved only 4% of tasks, while ToT achieved a 74% success rate -- a dramatic improvement demonstrating the power of deliberate exploration over linear reasoning.

ToT Concepts

1. Thought Decomposition -- Break the problem into intermediate steps (thoughts), where each thought is a coherent unit of reasoning.

2. Thought Generation -- Generate multiple candidate thoughts at each step using the LLM.

3. State Evaluation -- Evaluate how promising each thought path is (how likely it leads to a correct solution). The LLM itself can serve as the evaluator, scoring each path on a numeric scale.

4. Search Algorithm -- Navigate the tree of possibilities:

  • Breadth-First Search (BFS): Explore all options at each level before going deeper
  • Depth-First Search (DFS): Explore one path deeply before backtracking
  • Best-First Search: Prioritize the most promising paths using evaluation scores
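Best-first search can be sketched generically with a priority queue; the evaluator and expander below are deterministic stand-ins (illustrative assumptions) for the LLM scoring and thought-generation calls:

```python
import heapq

# Best-first search sketch over a thought tree. evaluate() and expand()
# are deterministic stand-ins for the LLM evaluator and thought generator.
def evaluate(path: tuple) -> float:
    # Toy heuristic: longer coherent paths score higher (demo assumption)
    return float(len(path))

def expand(path: tuple) -> list:
    # Toy generator: each state branches into two candidate thoughts
    if len(path) >= 3:          # max depth reached
        return []
    return [path + (f"thought-{len(path)}-{i}",) for i in range(2)]

def best_first_search(root: tuple = ()) -> tuple:
    # heapq is a min-heap, so push negated scores to pop the best path first
    frontier = [(-evaluate(root), root)]
    best = root
    while frontier:
        neg_score, path = heapq.heappop(frontier)
        if -neg_score > evaluate(best):
            best = path
        for child in expand(path):
            heapq.heappush(frontier, (-evaluate(child), child))
    return best

result = best_first_search()
print(len(result))  # depth of the best path found
```

The priority queue is what distinguishes best-first from BFS (a plain FIFO queue) and DFS (a stack): the most promising partial path is always expanded next.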

ToT Implementation

from langchain_openai import ChatOpenAI

class TreeOfThoughts:
    def __init__(self, llm, max_depth=3, branching_factor=3):
        self.llm = llm
        self.max_depth = max_depth
        self.branching_factor = branching_factor

    def generate_thoughts(self, problem, current_state, depth):
        """Generate multiple candidate next thoughts"""
        prompt = f"""
        Problem: {problem}
        Current reasoning: {current_state}

        Generate {self.branching_factor} different possible next steps
        in solving this problem. Format each as a numbered option:
        """
        response = self.llm.predict(prompt)
        return self._parse_thoughts(response)

    def _parse_thoughts(self, response):
        """Split the numbered options ("1. ...", "2. ...") into a list"""
        import re
        parts = re.split(r"\n\s*\d+[.)]\s*", "\n" + response)
        return [p.strip() for p in parts if p.strip()]

    def evaluate_thought(self, problem, thought_sequence):
        """Evaluate how promising a thought sequence is (0-10)"""
        prompt = f"""
        Problem: {problem}
        Reasoning so far: {thought_sequence}

        On a scale of 0-10, how likely is this reasoning path
        to lead to a correct solution?
        Consider:
        - Logical coherence
        - Progress toward the goal
        - Avoiding dead ends

        Score (0-10):
        """
        response = self.llm.predict(prompt)
        # Parse the first number in the response; LLM output is not
        # guaranteed to be a bare numeral
        import re
        match = re.search(r"\d+(?:\.\d+)?", response)
        return float(match.group()) if match else 0.0

    def _breadth_first_search(self, problem):
        """Explore all paths level by level"""
        queue = [("", 0)]  # (thought_sequence, depth)
        best_path = None
        best_score = -1

        while queue:
            current_state, depth = queue.pop(0)

            if depth >= self.max_depth:
                score = self.evaluate_thought(problem, current_state)
                if score > best_score:
                    best_score = score
                    best_path = current_state
                continue

            thoughts = self.generate_thoughts(problem, current_state, depth)
            for thought in thoughts:
                new_state = current_state + "\n" + thought
                queue.append((new_state, depth + 1))

        return best_path, best_score

    def _depth_first_search(self, problem, current_state="", depth=0, threshold=3):
        """Explore one path deeply, backtrack if score is too low"""
        if depth >= self.max_depth:
            score = self.evaluate_thought(problem, current_state)
            return current_state, score

        thoughts = self.generate_thoughts(problem, current_state, depth)
        best_path = None
        best_score = -1

        for thought in thoughts:
            new_state = current_state + "\n" + thought
            # Prune: skip paths that score below threshold
            mid_score = self.evaluate_thought(problem, new_state)
            if mid_score < threshold:
                continue  # Backtrack

            path, score = self._depth_first_search(
                problem, new_state, depth + 1, threshold
            )
            if score > best_score:
                best_score = score
                best_path = path

        return best_path, best_score

    def solve(self, problem, algorithm="bfs"):
        """Solve problem using Tree of Thoughts"""
        if algorithm == "bfs":
            best_path, score = self._breadth_first_search(problem)
        else:
            best_path, score = self._depth_first_search(problem)

        # Generate final answer from best path
        final_prompt = f"""
        Problem: {problem}
        Best reasoning path found:
        {best_path}

        Based on this reasoning, provide the final answer:
        """
        answer = self.llm.predict(final_prompt)
        return answer, best_path, score

# Usage
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
tot = TreeOfThoughts(llm, max_depth=3, branching_factor=3)

problem = "Design a microservices architecture for an e-commerce platform."
answer, reasoning, score = tot.solve(problem)

ToT Execution Example

Problem: "Plan a 3-day trip to New York City on a $1000 budget"

Root: "Plan NYC trip, 3 days, $1000"
│
├── Branch 1: "Focus on free attractions (museums, parks, walking tours)"
│   ├── Branch 1.1: "Budget hotel ($100/night), subway pass"
│   │   └── Branch 1.1.1: "Day 1: Central Park + Met, Day 2: Brooklyn
│   │       Bridge + 9/11 Memorial, Day 3: High Line + Chelsea Market"
│   │       [Score: 8/10]
│   ├── Branch 1.2: "Hostel ($50/night), more budget for activities"
│   └── Branch 1.3: "Airbnb in Queens ($80/night), authentic experience"
│
├── Branch 2: "Prioritize iconic paid attractions"
│   ├── Branch 2.1: "Buy CityPass ($140), budget accommodation"
│   │   └── Branch 2.1.1: "Day 1: ESB + Top of Rock, Day 2:
│   │       Statue of Liberty..." [Score: 7/10]
│   └── Branch 2.2: "Focus on 2-3 major attractions, skip tourist traps"
│
└── Branch 3: "Cultural experience (Broadway, dining, neighborhoods)"
    ├── Branch 3.1: "One Broadway show ($150), street food"
    │   └── Branch 3.1.1: "Day 1: Broadway + Times Square, Day 2:
    │       Chinatown + Little Italy..." [Score: 6/10]
    └── Branch 3.2: "Skip Broadway, focus on food tours"

Best Path Selected: Branch 1.1.1 (Score: 8/10)
Reasoning: Maximizes experiences within budget by focusing on free
attractions while maintaining comfort with a budget hotel.

ToT Cost Analysis

The computational cost of ToT grows exponentially with depth and branching factor: BFS explores roughly b^d leaf states for branching factor b and depth d, each requiring its own LLM calls.
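As a rough count (assuming one generation call per expanded node and one evaluation call per leaf, matching the BFS implementation above), even the default settings of b=3, d=3 imply dozens of LLM calls:

```python
# Rough LLM-call count for ToT with breadth-first search. Assumptions:
# one generation call per expanded node, one evaluation call per leaf.
def tot_bfs_calls(b: int, d: int) -> int:
    generation = sum(b ** i for i in range(d))   # nodes expanded at depths 0..d-1
    evaluation = b ** d                           # every leaf is scored once
    return generation + evaluation

print(tot_bfs_calls(3, 3))   # 1 + 3 + 9 generations + 27 evaluations = 40
print(tot_bfs_calls(4, 4))   # grows exponentially with b and d
```

This is why DFS with pruning (skipping branches that score below a threshold) is the practical default for deeper trees.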

ToT Advantages and Limitations

Advantages:

  1. Explores alternatives: Considers multiple strategies before committing
  2. Handles complexity: Effective for problems with many valid approaches
  3. Avoids local optima: Can backtrack from dead ends
  4. Self-evaluation: Explicitly assesses reasoning quality at each step

Limitations:

| Challenge | Impact | Solution |
|---|---|---|
| High cost | Many LLM calls (exponential) | Prune low-scoring branches early |
| Latency | Slow for real-time applications | Use for planning phase only |
| Evaluation difficulty | Hard to score thought quality | Train value model or use heuristics |
| Overkill for simple tasks | Unnecessary complexity | Use CoT or ReAct when one path suffices |

When to Use ReAct vs. CoT vs. ToT: Decision Framework

Choosing the right planning strategy is one of the most frequently tested skills on the NCP-AAI exam. Use this decision framework:

Strategy Selection Decision Tree

Does the task require external tool calls or API interactions?
├── YES → Does it require comparing multiple approaches?
│         ├── YES → ReAct + ToT hybrid
│         └── NO  → ReAct (default for production agents)
└── NO  → Is there a single correct answer (math, logic)?
          ├── YES → Chain-of-Thought
          └── NO  → Are there multiple valid approaches to evaluate?
                    ├── YES → Tree of Thoughts
                    └── NO  → Chain-of-Thought
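The decision tree maps directly onto a small routing function. A sketch (the boolean flags are simplifications; a real system would classify the incoming task rather than receive flags):

```python
# The decision tree above as a routing function. The boolean inputs are
# simplifications of what would be a task classifier in practice.
def select_strategy(needs_tools: bool, compare_approaches: bool,
                    single_correct_answer: bool) -> str:
    if needs_tools:
        # Tool use required: ReAct, optionally with ToT for branching
        return "ReAct + ToT hybrid" if compare_approaches else "ReAct"
    if single_correct_answer:
        return "Chain-of-Thought"
    return "Tree of Thoughts" if compare_approaches else "Chain-of-Thought"

print(select_strategy(needs_tools=True, compare_approaches=False,
                      single_correct_answer=False))   # ReAct
print(select_strategy(needs_tools=False, compare_approaches=True,
                      single_correct_answer=False))   # Tree of Thoughts
```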

Quick Reference Matrix

Planning Strategy Selection Guide

| Scenario | Best Strategy | Why | Exam Frequency |
|---|---|---|---|
| Customer service resolving tickets with API calls | ReAct | Requires tool use + adaptive reasoning | Very High |
| Solving a math word problem | Chain-of-Thought | Pure reasoning, no actions needed | High |
| Designing system architecture (multiple valid approaches) | Tree of Thoughts | Must explore and compare alternatives | Medium |
| Flight booking with search + comparison + booking | ReAct | Sequential tool calls with reasoning | Very High |
| Sudoku or constraint-satisfaction puzzle | Tree of Thoughts | Requires branching and backtracking | Medium |
| Code review and explanation | Chain-of-Thought | Reasoning-only, no external actions | Medium |
| Multi-hop question answering with knowledge retrieval | ReAct | Needs external knowledge + reasoning | High |
| Strategic business planning with trade-offs | Tree of Thoughts | Multiple valid strategies to evaluate | Low |

Cost and Latency Comparison

Planning Strategies: Performance Comparison

| Strategy | LLM Calls | Token Cost | Latency | Transparency | NCP-AAI Weight |
|---|---|---|---|---|---|
| Chain-of-Thought | 1 call | Low | Low (seconds) | High (full trace) | Medium |
| Self-Consistency CoT | k calls (k=5-10) | Medium | Medium | High | Low |
| ReAct | N calls (1 per step) | Medium | Medium (seconds-minutes) | High (thought + action) | Very High |
| Tree of Thoughts (BFS) | b^d calls (exponential) | High | High (minutes) | Medium (best path shown) | Medium |
| Tree of Thoughts (DFS) | b*d calls (with pruning) | Medium-High | Medium-High | Medium | Medium |

Exam Trap

When the exam presents a scenario involving tool calls or external API interactions, Chain-of-Thought alone is always the wrong answer. CoT only performs reasoning without actions. If the scenario requires executing searches, database queries, or API calls, the correct answer involves ReAct or a planning framework that supports action execution. This is one of the most common traps on the NCP-AAI.


Five Classical Planning Approaches

Beyond the three LLM-native strategies above, the NCP-AAI exam tests your knowledge of classical AI planning approaches that provide the theoretical foundation for agent planning systems.

1. Forward Planning (Progression)

Definition: Start from the current state and explore actions forward until the goal state is reached.

Current State → Action 1 → State 2 → Action 2 → ... → Goal State

Implementation:

class ForwardPlanner:
    def plan(self, start, goal):
        # Sketch: select_best_action and apply_action are domain-specific hooks
        state = start
        plan = []

        while state != goal:
            # Find action that moves toward goal
            action = self.select_best_action(state, goal)
            plan.append(action)
            state = self.apply_action(state, action)

        return plan

# Example
planner = ForwardPlanner()
plan = planner.plan(start="home", goal="office")
# Result: ["walk_to_car", "drive_to_office", "park_car", "enter_building"]

Advantages: Intuitive, easy to implement, works well when the action space is small.
Disadvantages: Can be inefficient for large search spaces; explores many irrelevant actions.

2. Backward Planning (Regression)

Definition: Start from the goal state and work backward to determine the required preconditions at each step.

Goal State ← Action N ← State N-1 ← ... ← Current State

Implementation:

class BackwardPlanner:
    def plan(self, current_state, goal):
        # Sketch: find_producer_action and get_preconditions are domain-specific hooks
        required_states = [goal]
        plan = []

        while required_states[-1] != current_state:
            needed_state = required_states[-1]
            action = self.find_producer_action(needed_state)
            plan.insert(0, action)
            preconditions = self.get_preconditions(action)
            required_states.append(preconditions)

        return plan

# Example: Software deployment
plan = backward_planner.plan(
    current_state="code_written",
    goal="app_deployed"
)
# Result: ["run_tests", "build_docker_image", "push_to_registry", "deploy_to_k8s"]

When to Use: When the goal has fewer achievable states than the start state, or when preconditions are well-defined. Also effective for multi-hop question answering where you decompose from the final question backward to sub-questions.

Exam Application -- Multi-Hop QA Example:

Question: "What is the capital of the country where the 2024 Olympics were held?"

Backward decomposition:
  Goal: Capital of country X
  ← Requires: Country X where 2024 Olympics were held
    ← Requires: Location of 2024 Olympics → Paris
      ← Derives: Country → France
        ← Derives: Capital of France → Paris

Forward execution of decomposed plan:
  Step 1: Search "2024 Olympics location" → Paris
  Step 2: Search "What country is Paris in?" → France
  Step 3: Search "Capital of France" → Paris
  Answer: Paris

This backward-then-forward pattern -- decompose the problem backward from the goal, then execute forward -- is a common NCP-AAI exam pattern. The question often asks which reasoning technique is demonstrated (backward chaining) or which execution framework supports it (ReAct for the forward execution with tool calls).
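The pattern can be sketched as two phases: decompose sub-questions backward from the goal, then execute them in reverse (forward) order. The hard-coded decomposition and lookup table below are stand-ins for what would be LLM decomposition and real search tool calls:

```python
# Backward-then-forward sketch for multi-hop QA. The decomposition list and
# the lookup table are stand-ins for LLM decomposition and search tools.
LOOKUP = {
    "2024 Olympics location": "Paris",
    "What country is Paris in?": "France",
    "Capital of France": "Paris",
}

def backward_decompose(goal: str) -> list[str]:
    # Goal-first list built by backward chaining; a real agent would
    # derive this with the LLM rather than hard-code it
    return ["Capital of France", "What country is Paris in?", "2024 Olympics location"]

def solve(goal: str) -> str:
    sub_questions = backward_decompose(goal)
    answer = None
    # Forward execution: reverse of the backward decomposition order
    for question in reversed(sub_questions):
        answer = LOOKUP[question]          # stands in for a Search tool call
    return answer

print(solve("Capital of the country where the 2024 Olympics were held"))
```

In a ReAct agent, each lookup in the forward pass would be one Thought/Action/Observation cycle.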

3. Hierarchical Task Network (HTN) Planning

Definition: Decompose high-level abstract tasks into primitive actions using predefined decomposition methods. HTN is the most common planning approach in production agentic systems and one of the most frequently tested topics on the NCP-AAI exam.

The SHOP2 algorithm (Nau et al., 2003) is the canonical HTN planner, notable for supporting partially ordered subtask decomposition -- meaning subtasks within a method do not all need a fixed execution order.

Structure:

┌─────────────────────────────────────┐
│   High-Level Goal (Abstract Task)   │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        v             v
    Subtask 1     Subtask 2
        │             │
     ┌──┴──┐       ┌──┴──┐
     v     v       v     v
   Action Action Action Action
   (Primitive -- directly executable)

Implementation:

class HTNPlanner:
    def __init__(self):
        self.methods = {
            "book_travel": [
                ["book_flight", "book_hotel", "rent_car"],      # Method 1
                ["book_train", "book_hotel"]                     # Method 2 (alternative)
            ],
            "book_flight": [
                ["search_flights", "select_flight", "complete_payment"]
            ],
            "book_hotel": [
                ["search_hotels", "reserve_room", "confirm_booking"]
            ]
        }
        self.primitive_actions = {
            "search_flights", "select_flight", "complete_payment",
            "search_hotels", "reserve_room", "confirm_booking",
            "rent_car_online", "book_train"
        }

    def decompose(self, task):
        """Recursively decompose task into primitive actions"""
        if task in self.primitive_actions:
            return [task]

        for method in self.methods.get(task, []):
            plan = []
            valid = True
            for subtask in method:
                sub_plan = self.decompose(subtask)
                if sub_plan is None:
                    valid = False
                    break
                plan.extend(sub_plan)
            if valid and self.is_valid_plan(plan):  # Hook: verify ordering/state constraints
                return plan

        return None  # No valid decomposition found

# Usage
planner = HTNPlanner()
plan = planner.decompose("book_travel")
# Result: ["search_flights", "select_flight", "complete_payment",
#          "search_hotels", "reserve_room", "confirm_booking", "rent_car_online"]

Key Concept

HTN planning is the most common planning approach for production agentic systems. It decomposes abstract goals into concrete primitive actions using predefined methods, making it ideal for well-structured domains like trip planning, software deployment, business workflows, and customer service escalation. On the NCP-AAI exam, if a question describes a complex workflow with clear hierarchical structure, HTN is likely the correct answer.

Task Decomposition Matrix

Use this matrix to determine the right decomposition strategy:

| Task Characteristic | Decomposition Approach | Example |
|---|---|---|
| Fixed sequence of steps | Sequential HTN | Software deployment pipeline |
| Steps can run in parallel | Partial-order HTN (SHOP2) | Cooking: boil water + chop vegetables simultaneously |
| Multiple valid methods | HTN with method selection | Travel: fly vs. train vs. drive |
| Dynamic environment | HTN + continual replanning | Warehouse robot navigation |
| Unknown structure | LLM-based decomposition | Novel user requests |

4. Partial-Order Planning (POP)

Definition: Plan actions without committing to a specific execution order until necessary. Actions are only ordered when one depends on the output of another, enabling parallelism.

class PartialOrderPlanner:
    def __init__(self):
        self.actions = []
        self.orderings = []     # List of (before, after) constraints
        self.causal_links = []  # (producer, condition, consumer)

    def add_action(self, action):
        self.actions.append(action)

    def add_ordering(self, before, after):
        """Enforce: 'before' must execute before 'after'"""
        self.orderings.append((before, after))

    def get_parallel_groups(self):
        """Group actions into levels; actions within a level can run in parallel"""
        remaining = set(self.actions)
        groups = []
        while remaining:
            # Ready = actions whose predecessors have all been scheduled
            ready = [a for a in remaining
                     if all(before not in remaining
                            for before, after in self.orderings if after == a)]
            if not ready:
                raise ValueError("Cycle detected in ordering constraints")
            groups.append(sorted(ready))
            remaining -= set(ready)
        return groups

# Example: Dinner preparation
planner = PartialOrderPlanner()
planner.add_action("chop_vegetables")
planner.add_action("boil_water")
planner.add_action("cook_pasta")
planner.add_action("make_sauce")
planner.add_action("serve")

# Only add necessary ordering constraints
planner.add_ordering("boil_water", "cook_pasta")
planner.add_ordering("chop_vegetables", "make_sauce")
planner.add_ordering("cook_pasta", "serve")
planner.add_ordering("make_sauce", "serve")

# Parallel groups:
# Group 1 (parallel): ["boil_water", "chop_vegetables"]
# Group 2 (parallel): ["cook_pasta", "make_sauce"]
# Group 3: ["serve"]
# Total time: 3 sequential groups instead of 5 sequential actions

Advantages: Enables parallel execution, more flexible scheduling, reduces total execution time. Disadvantages: Complex constraint management, harder to debug than sequential plans.

Why POP Matters for Agentic AI: In multi-agent systems, partial-order planning directly maps to task parallelism. If two subtasks have no ordering constraint between them, they can be assigned to different agents and executed simultaneously. This is why POP is foundational to scalable agent architectures. On the NCP-AAI exam, any question about maximizing throughput or minimizing wall-clock execution time for independent subtasks points to partial-order planning.
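That mapping can be sketched with a thread pool standing in for a pool of agents; the groups below are the dinner example's parallel levels, and `run_action` is a hypothetical stand-in for dispatching work to an agent:

```python
from concurrent.futures import ThreadPoolExecutor

# Parallel groups from the dinner example (no ordering constraints within a group)
groups = [["boil_water", "chop_vegetables"],
          ["cook_pasta", "make_sauce"],
          ["serve"]]

def run_action(action):
    """Stand-in for dispatching an action to a worker agent."""
    return f"{action}: done"

def execute_groups(groups):
    """Run each level's actions concurrently; levels run in sequence."""
    results = []
    with ThreadPoolExecutor() as pool:
        for group in groups:
            # pool.map preserves submission order, so results stay deterministic
            results.extend(pool.map(run_action, group))
    return results

print(execute_groups(groups))
```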

Causal Links and Threat Resolution: In formal POP, a causal link (A --[p]--> B) means action A produces condition p that action B requires. A "threat" occurs when a third action C could undo condition p between A and B. The planner resolves threats by either promoting C before A or demoting C after B. While the NCP-AAI does not require formal proofs, understanding that POP must protect causal dependencies from interference is important for architecture questions about concurrent agent actions.

5. Continual Planning (Interleaved Planning and Execution)

Definition: Plan, execute, observe, replan -- a continuous cycle where planning happens during execution and adapts to real-world feedback. This is the closest classical analogue to the ReAct pattern.

Plan → Execute Step 1 → Observe → Replan → Execute Step 2 → Observe → ...

Implementation:

class ContinualPlanner:
    def execute_with_replanning(self, goal):
        plan = self.create_initial_plan(goal)
        execution_log = []

        while not self.goal_achieved(goal):
            if not plan:
                return "Failed: no valid plan found"

            # Execute next action
            action = plan.pop(0)
            result = self.execute_action(action)
            execution_log.append((action, result))

            # Check for unexpected outcomes
            if result.unexpected or result.failed:
                # Replan from current state with updated beliefs
                plan = self.replan(
                    current_state=self.get_current_state(),
                    goal=goal,
                    failed_action=action,
                    history=execution_log
                )

            # Update world model with new observations
            self.update_beliefs(result)

        return "Goal achieved"

When to Use: Dynamic environments where conditions change frequently (robotics, real-time systems, live data processing). On the NCP-AAI exam, if the scenario describes a changing environment, the answer is almost never a fully upfront planning approach.

Classical Approaches Comparison

Five Classical Planning Approaches

| Approach | Best For | Key Advantage | Key Limitation |
|---|---|---|---|
| Forward Planning | Simple, small search spaces | Intuitive implementation | Inefficient for large state spaces |
| Backward Planning | Goal-directed, well-defined preconditions | Efficient precondition analysis | Requires well-defined goal states |
| HTN Planning | Complex hierarchical workflows | Most common in production agentic AI | Requires predefined decomposition methods |
| Partial-Order (POP) | Parallelizable, independent subtasks | Enables parallel execution | Complex constraint management |
| Continual Planning | Dynamic, changing environments | Adapts to real-time conditions | Higher computational overhead |

Exam Trap

On the NCP-AAI exam, watch out for scenarios where over-planning is the trap answer. If the question describes a dynamic environment with frequent changes, the answer is almost never a fully upfront planning approach (like pure forward or backward planning). Look for continual planning or ReAct-based replanning strategies instead. Conversely, if the domain is well-structured with known decomposition rules, HTN is preferred over LLM-based planning.


Advanced Planning Algorithms

A* Planning (Optimal Pathfinding)

A* finds the lowest-cost path from a start state to a goal state using a heuristic function to guide the search. It guarantees an optimal solution when the heuristic is admissible (never overestimates the true cost).

Implementation:

import heapq
import itertools

class AStarPlanner:
    def plan(self, start, goal):
        counter = itertools.count()   # Tie-breaker so equal f-scores never compare states
        frontier = [(0, next(counter), start)]  # Priority queue: (f_score, tie, state)
        came_from = {start: None}
        g_score = {start: 0}          # Actual cost from start

        while frontier:
            _, _, current = heapq.heappop(frontier)

            if current == goal:
                return self.reconstruct_path(came_from, start, goal)

            for action, next_state in self.get_successors(current):
                new_g = g_score[current] + self.action_cost(action)

                if next_state not in g_score or new_g < g_score[next_state]:
                    g_score[next_state] = new_g
                    f_score = new_g + self.heuristic(next_state, goal)
                    heapq.heappush(frontier, (f_score, next(counter), next_state))
                    came_from[next_state] = (current, action)

        return None  # No path found

    def heuristic(self, state, goal):
        """Must be admissible: never overestimates true cost"""
        # Example: Manhattan distance for grid-based planning
        return abs(state.x - goal.x) + abs(state.y - goal.y)

A* Complexity:

  • Time: O(b^d) worst case, but typically much better with a good heuristic
  • Space: O(b^d) -- stores all expanded nodes
  • Optimality: Guaranteed when h(n) is admissible

Key Properties of A* for the NCP-AAI Exam:

| Property | Description | Exam Relevance |
|---|---|---|
| Completeness | A* will always find a solution if one exists (given finite branching) | Know that A* never gives up prematurely |
| Optimality | Guaranteed optimal when h(n) is admissible | Understand admissibility = never overestimates |
| Consistency | If h(n) <= cost(n, n') + h(n'), A* is optimally efficient | Stronger than admissibility; means no node is re-expanded |
| Space complexity | O(b^d) -- stores all expanded nodes in memory | This is A*'s primary limitation for large state spaces |

When A* is the Wrong Choice: A* requires an explicit state space with well-defined transitions. For open-ended planning problems where the state space is not enumerable (e.g., creative writing, open-ended research), LLM-based planning or ToT is more appropriate. On the NCP-AAI exam, if the scenario lacks a clear state-transition model, A* is likely a distractor answer.

Use Cases for NCP-AAI: Navigation, resource allocation with cost optimization, finding optimal action sequences in well-defined state spaces, robotic motion planning.
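A self-contained grid-navigation example makes the class above concrete; everything here (`astar_grid`, the grid size, the walls) is illustrative:

```python
import heapq

def astar_grid(start, goal, walls, size=5):
    """A* on a size x size grid; Manhattan distance is an admissible heuristic."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    # Frontier entries: (f_score, g_score, position, path_so_far)
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            nxt = (pos[0] + dx, pos[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in walls:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # Goal unreachable

path = astar_grid((0, 0), (2, 2), walls={(1, 0), (1, 1)})
print(len(path) - 1)  # → 4 moves (optimal, detouring around the walls)
```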

Monte Carlo Tree Search (MCTS)

MCTS builds a search tree incrementally through random simulations, balancing exploration of new paths against exploitation of known good paths. It is the core search algorithm behind AlphaGo and many game-playing agents.

Four Phases:

1. SELECTION      2. EXPANSION     3. SIMULATION     4. BACKPROPAGATION
   ┌─●─┐            ┌─●─┐           ┌─●─┐              ┌─●─┐
   │   │            │   │           │   │              │   │
   ●   ●            ●   ●           ●   ●              ●   ●
   │                │   │           │   │              │   │
   ●← (UCB1)       ●   ○←NEW      ●   ○              ●   ○←UPDATE
                                       │                  │
                                       ↓ random          ↑ reward
                                    (playout)         (propagate)

Implementation:

import math
import random

class MCTSNode:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action
        self.children = []
        self.visits = 0
        self.reward = 0.0
        self.untried_actions = state.get_legal_actions()

    def ucb1_score(self, exploration_param=1.414):
        """UCB1: balance exploitation vs exploration"""
        if self.visits == 0:
            return float('inf')
        exploitation = self.reward / self.visits
        exploration = exploration_param * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )
        return exploitation + exploration

    def select_child(self):
        """Select child with highest UCB1 score"""
        return max(self.children, key=lambda c: c.ucb1_score())

    def is_fully_expanded(self):
        return len(self.untried_actions) == 0

class MCTSPlanner:
    def __init__(self, n_iterations=1000):
        self.n_iterations = n_iterations

    def plan(self, root_state):
        root = MCTSNode(root_state)

        for _ in range(self.n_iterations):
            node = root
            state = root_state.clone()

            # Phase 1: Selection (traverse tree using UCB1)
            while node.is_fully_expanded() and node.children:
                node = node.select_child()
                state.apply_action(node.action)

            # Phase 2: Expansion (add one new child)
            if node.untried_actions:
                action = random.choice(node.untried_actions)
                node.untried_actions.remove(action)
                state.apply_action(action)
                child = MCTSNode(state, parent=node, action=action)
                node.children.append(child)
                node = child

            # Phase 3: Simulation (random playout to terminal state)
            sim_state = state.clone()
            while not sim_state.is_terminal():
                action = random.choice(sim_state.get_legal_actions())
                sim_state.apply_action(action)
            reward = sim_state.get_reward()

            # Phase 4: Backpropagation (update statistics up the tree)
            while node is not None:
                node.visits += 1
                node.reward += reward
                node = node.parent

        # Return action of most-visited child (most robust choice)
        return max(root.children, key=lambda c: c.visits).action

Use Cases: Game playing, exploration tasks with uncertain outcomes, high-stakes decisions where simulation is possible, robotic planning.

LLM-Based Planning

Use LLMs directly to generate structured plans from natural language goals:

import json

def llm_plan(goal, context, llm):
    prompt = f"""
    You are a task planning AI. Break down the following goal
    into a step-by-step plan.

    Goal: {goal}
    Current context: {context}

    Provide a detailed plan in JSON format:
    {{
        "steps": [
            {{"id": 1, "action": "...", "reasoning": "...", "dependencies": []}},
            {{"id": 2, "action": "...", "reasoning": "...", "dependencies": [1]}}
        ]
    }}
    """
    response = llm.predict(prompt)
    # In production, validate the schema and retry if the LLM returns malformed JSON
    plan = json.loads(response)
    return plan["steps"]

# Example
plan = llm_plan(
    goal="Deploy a new microservice to production",
    context="Current env: staging, tests passing, Docker image built",
    llm=llm
)

Advantages: Natural language I/O, handles novel situations, minimal domain engineering. Disadvantages: Non-deterministic, can hallucinate invalid plans, expensive per call.
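Once the plan JSON is parsed, the `dependencies` fields can drive execution order. A minimal sketch with illustrative step data (Kahn-style level ordering):

```python
def execution_order(steps):
    """Order step ids so every step runs after all of its dependencies."""
    done, order = set(), []
    pending = list(steps)
    while pending:
        # Ready = steps whose dependencies have all completed
        ready = [s for s in pending if set(s["dependencies"]) <= done]
        if not ready:
            raise ValueError("Cyclic or unsatisfiable dependencies")
        for s in ready:
            order.append(s["id"])
            done.add(s["id"])
            pending.remove(s)
    return order

steps = [
    {"id": 1, "action": "build_image", "dependencies": []},
    {"id": 2, "action": "push_image", "dependencies": [1]},
    {"id": 3, "action": "run_migrations", "dependencies": []},
    {"id": 4, "action": "deploy", "dependencies": [2, 3]},
]
print(execution_order(steps))
# → [1, 3, 2, 4]
```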

Comparing Advanced Algorithms

| Algorithm | Type | Optimality | Cost | Best For |
|---|---|---|---|---|
| A* | Heuristic search | Optimal (with admissible h) | O(b^d) time and space | Known state spaces with cost optimization |
| MCTS | Simulation-based | Converges to optimal | O(n_iterations) per decision | Uncertain outcomes, game-like scenarios |
| LLM Planning | Generative | No guarantee | Per-token API cost | Novel situations, natural language goals |
| BFS/DFS | Uninformed search | Complete (BFS) / Not guaranteed (DFS) | O(b^d) / O(b*d) | Simple state spaces, proof of concept |

Integrating Planning with Memory

Planning does not happen in isolation. Production agents combine planning with memory systems to improve plan quality over time. This integration is an important NCP-AAI concept that bridges the Planning and Memory domains.

Short-Term Memory for Active Plans

During plan execution, the agent maintains a working memory of the current plan state, completed steps, pending steps, and intermediate results:

class PlanningWithMemory:
    def __init__(self, planner, executor, memory):
        self.planner = planner
        self.executor = executor
        self.memory = memory  # Working memory

    def execute_plan(self, goal):
        plan = self.planner.create_plan(goal)

        # Store plan in working memory
        self.memory.store("current_plan", plan)
        self.memory.store("completed_steps", [])
        self.memory.store("plan_status", "in_progress")

        # A while loop (not a for loop) so replanning can swap out the remaining steps
        while plan:
            # Check if plan needs revision based on new information
            if self.memory.get("needs_replan"):
                completed = self.memory.get("completed_steps")
                plan = self.planner.replan(goal, completed)
                self.memory.store("current_plan", plan)
                self.memory.store("needs_replan", False)

            step = plan.pop(0)
            result = self.executor.run(step)
            self.memory.append("completed_steps", (step, result))

            # Store observations for future planning
            self.memory.store_observation(step, result)

        self.memory.store("plan_status", "completed")

Long-Term Memory for Plan Reuse

Successful plans can be cached and retrieved for similar future tasks, dramatically reducing planning latency:

class PlanMemoryStore:
    def __init__(self, vector_store):
        self.vector_store = vector_store

    def store_successful_plan(self, goal, plan, outcome):
        """Store a successful plan for future retrieval"""
        embedding = self.embed(goal)
        self.vector_store.add(
            embedding=embedding,
            metadata={"goal": goal, "plan": plan, "outcome": outcome}
        )

    def retrieve_similar_plan(self, new_goal, threshold=0.85):
        """Find a previously successful plan for a similar goal"""
        embedding = self.embed(new_goal)
        results = self.vector_store.search(embedding, top_k=3)

        for result in results:
            if result.similarity >= threshold:
                return result.metadata["plan"]  # Reuse with adaptation

        return None  # No similar plan found; create new plan

This pattern -- plan memoization via semantic similarity -- is especially powerful for enterprise agents that handle recurring types of requests. On the NCP-AAI exam, questions about improving planning efficiency often have "plan caching" or "plan reuse" as the correct answer.
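A toy version of the similarity gate, with hand-made vectors standing in for real embeddings (the cache contents and threshold are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hand-made "embeddings" of stored goals mapped to their cached plans
plan_cache = {
    (1.0, 0.0, 0.2): ["search_flights", "select_flight", "complete_payment"],
}

def retrieve_similar_plan(query_vec, threshold=0.85):
    """Return the cached plan whose goal embedding is closest, if close enough."""
    best = max(plan_cache, key=lambda v: cosine(v, query_vec))
    return plan_cache[best] if cosine(best, query_vec) >= threshold else None

print(retrieve_similar_plan((0.9, 0.1, 0.2)))
# → ['search_flights', 'select_flight', 'complete_payment']
```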


Hierarchical Planning: Plan-and-Execute Pattern

The Plan-and-Execute pattern separates strategic planning (slow, thoughtful) from tactical execution (fast, reactive). A high-level planner creates the overall strategy, then a ReAct agent executes each step.

class HierarchicalPlanner:
    def __init__(self, planner_llm, executor_agent):
        self.planner = planner_llm
        self.executor = executor_agent

    def solve(self, goal):
        # Step 1: Generate high-level plan
        plan_prompt = f"""
        Goal: {goal}

        Create a step-by-step plan to achieve this goal.
        Each step should be a concrete, executable subgoal.

        Plan:
        """
        plan = self.planner.predict(plan_prompt)
        steps = self._parse_plan(plan)

        # Step 2: Execute each step with a ReAct agent
        # (index-based loop so replanning can replace the remaining steps)
        results = []
        i = 0
        while i < len(steps):
            step = steps[i]
            print(f"Executing Step {i+1}: {step}")
            try:
                result = self.executor.run(step)
                results.append({"step": step, "result": result, "status": "success"})
            except Exception as e:
                results.append({"step": step, "error": str(e), "status": "failed"})
                # Replan from current state
                remaining = self._replan(goal, results, steps[i+1:])
                steps = steps[:i+1] + remaining
            i += 1

        # Step 3: Synthesize final answer
        synthesis_prompt = f"""
        Goal: {goal}
        Execution Results: {results}

        Synthesize a final answer:
        """
        return self.planner.predict(synthesis_prompt)

Benefits:

  • Separates strategic planning from execution
  • Reduces cognitive load on the executor agent
  • Enables replanning if individual steps fail
  • Scales to complex multi-step workflows


Multi-Agent Planning

Centralized Planning

A single planner coordinates all agents, assigning sub-goals and resolving conflicts globally:

class CentralizedPlanner:
    def plan_for_agents(self, agents, global_goal):
        sub_goals = self.decompose_goal(global_goal, len(agents))
        plans = {}
        for agent, sub_goal in zip(agents, sub_goals):
            plans[agent.id] = self.create_plan(agent, sub_goal)
        return self.coordinate_plans(plans)  # Resolve conflicts

Pros: Globally optimal coordination, no conflicting actions. Cons: Single point of failure, does not scale to many agents.

Decentralized Planning

Each agent plans independently and coordinates through communication:

class DecentralizedAgent:
    def plan_and_coordinate(self, goal):
        my_plan = self.plan(goal)

        # Share intentions with neighbors
        for neighbor in self.neighbors:
            neighbor.receive_intention(self.id, my_plan)

        # Receive and incorporate neighbor plans
        neighbor_plans = self.receive_intentions()

        # Adjust plan to avoid conflicts
        return self.resolve_conflicts(my_plan, neighbor_plans)

Pros: Scalable, robust to individual agent failure. Cons: Suboptimal (no global view), requires communication protocols.
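A toy sketch of conflict resolution by agent priority; all names and the "lower id wins" rule are illustrative:

```python
def resolve_conflicts(my_id, my_plan, neighbor_plans):
    """Yield any resource a higher-priority neighbor also claims.
    Plans map resource -> intended action; lower agent id = higher priority."""
    resolved = dict(my_plan)
    for other_id, plan in neighbor_plans.items():
        for resource in plan:
            if resource in resolved and other_id < my_id:
                del resolved[resource]  # Concede; this sub-goal gets replanned
    return resolved

mine = {"dock_a": "load", "dock_b": "unload"}
neighbors = {1: {"dock_a": "inspect"}}   # Agent 1 also wants dock_a
print(resolve_conflicts(2, mine, neighbors))
# → {'dock_b': 'unload'}
```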

Hierarchical Multi-Level Planning

High-level planner assigns strategic goals; low-level planners handle tactical execution:

┌──────────────────────────┐
│   High-Level Planner     │  (Strategic goals)
└────────────┬─────────────┘
             │
      ┌──────┴──────┐
      v             v
  Mid-Level 1   Mid-Level 2     (Tactical plans)
      │             │
   ┌──┴──┐      ┌──┴──┐
   v     v      v     v
  Low   Low    Low   Low        (Primitive actions)

Use Case: Large-scale systems (factory automation, traffic management, enterprise workflows).


NVIDIA Platform Tools for Planning

NeMo Agent Toolkit Planning Modules

The NVIDIA NeMo Agent Toolkit provides built-in support for multiple planning strategies through configurable planning modules:

from nemo_agent import Agent, PlanningStrategy

agent = Agent(
    model="nvidia/llama-3-70b-nemo",
    planning_strategy=PlanningStrategy.REACT,
    max_planning_steps=10,
    planning_timeout=30  # seconds
)

Three Planning Strategies in NeMo Agent Toolkit:

| Strategy | Constant | Behavior | When to Use |
|---|---|---|---|
| ReAct | PlanningStrategy.REACT | Explicit reasoning + action loop with tool calls | Most production agents (default) |
| CoT | PlanningStrategy.COT | Step-by-step reasoning without action execution | Analysis, explanation, reasoning-only tasks |
| Direct | PlanningStrategy.DIRECT | No intermediate planning; single-pass response | Simple queries, low-latency requirements |

Key Concept

For the NCP-AAI exam, know the three NeMo Agent Toolkit planning strategies: REACT (reasoning + actions, most flexible), COT (reasoning only, no tool calls), and DIRECT (fastest, no planning overhead). The exam tests your ability to select the appropriate strategy based on task requirements.

NeMo Agent Toolkit ReAct Configuration

# ReAct agent with tools
from nemo_agent import Agent, Tool

search_tool = Tool(
    name="wikipedia_search",
    description="Search Wikipedia for factual information",
    endpoint="https://api.wikipedia.org/search"
)

agent = Agent(
    model="nvidia/llama-3-70b-nemo",
    planning_strategy=PlanningStrategy.REACT,
    tools=[search_tool],
    max_planning_steps=10,
    planning_timeout=30
)

result = agent.run("What year was the Eiffel Tower built and how tall is it?")
# Agent uses ReAct loop: Thought → Action (search) → Observation → Thought → Answer

NVIDIA AIQ Toolkit Agent Graph

Visualize and execute agent workflows as directed graphs with conditional edges:

agent_graph:
  nodes:
    - id: search
      type: tool_call
      tool: search_database
    - id: analyze
      type: llm_reasoning
      prompt: "Analyze search results"
    - id: respond
      type: tool_call
      tool: send_response
    - id: fallback
      type: llm_reasoning
      prompt: "Generate response from cached data"
  edges:
    - from: search
      to: analyze
      condition: "search.status == success"
    - from: search
      to: fallback
      condition: "search.status == failure"    # Error recovery
    - from: analyze
      to: respond
    - from: fallback
      to: respond

NCP-AAI Exam Focus: Understand conditional edges (branching logic based on node outputs) and how they enable error recovery through fallback paths.
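The branching behavior can be illustrated with a tiny graph interpreter. This is purely illustrative Python, not the AIQ Toolkit API; nodes are callables and each edge carries an optional condition on the node's status:

```python
def run_graph(nodes, edges, start):
    """Walk a directed graph, taking the first edge whose condition holds."""
    trace = [start]
    current = start
    while True:
        status = nodes[current]()               # Execute the node
        nxt = None
        for src, dst, cond in edges:
            if src == current and (cond is None or cond(status)):
                nxt = dst
                break
        if nxt is None:                         # No outgoing edge: done
            return trace
        trace.append(nxt)
        current = nxt

nodes = {
    "search":   lambda: "failure",              # Simulate a failed search
    "analyze":  lambda: "success",
    "fallback": lambda: "success",
    "respond":  lambda: "success",
}
edges = [
    ("search", "analyze", lambda s: s == "success"),
    ("search", "fallback", lambda s: s == "failure"),   # Error recovery path
    ("analyze", "respond", None),
    ("fallback", "respond", None),
]
print(run_graph(nodes, edges, "search"))
# → ['search', 'fallback', 'respond']
```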

LangChain PlanAndExecute Agent

from langchain.agents import PlanAndExecute, load_agent_executor, load_chat_planner

planner = load_chat_planner(llm)
executor = load_agent_executor(llm, tools, verbose=True)

agent = PlanAndExecute(planner=planner, executor=executor)
result = agent.run("Book a flight from SF to NYC and reserve a hotel")

LlamaIndex Workflow Engine

from llama_index.core import Workflow

workflow = Workflow()
workflow.add_step("research", research_agent)
workflow.add_step("plan", planning_agent)
workflow.add_step("execute", execution_agent)
workflow.add_dependency("research", "plan")
workflow.add_dependency("plan", "execute")

result = workflow.run(input="Analyze competitor products and recommend strategy")

Best Practices for Production Planning Systems

Eight Rules for Production Planning

  1. Set planning timeouts to prevent infinite search loops
  2. Cache common plans for repeated tasks (plan memoization)
  3. Implement plan validation before execution (check preconditions)
  4. Use hierarchical planning for complex multi-step tasks
  5. Enable replanning for dynamic environments (continual planning)
  6. Monitor plan execution and log decisions for debugging
  7. Balance planning time vs. execution quality -- do not over-plan simple tasks
  8. Add fallback strategies for when primary plans fail
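Rule 1 (planning timeouts) can be sketched with a worker thread; a minimal illustration — production code would also need to cancel or sandbox the abandoned planner:

```python
import threading

def call_with_timeout(fn, args=(), timeout=5.0):
    """Run fn(*args) in a thread; return None if it exceeds the timeout."""
    result = {}

    def worker():
        result["value"] = fn(*args)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        return None   # Planner took too long; caller should fall back
    return result.get("value")

print(call_with_timeout(lambda x: x * 2, args=(21,)))
# → 42
```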

Planning Performance Optimization

class OptimizedPlanner:
    def __init__(self):
        self.plan_cache = {}
        self.timeout = 5.0  # seconds
        self.max_retries = 3

    def plan_with_optimizations(self, goal):
        # 1. Check cache first
        cache_key = self._normalize_goal(goal)
        if cache_key in self.plan_cache:
            cached_plan = self.plan_cache[cache_key]
            if self.is_still_valid(cached_plan):
                return cached_plan

        # 2. Plan with timeout (create_plan is the underlying planner;
        #    _timeout_call aborts searches that run too long)
        plan = self._timeout_call(self.create_plan, args=(goal,), timeout=self.timeout)

        # 3. Validate before caching
        if plan and self.validate_plan(plan):
            self.plan_cache[cache_key] = plan

        return plan

    def validate_plan(self, plan):
        """Check preconditions and executability"""
        for step in plan:
            if not self.check_preconditions(step):
                return False
        return True

Common Planning Pitfalls

Avoid these mistakes both in production systems and on the NCP-AAI exam:

| Pitfall | Description | Solution |
|---|---|---|
| Over-planning | Spending too much time planning vs. executing | Set timeouts; use DIRECT mode for simple tasks |
| Ignoring uncertainty | Assuming the environment is static | Use continual planning or ReAct |
| No replanning | Failing to adapt when plans fail | Add fallback strategies and error recovery |
| Invalid preconditions | Assuming preconditions that do not hold | Validate preconditions before each step |
| Brittle plans | Plans that fail on minor deviations | Build in tolerance and alternative paths |
| Infinite loops | Circular dependencies or goal conflicts | Set max iterations and detect cycles |
| Wrong strategy | Using ToT for simple tasks or CoT for tool-heavy tasks | Apply the decision framework above |

NCP-AAI Exam Preparation

Key Planning Concepts by Domain

Agent Design and Cognition (15%):

  1. CoT prompting techniques (zero-shot, few-shot, self-consistency)
  2. ReAct loop structure and execution trace
  3. ToT search algorithms (BFS, DFS, best-first)
  4. Five classical planning approaches (forward, backward, HTN, POP, continual)
  5. Goal decomposition and task hierarchies

Agent Development (15%):

  1. Implementing ReAct agents in LangChain
  2. NeMo Agent Toolkit planning modules (REACT, COT, DIRECT)
  3. Custom planning loops and plan validation
  4. Integrating planning with memory and tools
  5. Handling planning failures, retries, and fallbacks

Agent Architecture (15%):

  1. Choosing the right planning strategy for the scenario
  2. Multi-agent planning (centralized vs. decentralized)
  3. Hierarchical plan-and-execute patterns
  4. A* and MCTS for optimization problems

Common NCP-AAI Exam Traps for Planning Questions

Exam Traps to Avoid

Trap 1: CoT for tool-heavy tasks. If the scenario involves API calls, database queries, or external tools, CoT alone is always wrong. CoT cannot execute actions.

Trap 2: ReAct is always best. ReAct is the default, but not always optimal. For pure reasoning (math, logic), CoT is cheaper and faster. For problems requiring exploration of alternatives, ToT is superior.

Trap 3: Over-planning in dynamic environments. If the question describes a changing environment, do not choose a fully upfront planning approach. Continual planning or ReAct with replanning is correct.

Trap 4: Ignoring HTN for structured workflows. When the domain has clear task hierarchies (e.g., "deploy software," "book travel"), HTN is the intended answer, not generic LLM planning.

Trap 5: Confusing POP with sequential planning. If the question asks about maximizing parallelism or minimizing total execution time, partial-order planning is the answer, not forward planning.



Error Recovery and Replanning Strategies

Error recovery is a critical planning capability tested on the NCP-AAI exam. When a plan step fails, the agent must decide between several recovery strategies:

Recovery Strategy Hierarchy

Plan Step Fails
│
├── 1. Retry (transient error?)
│     └── Same action, same parameters, up to N retries
│
├── 2. Alternative action (same goal, different method)
│     └── Try a different tool/API that achieves the same sub-goal
│
├── 3. Partial replan (adjust remaining steps)
│     └── Keep completed steps, replan from current state
│
├── 4. Full replan (start over with new strategy)
│     └── Discard current plan, create entirely new approach
│
└── 5. Graceful degradation (reduce scope)
      └── Achieve a subset of the original goal, inform user of limitations

Implementation Pattern

class ResilientPlanner:
    def execute_with_recovery(self, goal, plan):
        for i, step in enumerate(plan):
            success = False

            # Level 1: Retry (transient errors)
            for attempt in range(self.max_retries):
                result = self.execute_action(step)
                if result.success:
                    success = True
                    break

            if not success:
                # Level 2: Alternative action (same sub-goal, different tool)
                alternatives = self.get_alternative_actions(step)
                for alt_action in alternatives:
                    result = self.execute_action(alt_action)
                    if result.success:
                        success = True
                        break

            if not success:
                # Level 3: Partial replan (keep completed steps)
                completed = plan[:i]
                remaining_goal = self.compute_remaining_goal(goal, completed)
                new_plan = self.planner.replan(
                    current_state=self.get_state(),
                    goal=remaining_goal,
                    constraints={"avoid": [step]}  # Don't repeat failed approach
                )
                if new_plan:
                    return self.execute_with_recovery(remaining_goal, new_plan)

                # Level 4: Full replan (discard current plan, new strategy)
                fresh_plan = self.planner.replan(
                    current_state=self.get_state(),
                    goal=goal,
                    constraints={"avoid": [step]}
                )
                if fresh_plan and fresh_plan != plan:
                    return self.execute_with_recovery(goal, fresh_plan)

                # Level 5: Graceful degradation (reduce scope)
                return self.degrade_gracefully(goal, completed)

        return "Goal achieved"

Error Recovery Exam Scenarios

The NCP-AAI frequently tests error recovery with scenarios like:

  • Database timeout: Agent switches to cached data (fallback strategy)
  • API rate limit: Agent queues requests and implements exponential backoff (retry with delay)
  • Tool unavailable: Agent uses an alternative tool that provides similar functionality (alternative action)
  • Partial results: Agent adjusts plan to work with incomplete data (graceful degradation)
  • Conflicting information: Agent adds a verification step before proceeding (plan amendment)
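The "retry with delay" strategy from the rate-limit scenario is worth knowing concretely. A minimal sketch of exponential backoff, where `TransientError` and the parameter defaults are illustrative assumptions:

```python
import time

class TransientError(Exception):
    """Raised for errors worth retrying (e.g. rate limits, timeouts)."""

def retry_with_backoff(action, max_retries=4, base_delay=1.0):
    """Retry a flaky action with exponentially increasing delays."""
    for attempt in range(max_retries):
        try:
            return action()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # Out of retries: escalate to the next recovery level
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Note that when retries are exhausted the error propagates rather than being swallowed, so the caller can escalate to the next level of the recovery hierarchy (alternative action or replanning).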

Key Concept

On the NCP-AAI exam, error recovery questions typically ask: "What planning feature enables recovery when step X fails?" The answer framework is: (1) retry for transient errors, (2) conditional branching/fallback for known failure modes, (3) replanning for unexpected failures, and (4) graceful degradation when no recovery is possible.
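That four-part answer framework can be expressed as a simple classifier. The attribute names on the error object (`transient`, `known_fallback`, `recoverable`) are hypothetical, chosen purely to illustrate the decision order:

```python
# Illustrative classifier for the four-level framework; the attribute
# names on the error object are hypothetical, not from a real library.
def classify_and_recover(error) -> str:
    """Return the recovery level the framework prescribes."""
    if getattr(error, "transient", False):
        return "retry"                       # (1) transient errors
    fallback = getattr(error, "known_fallback", None)
    if fallback:
        return f"fallback: {fallback}"       # (2) known failure modes
    if getattr(error, "recoverable", True):
        return "replan"                      # (3) unexpected failures
    return "degrade gracefully"              # (4) no recovery possible
```

The ordering matters: cheap recovery (retry) is attempted before expensive recovery (replanning), and degradation is strictly a last resort.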


Week-by-Week Approach

Week 1: Foundations

  • Learn CoT, ReAct, and ToT frameworks and their research origins
  • Implement basic CoT prompts (zero-shot and few-shot)
  • Understand when to use each strategy (decision framework)
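For the Week 1 exercise, the two basic CoT prompt styles can be sketched as string templates. The zero-shot trigger phrase "Let's think step by step" comes from the zero-shot CoT literature; the worked example and the model call itself are omitted assumptions:

```python
# Minimal zero-shot vs. few-shot CoT prompt templates (model call omitted).
question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

# Zero-shot CoT: append the reasoning trigger phrase to the bare question.
zero_shot_cot = f"{question}\nLet's think step by step."

# Few-shot CoT: prepend a worked example that demonstrates the reasoning style.
few_shot_cot = (
    "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many now?\n"
    "A: He bought 2 * 3 = 6 balls. 5 + 6 = 11. The answer is 11.\n"
    f"Q: {question}\nA:"
)
```

The difference tested on the exam: zero-shot CoT needs no examples (cheapest to author), while few-shot CoT demonstrates the desired reasoning format and typically yields more reliable intermediate steps.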

Week 2: Classical Planning

  • Study forward, backward, HTN, POP, and continual planning
  • Implement a simple HTN planner for a real-world domain (trip planning, software deployment)
  • Understand A* optimality guarantees and MCTS exploration-exploitation trade-off

Week 3: NVIDIA Tools and Production Patterns

  • Study NeMo Agent Toolkit planning modules (REACT, COT, DIRECT)
  • Learn AIQ Toolkit agent graph configuration with conditional edges
  • Practice hierarchical plan-and-execute patterns with LangChain

Week 4: Integration and Exam Practice

  • Study planning + memory integration patterns
  • Practice error recovery and replanning scenarios
  • Take Preporato practice tests focused on planning questions
  • Review common exam traps and distractor patterns

Week 5: Multi-Agent and Advanced Topics

  • Study centralized vs. decentralized multi-agent planning
  • Review MCTS UCB1 formula and A* heuristic properties
  • Final review with timed practice exams

Master Planning Strategies with Preporato

Excel at planning and reasoning questions on the NCP-AAI exam. Preporato's comprehensive practice bundle covers:

  • 150+ planning and reasoning questions across all strategies (ReAct, CoT, ToT, HTN, MCTS)
  • Scenario-based problems requiring you to select optimal planning approaches
  • NVIDIA tool usage questions (NeMo Agent Toolkit, AIQ Toolkit)
  • Error recovery challenges (replanning, fallback strategies, conditional branching)
  • Code-based questions on LangChain, NeMo Agent Toolkit, and custom planners

Summary

Planning strategies determine an agent's problem-solving capability. Here is the complete hierarchy of what the NCP-AAI exam tests:

LLM-Native Strategies:

  • Chain-of-Thought: Step-by-step reasoning for logic-heavy, action-free tasks (lowest cost)
  • ReAct: Interleaved reasoning and actions for dynamic, tool-based tasks (NCP-AAI favorite)
  • Tree of Thoughts: Multi-path exploration for strategic and optimization-heavy problems (highest cost)

Classical Planning Approaches:

  • Forward Planning: Intuitive start-to-goal search for small state spaces
  • Backward Planning: Goal-directed regression for well-defined preconditions
  • HTN Planning: Hierarchical decomposition for structured workflows (most common in production)
  • Partial-Order Planning: Flexible ordering for parallelizable tasks
  • Continual Planning: Interleaved planning and execution for dynamic environments
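The parallelism payoff of partial-order planning can be made concrete: given only the ordering constraints a POP commits to, topological layering (Kahn-style) yields batches of steps that can run concurrently. A minimal sketch, with illustrative trip-planning step names:

```python
from collections import defaultdict

def parallel_batches(steps, orderings):
    """Group steps into batches that may execute in parallel,
    respecting only the given (before, after) ordering constraints."""
    indegree = {s: 0 for s in steps}
    successors = defaultdict(list)
    for before, after in orderings:
        successors[before].append(after)
        indegree[after] += 1

    batches = []
    ready = [s for s in steps if indegree[s] == 0]
    while ready:
        batches.append(sorted(ready))  # everything unconstrained runs together
        next_ready = []
        for s in ready:
            for t in successors[s]:
                indegree[t] -= 1
                if indegree[t] == 0:
                    next_ready.append(t)
        ready = next_ready
    return batches

# Only "fly" is constrained; the other three steps form one parallel batch.
plan = parallel_batches(
    ["book_flight", "book_hotel", "pack", "fly"],
    [("book_flight", "fly"), ("pack", "fly")],
)
```

A fully sequential (forward) plan would need four time steps here; the partial order needs only two, which is exactly the "minimizing total execution time" signal the exam uses.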

Advanced Algorithms:

  • A*: Optimal pathfinding with admissible heuristics
  • MCTS: Simulation-based search with UCB1 exploration-exploitation balance
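The UCB1 score that MCTS uses for child selection balances exploitation (mean reward) against exploration (a visit-count bonus). A minimal sketch of the standard formula:

```python
import math

def ucb1(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1: mean_reward + c * sqrt(ln(parent_visits) / child_visits)."""
    if child_visits == 0:
        return float("inf")  # unvisited children are always selected first
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + exploration
```

As a child accumulates visits its exploration bonus shrinks, so selection gradually shifts from exploring under-sampled actions to exploiting high-reward ones.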

NVIDIA Tools:

  • NeMo Agent Toolkit: REACT, COT, and DIRECT planning modules
  • AIQ Toolkit: Agent graphs with conditional edges for workflow orchestration


Ready to Pass the NCP-AAI Exam?

Join thousands who passed with Preporato practice tests

Instant access · 30-day guarantee · Updated monthly