
Agent Planning: ReAct vs CoT vs Tree of Thoughts (NCP-AAI)

Preporato Team · April 1, 2026 · 35 min read · NCP-AAI

Planning -- the ability to break down complex goals into executable steps -- is what separates advanced agentic AI systems from simple chatbots. The NVIDIA NCP-AAI certification heavily emphasizes planning strategies, as they determine an agent's capability to solve multi-step problems, reason about consequences, and optimize action sequences. This comprehensive guide covers every planning paradigm tested on the NCP-AAI exam: the three foundational reasoning strategies (Chain-of-Thought, ReAct, and Tree of Thoughts), five classical planning approaches (forward, backward, HTN, partial-order, and continual), advanced algorithms (A* and MCTS), NVIDIA NeMo Agent Toolkit planning modules, multi-agent planning patterns, and common exam traps you need to avoid.

Start Here

New to NCP-AAI? Start with our Complete NCP-AAI Certification Guide for exam overview, domains, and study paths. Then use our NCP-AAI Cheat Sheet for quick reference and How to Pass NCP-AAI for exam strategies.

Why Planning Matters for Agentic AI

Planning enables agents to:

  • Decompose complex tasks into manageable sub-tasks
  • Reason about action sequences before execution
  • Anticipate obstacles and plan contingencies
  • Optimize for goals (shortest path, lowest cost, highest success rate)
  • Handle ambiguous or underspecified requests
  • Allocate resources efficiently across parallel workstreams
  • Adapt dynamically when circumstances change

NCP-AAI Coverage:

  • Agent Design and Cognition domain (15%): Planning algorithms, reasoning patterns, goal management
  • Agent Development domain (15%): Implementing planning mechanisms, framework selection
  • Agent Architecture domain (15%): Choosing appropriate planning strategies, multi-agent coordination

The Planning Challenge

Without planning, agents exhibit three critical failure modes:

  • Myopic behavior: Short-sighted decisions without considering future consequences
  • Action thrashing: Inefficient trial-and-error without strategic thinking
  • Goal confusion: Losing track of the original objective in multi-step tasks

Example -- Flight Booking Without vs. With Planning:

Task: "Book a flight to Paris for next week"

Without Planning (Bad):
Agent: "What dates work for you?"
User: "Monday to Friday"
Agent: "Let me search flights..."
Agent: "Oh, I need your departure city. What city?"
User: "San Francisco"
Agent: "Searching... Oh, I need your budget. What's your budget?"
→ Inefficient, poor user experience (3 round trips)

With Planning (Good):
Agent: "To book your flight, I need:
  1. Departure city
  2. Travel dates
  3. Budget range
  4. Seating preference
Can you provide these details?"
→ Strategic, efficient information gathering (1 round trip)
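The "with planning" behavior above amounts to a slot-filling check: compute every missing parameter up front and ask for all of them in a single turn. A minimal sketch (the slot names and helper functions are illustrative, not tied to any real booking API):

```python
# Minimal slot-filling sketch: gather all missing parameters in one turn.
# The slot names below are illustrative, not from any real booking API.
REQUIRED_SLOTS = ["departure_city", "travel_dates", "budget_range", "seating_preference"]

def missing_slots(known: dict) -> list[str]:
    """Return every required slot the user has not yet provided."""
    return [slot for slot in REQUIRED_SLOTS if slot not in known]

def ask_for_missing(known: dict) -> str:
    """Build a single question covering all missing slots (one round trip)."""
    missing = missing_slots(known)
    if not missing:
        return "All details collected -- searching flights."
    items = "\n".join(f"  {i}. {slot.replace('_', ' ')}" for i, slot in enumerate(missing, 1))
    return f"To book your flight, I need:\n{items}\nCan you provide these details?"

print(ask_for_missing({"travel_dates": "Monday to Friday"}))
```

The same check before every user-facing turn is what turns three round trips into one.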

Core Planning Capabilities Tested on NCP-AAI

The exam tests your understanding of five core planning capabilities:

  • Task decomposition: Breaking complex requests into ordered subtasks
  • Multi-step orchestration: Sequencing actions with correct dependencies
  • Conditional branching: Adapting plans based on runtime conditions
  • Error recovery: Replanning when tasks fail (fallback strategies)
  • Goal optimization: Finding efficient paths to objectives under constraints


Chain-of-Thought (CoT) Prompting

Definition: Chain-of-Thought prompting elicits step-by-step reasoning from LLMs by showing examples or instructing the model to "think through" problems. Introduced by Wei et al. (2022) in "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," CoT demonstrated that prompting a 540B-parameter model (PaLM) with just eight chain-of-thought exemplars achieved state-of-the-art accuracy on the GSM8K benchmark of math word problems.

How CoT Works

CoT decomposes complex reasoning into explicit intermediate steps, making the model's thought process transparent and verifiable. Rather than jumping from question to answer, the model generates a reasoning trace that walks through each logical step.

Basic CoT (Few-Shot)

Provide examples of step-by-step reasoning to teach the model the format:

from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

few_shot_cot_template = """
Example 1:
Question: A bakery makes 48 cupcakes. If they pack 6 cupcakes per box, how many boxes do they need?
Reasoning:
- Total cupcakes: 48
- Cupcakes per box: 6
- Calculation: 48 / 6 = 8
Answer: 8 boxes

Example 2:
Question: If a car travels at 60 mph for 2.5 hours, how far does it travel?
Reasoning:
- Speed: 60 miles per hour
- Time: 2.5 hours
- Formula: Distance = Speed x Time
- Calculation: 60 x 2.5 = 150
Answer: 150 miles

Now solve this:
Question: {question}
Reasoning:
"""

llm = ChatOpenAI(model="gpt-4", temperature=0)
chain = PromptTemplate(input_variables=["question"], template=few_shot_cot_template) | llm

result = chain.invoke({
    "question": "If a store sells 15 items per hour and is open 8 hours per day, how many items are sold in a week?"
})

# Output:
# - Items per hour: 15
# - Hours per day: 8
# - Items per day: 15 x 8 = 120
# - Days per week: 7
# - Items per week: 120 x 7 = 840
# Answer: 840 items per week

Zero-Shot CoT

No examples needed -- just append "Let's think step by step" to the prompt. Kojima et al. (2022) showed in "Large Language Models are Zero-Shot Reasoners" that this simple addition dramatically improves reasoning performance without any exemplars:

zero_shot_cot_prompt = """
Question: {question}

Let's think step by step:
"""

# Results from Kojima et al. (2022) using text-davinci-002:
# MultiArith: 17.7% → 78.7% accuracy
# GSM8K:      10.4% → 40.7% accuracy
# Similar improvements observed with PaLM 540B

The versatility of this single prompt across diverse reasoning tasks -- arithmetic, symbolic, commonsense, and logical -- hints at untapped zero-shot cognitive capabilities in large language models.

Self-Consistency CoT

Generate multiple CoT reasoning paths and select the answer that appears most frequently (majority vote). This reduces the impact of any single flawed reasoning chain:

from collections import Counter

class SelfConsistencyCoT:
    def __init__(self, llm, num_samples=5):
        # Configure the LLM with temperature > 0 so the sampled
        # reasoning paths differ from one another
        self.llm = llm
        self.num_samples = num_samples

    def extract_answer(self, response):
        """Pull the final answer line out of a reasoning trace"""
        for line in reversed(response.splitlines()):
            if line.lower().startswith("answer:"):
                return line.split(":", 1)[1].strip()
        return response.strip().splitlines()[-1]

    def solve(self, question):
        answers = []
        for _ in range(self.num_samples):
            response = self.llm.predict(
                f"Question: {question}\nLet's think step by step:"
            )
            answers.append(self.extract_answer(response))

        # Majority vote across the sampled answers
        return Counter(answers).most_common(1)[0][0]

# Typically improves accuracy by 5-15% over single-pass CoT

CoT Strengths and Limitations

| Aspect | Strengths | Limitations |
|---|---|---|
| Use Cases | Math, logic puzzles, multi-step reasoning | Real-world actions, tool use |
| Transparency | Shows full reasoning process | Reasoning can be confidently wrong |
| Latency | Single LLM call (lowest cost) | Longer output = more tokens |
| Reliability | Deterministic reasoning path | Prone to compounding errors |
| Grounding | Internal knowledge only | Cannot access external information |

Key Concept

CoT is best for reasoning-heavy, action-light tasks (analysis, planning, explanation) but insufficient for action-heavy tasks (API calls, tool use, multi-step execution). On the NCP-AAI exam, if a scenario involves tool calls or external interactions, CoT alone is the wrong answer -- look for ReAct instead.

CoT Variants Summary

The NCP-AAI exam may present scenarios where you need to choose between CoT variants. Here is a quick reference:

| Variant | Description | When to Use | Cost |
|---|---|---|---|
| Few-Shot CoT | Provide 2-5 worked examples before the question | When you have good exemplars and need reliable formatting | 1 call (longer prompt) |
| Zero-Shot CoT | Append "Let's think step by step" with no examples | Quick reasoning boost without crafting examples | 1 call (short prompt) |
| Self-Consistency CoT | Generate k diverse reasoning paths, majority vote | When reliability matters more than cost | k calls (k=5-10 typical) |
| Manual CoT | Hand-craft optimal reasoning chains for specific domains | Domain-specific applications with known optimal reasoning | 1 call (curated prompt) |

Key Research Results to Remember:

  • Wei et al. (2022): Few-shot CoT with PaLM 540B achieved state-of-the-art on GSM8K math benchmarks, demonstrating that reasoning emerges in sufficiently large models with appropriate prompting.
  • Kojima et al. (2022): Zero-shot CoT ("Let's think step by step") improved MultiArith accuracy from 17.7% to 78.7% with InstructGPT, showing that no exemplars are needed to unlock reasoning capabilities.
  • Wang et al. (2022): Self-Consistency improved CoT accuracy by 5-15% across benchmarks by sampling multiple reasoning paths and taking the majority vote, addressing the fragility of single-chain reasoning.

When CoT Fails: Understanding Limitations

CoT has well-documented failure modes that the NCP-AAI exam may test:

  1. Faithful but wrong reasoning: The model can generate a logical-looking chain of reasoning that arrives at the wrong answer. The steps appear sound, but a subtle error early in the chain propagates forward.

  2. Overthinking simple problems: For straightforward factual lookups or pattern matching, CoT adds unnecessary tokens and latency without improving accuracy. Use DIRECT mode for these.

  3. No external grounding: CoT operates entirely on the model's internal knowledge. If the information is outdated, incomplete, or hallucinated, the entire reasoning chain is built on a faulty foundation. This is precisely why ReAct was developed.

  4. Length sensitivity: Very long reasoning chains (10+ steps) can lose coherence, with the model forgetting earlier constraints or introducing contradictions. For truly complex problems, hierarchical decomposition (HTN) may be more appropriate.
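Failure mode 2 suggests routing: send trivial lookups to a direct answer and reserve CoT for multi-step questions. The keyword heuristic below is a toy assumption for illustration; a production system would use a classifier or the model's own self-assessment.

```python
# Toy router: choose DIRECT for simple lookups, COT for multi-step reasoning.
# The keyword heuristic is an illustrative assumption, not a real classifier.
MULTI_STEP_MARKERS = ("how many", "calculate", "if ", "then", "per ", "total")

def choose_mode(question: str) -> str:
    q = question.lower()
    # Questions with arithmetic or conditional structure benefit from CoT
    if any(marker in q for marker in MULTI_STEP_MARKERS):
        return "COT"
    return "DIRECT"

print(choose_mode("What is the capital of France?"))      # DIRECT
print(choose_mode("How many items are sold in a week?"))  # COT
```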


ReAct: Reasoning + Acting

Definition: ReAct (Reasoning and Acting) interleaves reasoning traces with action execution, allowing agents to dynamically adjust plans based on environment feedback. Introduced by Yao et al. (2022) in "ReAct: Synergizing Reasoning and Acting in Language Models," the framework addresses a key limitation of CoT: while chain-of-thought uses only internal representations, ReAct grounds reasoning in real-world observations by interleaving thought with action.

Why ReAct Matters

The core insight of ReAct is that reasoning and acting are complementary:

  • Reasoning traces help the model induce, track, and update action plans, as well as handle exceptions
  • Actions allow the model to interface with external sources (knowledge bases, APIs, environments) to gather additional information

Yao et al. evaluated ReAct on four benchmarks -- HotPotQA, Fever, ALFWorld, and WebShop -- and found that ReAct outperforms vanilla action generation while being competitive with CoT. The best results came from combining ReAct with CoT, using both internal knowledge and externally obtained information.

The ReAct Loop

Thought: [Reasoning about what to do next]
Action: [Tool/function to execute]
Action Input: [Arguments for the tool]
Observation: [Result from executing the action]
... (repeat Thought/Action/Observation)
Thought: I now know the final answer
Final Answer: [Response to user]

ReAct Implementation

from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain_openai import ChatOpenAI

# Define tools
def search_web(query: str) -> str:
    """Search the web for information"""
    return f"Search results for '{query}': ..."

def calculator(expression: str) -> str:
    """Evaluate mathematical expressions (demo only -- eval is unsafe on untrusted input)"""
    try:
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

tools = [
    Tool(name="Search", func=search_web, description="Search the web for current information"),
    Tool(name="Calculator", func=calculator, description="Perform mathematical calculations")
]

# ReAct prompt template
react_prompt = """
Answer the following question as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: {agent_scratchpad}
"""

# Create agent (create_react_agent expects a PromptTemplate, not a raw string)
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_react_agent(llm, tools, PromptTemplate.from_template(react_prompt))
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=10)

# Execute
result = executor.invoke({"input": "What is the population of Tokyo multiplied by 2?"})

Execution Trace:

Thought: I need to find Tokyo's population, then multiply by 2
Action: Search
Action Input: "Tokyo population 2024"
Observation: Tokyo's population is approximately 14 million

Thought: Now I need to multiply 14 million by 2
Action: Calculator
Action Input: "14000000 * 2"
Observation: 28000000

Thought: I now know the final answer
Final Answer: The population of Tokyo (14 million) multiplied by 2 is 28 million.
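Under the hood, frameworks implement this loop as parse, execute, append. A stripped-down sketch of that control flow, with a scripted stand-in for the LLM (the tool behaviors and scripted responses are illustrative assumptions, not real API calls):

```python
import re

# Minimal ReAct control loop with a scripted stand-in for the LLM.
# Tool behaviors and the scripted responses are illustrative assumptions.
TOOLS = {
    "Search": lambda q: "Tokyo's population is approximately 14 million",
    "Calculator": lambda expr: str(eval(expr)),  # demo only; never eval untrusted input
}

SCRIPTED_LLM = iter([
    "Thought: I need Tokyo's population first\nAction: Search\nAction Input: Tokyo population",
    "Thought: Now multiply by 2\nAction: Calculator\nAction Input: 14000000 * 2",
    "Thought: I now know the final answer\nFinal Answer: 28 million",
])

def react_loop(question: str, max_iterations: int = 10) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_iterations):
        step = next(SCRIPTED_LLM)          # a real agent would call the LLM here
        scratchpad += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[1].strip()
        action = re.search(r"Action: (\w+)", step).group(1)
        action_input = re.search(r"Action Input: (.+)", step).group(1)
        observation = TOOLS[action](action_input)   # execute the chosen tool
        scratchpad += f"Observation: {observation}\n"
    return "Stopped: max iterations reached"

answer = react_loop("What is the population of Tokyo multiplied by 2?")
print(answer)
```

The scratchpad accumulates the full Thought/Action/Observation history, which is exactly what `{agent_scratchpad}` carries in the LangChain prompt above.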

ReAct Variants

1. ReAct with Self-Correction

The agent detects failures and adjusts its approach:

react_selfcorrect_prompt = """
... (standard ReAct format)

If an action fails or returns unexpected results, reconsider your approach:
Thought: That didn't work as expected. Let me try a different approach.
Action: [Alternative action]
...
"""

# Example execution trace:
# Thought: I'll search for "Tokyo population"
# Action: Search
# Action Input: "Tokyo population"
# Observation: Error: Too many results, be more specific
#
# Thought: That didn't work. Let me be more specific with the year.
# Action: Search
# Action Input: "Tokyo population 2024 census"
# Observation: Tokyo's population is approximately 14 million
# → Self-correction leads to success

2. ReAct with Reflection

The agent evaluates its own reasoning quality after task completion:

react_reflection_prompt = """
... (standard ReAct)

After completing the task, reflect:
Reflection: [Evaluate the quality of your reasoning and actions]
Improvements: [What could be done better next time]
"""

# Example:
# Final Answer: 28 million
#
# Reflection: My approach was effective -- I systematically gathered
# information and performed calculations. The search query could
# have been more precise initially.
#
# Improvements: Include the year in initial search queries to
# avoid ambiguity. Consider verifying with a second source.

3. ReAct + CoT Hybrid

Combine internal CoT reasoning with external ReAct actions for best results. In the original paper, Yao et al. found that combining ReAct with CoT outperformed either approach alone on HotPotQA and Fever benchmarks. The hybrid approach works by first attempting CoT internal reasoning, then switching to ReAct when the model detects that its internal knowledge is insufficient or outdated:

import re

class ReActCoTHybrid:
    def __init__(self, llm, react_agent):
        self.llm = llm
        self.react_agent = react_agent  # a ReAct executor like the one built above

    def extract_confidence(self, response: str) -> int:
        """Parse the self-reported 1-10 confidence score (default to low)"""
        match = re.search(r"confidence[^0-9]*(\d+)", response, re.IGNORECASE)
        return int(match.group(1)) if match else 1

    def solve(self, question):
        # Step 1: Attempt CoT internal reasoning
        cot_response = self.llm.predict(f"""
        Question: {question}

        First, reason about what you already know (internal knowledge).
        Rate your confidence in your knowledge on a scale of 1-10.

        Internal reasoning:
        """)

        confidence = self.extract_confidence(cot_response)

        if confidence >= 8:
            # High confidence: use CoT answer directly (saves tool calls)
            return self.llm.predict(f"""
            Based on this reasoning: {cot_response}
            Provide the final answer:
            """)
        else:
            # Low confidence: switch to ReAct for external verification
            return self.react_agent.run(f"""
            Question: {question}
            Initial reasoning (may be incomplete): {cot_response}
            Verify and complete this answer using available tools.
            """)

# This pattern:
# - Saves tool calls when internal knowledge is sufficient
# - Falls back to grounded actions when knowledge gaps exist
# - Combines the speed of CoT with the accuracy of ReAct

This hybrid pattern is particularly valuable in production because it reduces API costs. Many agent queries can be answered with internal knowledge alone (saving tool call latency and cost), while complex or factual queries automatically escalate to tool-augmented reasoning. On the NCP-AAI exam, questions about optimizing agent costs while maintaining accuracy often point to this hybrid approach.

ReAct Advantages for NCP-AAI

  1. Grounded in reality: Actions provide real feedback, preventing hallucinations
  2. Transparent reasoning: Thought traces are interpretable and debuggable
  3. Dynamic adaptation: Can adjust strategy based on observations
  4. Tool integration: Natural fit for function calling and API interactions
  5. Explicit and traceable: Unlike black-box planning, every decision is logged

NCP-AAI Exam Focus: ReAct is the default planning strategy for production agents due to its balance of reasoning and action. It is the most heavily tested planning framework on the exam.

Exam Trap

A common NCP-AAI mistake is assuming ReAct is always the best choice. While ReAct is the default for production agents, it has key limitations: linear planning (no alternative exploration), error accumulation (early mistakes compound), and high token costs from verbose output. Know when to combine ReAct with other strategies or when ToT or HTN is the better fit.

ReAct Limitations

| Challenge | Impact | Mitigation |
|---|---|---|
| Verbose output | High token costs | Use cheaper models for reasoning steps |
| Linear planning | Does not explore alternatives | Combine with ToT for branching |
| Error accumulation | Early mistakes compound | Add reflection/self-correction |
| Max iterations | Can timeout on complex tasks | Set appropriate limits, add fallbacks |
| Single path | Commits to first viable approach | Use ToT when comparison is needed |

Tree of Thoughts (ToT): Exploring Multiple Reasoning Paths

Definition: Tree of Thoughts generates multiple reasoning paths (branches), evaluates them, and selects the most promising direction -- enabling search-based planning. Introduced by Yao et al. (2023) in "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (NeurIPS 2023), ToT generalizes CoT by allowing LMs to explore coherent units of text ("thoughts") as intermediate problem-solving steps, with the ability to look ahead, evaluate, and backtrack.

Key Result

On the Game of 24 benchmark, GPT-4 with standard CoT prompting solved only 4% of tasks, while ToT achieved a 74% success rate -- a dramatic improvement demonstrating the power of deliberate exploration over linear reasoning.

ToT Concepts

1. Thought Decomposition -- Break the problem into intermediate steps (thoughts), where each thought is a coherent unit of reasoning.

2. Thought Generation -- Generate multiple candidate thoughts at each step using the LLM.

3. State Evaluation -- Evaluate how promising each thought path is (how likely it leads to a correct solution). The LLM itself can serve as the evaluator, scoring each path on a numeric scale.

4. Search Algorithm -- Navigate the tree of possibilities:

  • Breadth-First Search (BFS): Explore all options at each level before going deeper
  • Depth-First Search (DFS): Explore one path deeply before backtracking
  • Best-First Search: Prioritize the most promising paths using evaluation scores
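Best-first search can be sketched generically with a priority queue; the evaluator and expander below are deterministic stand-ins (illustrative assumptions) for the LLM scoring and thought-generation calls:

```python
import heapq

# Best-first search sketch over a thought tree. evaluate() and expand()
# are deterministic stand-ins for the LLM evaluator and thought generator.
def evaluate(path: tuple) -> float:
    # Toy heuristic: longer coherent paths score higher (demo assumption)
    return float(len(path))

def expand(path: tuple) -> list:
    # Toy generator: each state branches into two candidate thoughts
    if len(path) >= 3:          # max depth reached
        return []
    return [path + (f"thought-{len(path)}-{i}",) for i in range(2)]

def best_first_search(root: tuple = ()) -> tuple:
    # heapq is a min-heap, so push negated scores to pop the best path first
    frontier = [(-evaluate(root), root)]
    best = root
    while frontier:
        neg_score, path = heapq.heappop(frontier)
        if -neg_score > evaluate(best):
            best = path
        for child in expand(path):
            heapq.heappush(frontier, (-evaluate(child), child))
    return best

result = best_first_search()
print(len(result))  # depth of the best path found
```

The priority queue is what distinguishes best-first from BFS (a plain FIFO queue) and DFS (a stack): the most promising partial path is always expanded next.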

ToT Implementation

from langchain_openai import ChatOpenAI

class TreeOfThoughts:
    def __init__(self, llm, max_depth=3, branching_factor=3):
        self.llm = llm
        self.max_depth = max_depth
        self.branching_factor = branching_factor

    def generate_thoughts(self, problem, current_state, depth):
        """Generate multiple candidate next thoughts"""
        prompt = f"""
        Problem: {problem}
        Current reasoning: {current_state}

        Generate {self.branching_factor} different possible next steps
        in solving this problem. Format each as a numbered option:
        """
        response = self.llm.predict(prompt)
        return self._parse_thoughts(response)

    def _parse_thoughts(self, response):
        """Split the numbered options ("1. ...", "2. ...") into a list"""
        import re
        parts = re.split(r"\n\s*\d+[.)]\s*", "\n" + response)
        return [p.strip() for p in parts if p.strip()]

    def evaluate_thought(self, problem, thought_sequence):
        """Evaluate how promising a thought sequence is (0-10)"""
        prompt = f"""
        Problem: {problem}
        Reasoning so far: {thought_sequence}

        On a scale of 0-10, how likely is this reasoning path
        to lead to a correct solution?
        Consider:
        - Logical coherence
        - Progress toward the goal
        - Avoiding dead ends

        Score (0-10):
        """
        response = self.llm.predict(prompt)
        # Parse the first number in the response; LLM output is not
        # guaranteed to be a bare numeral
        import re
        match = re.search(r"\d+(?:\.\d+)?", response)
        return float(match.group()) if match else 0.0

    def _breadth_first_search(self, problem):
        """Explore all paths level by level"""
        queue = [("", 0)]  # (thought_sequence, depth)
        best_path = None
        best_score = -1

        while queue:
            current_state, depth = queue.pop(0)

            if depth >= self.max_depth:
                score = self.evaluate_thought(problem, current_state)
                if score > best_score:
                    best_score = score
                    best_path = current_state
                continue

            thoughts = self.generate_thoughts(problem, current_state, depth)
            for thought in thoughts:
                new_state = current_state + "\n" + thought
                queue.append((new_state, depth + 1))

        return best_path, best_score

    def _depth_first_search(self, problem, current_state="", depth=0, threshold=3):
        """Explore one path deeply, backtrack if score is too low"""
        if depth >= self.max_depth:
            score = self.evaluate_thought(problem, current_state)
            return current_state, score

        thoughts = self.generate_thoughts(problem, current_state, depth)
        best_path = None
        best_score = -1

        for thought in thoughts:
            new_state = current_state + "\n" + thought
            # Prune: skip paths that score below threshold
            mid_score = self.evaluate_thought(problem, new_state)
            if mid_score < threshold:
                continue  # Backtrack

            path, score = self._depth_first_search(
                problem, new_state, depth + 1, threshold
            )
            if score > best_score:
                best_score = score
                best_path = path

        return best_path, best_score

    def solve(self, problem, algorithm="bfs"):
        """Solve problem using Tree of Thoughts"""
        if algorithm == "bfs":
            best_path, score = self._breadth_first_search(problem)
        else:
            best_path, score = self._depth_first_search(problem)

        # Generate final answer from best path
        final_prompt = f"""
        Problem: {problem}
        Best reasoning path found:
        {best_path}

        Based on this reasoning, provide the final answer:
        """
        answer = self.llm.predict(final_prompt)
        return answer, best_path, score

# Usage
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
tot = TreeOfThoughts(llm, max_depth=3, branching_factor=3)

problem = "Design a microservices architecture for an e-commerce platform."
answer, reasoning, score = tot.solve(problem)

ToT Execution Example

Problem: "Plan a 3-day trip to New York City on a $1000 budget"

Root: "Plan NYC trip, 3 days, $1000"
│
├── Branch 1: "Focus on free attractions (museums, parks, walking tours)"
│   ├── Branch 1.1: "Budget hotel ($100/night), subway pass"
│   │   └── Branch 1.1.1: "Day 1: Central Park + Met, Day 2: Brooklyn
│   │       Bridge + 9/11 Memorial, Day 3: High Line + Chelsea Market"
│   │       [Score: 8/10]
│   ├── Branch 1.2: "Hostel ($50/night), more budget for activities"
│   └── Branch 1.3: "Airbnb in Queens ($80/night), authentic experience"
│
├── Branch 2: "Prioritize iconic paid attractions"
│   ├── Branch 2.1: "Buy CityPass ($140), budget accommodation"
│   │   └── Branch 2.1.1: "Day 1: ESB + Top of Rock, Day 2:
│   │       Statue of Liberty..." [Score: 7/10]
│   └── Branch 2.2: "Focus on 2-3 major attractions, skip tourist traps"
│
└── Branch 3: "Cultural experience (Broadway, dining, neighborhoods)"
    ├── Branch 3.1: "One Broadway show ($150), street food"
    │   └── Branch 3.1.1: "Day 1: Broadway + Times Square, Day 2:
    │       Chinatown + Little Italy..." [Score: 6/10]
    └── Branch 3.2: "Skip Broadway, focus on food tours"

Best Path Selected: Branch 1.1.1 (Score: 8/10)
Reasoning: Maximizes experiences within budget by focusing on free
attractions while maintaining comfort with a budget hotel.

ToT Cost Analysis

The computational cost of ToT grows exponentially with depth and branching factor: BFS explores roughly b^d leaf states for branching factor b and depth d, each requiring its own LLM calls.
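As a rough count (assuming one generation call per expanded node and one evaluation call per leaf, matching the BFS implementation above), even the default settings of b=3, d=3 imply dozens of LLM calls:

```python
# Rough LLM-call count for ToT with breadth-first search. Assumptions:
# one generation call per expanded node, one evaluation call per leaf.
def tot_bfs_calls(b: int, d: int) -> int:
    generation = sum(b ** i for i in range(d))   # nodes expanded at depths 0..d-1
    evaluation = b ** d                           # every leaf is scored once
    return generation + evaluation

print(tot_bfs_calls(3, 3))   # 1 + 3 + 9 generations + 27 evaluations = 40
print(tot_bfs_calls(4, 4))   # grows exponentially with b and d
```

This is why DFS with pruning (skipping branches that score below a threshold) is the practical default for deeper trees.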

ToT Advantages and Limitations

Advantages:

  1. Explores alternatives: Considers multiple strategies before committing
  2. Handles complexity: Effective for problems with many valid approaches
  3. Avoids local optima: Can backtrack from dead ends
  4. Self-evaluation: Explicitly assesses reasoning quality at each step

Limitations:

| Challenge | Impact | Solution |
|---|---|---|
| High cost | Many LLM calls (exponential) | Prune low-scoring branches early |
| Latency | Slow for real-time applications | Use for planning phase only |
| Evaluation difficulty | Hard to score thought quality | Train value model or use heuristics |
| Overkill for simple tasks | Unnecessary complexity | Use CoT or ReAct when one path suffices |

When to Use ReAct vs. CoT vs. ToT: Decision Framework

Choosing the right planning strategy is one of the most frequently tested skills on the NCP-AAI exam. Use this decision framework:

Strategy Selection Decision Tree

Does the task require external tool calls or API interactions?
├── YES → Does it require comparing multiple approaches?
│         ├── YES → ReAct + ToT hybrid
│         └── NO  → ReAct (default for production agents)
└── NO  → Is there a single correct answer (math, logic)?
          ├── YES → Chain-of-Thought
          └── NO  → Are there multiple valid approaches to evaluate?
                    ├── YES → Tree of Thoughts
                    └── NO  → Chain-of-Thought
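The decision tree maps directly onto a small routing function. A sketch (the boolean flags are simplifications; a real system would classify the incoming task rather than receive flags):

```python
# The decision tree above as a routing function. The boolean inputs are
# simplifications of what would be a task classifier in practice.
def select_strategy(needs_tools: bool, compare_approaches: bool,
                    single_correct_answer: bool) -> str:
    if needs_tools:
        # Tool use required: ReAct, optionally with ToT for branching
        return "ReAct + ToT hybrid" if compare_approaches else "ReAct"
    if single_correct_answer:
        return "Chain-of-Thought"
    return "Tree of Thoughts" if compare_approaches else "Chain-of-Thought"

print(select_strategy(needs_tools=True, compare_approaches=False,
                      single_correct_answer=False))   # ReAct
print(select_strategy(needs_tools=False, compare_approaches=True,
                      single_correct_answer=False))   # Tree of Thoughts
```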

Quick Reference Matrix

Planning Strategy Selection Guide

| Scenario | Best Strategy | Why | Exam Frequency |
|---|---|---|---|
| Customer service resolving tickets with API calls | ReAct | Requires tool use + adaptive reasoning | Very High |
| Solving a math word problem | Chain-of-Thought | Pure reasoning, no actions needed | High |
| Designing system architecture (multiple valid approaches) | Tree of Thoughts | Must explore and compare alternatives | Medium |
| Flight booking with search + comparison + booking | ReAct | Sequential tool calls with reasoning | Very High |
| Sudoku or constraint-satisfaction puzzle | Tree of Thoughts | Requires branching and backtracking | Medium |
| Code review and explanation | Chain-of-Thought | Reasoning-only, no external actions | Medium |
| Multi-hop question answering with knowledge retrieval | ReAct | Needs external knowledge + reasoning | High |
| Strategic business planning with trade-offs | Tree of Thoughts | Multiple valid strategies to evaluate | Low |

Cost and Latency Comparison

Planning Strategies: Performance Comparison

| Strategy | LLM Calls | Token Cost | Latency | Transparency | NCP-AAI Weight |
|---|---|---|---|---|---|
| Chain-of-Thought | 1 call | Low | Low (seconds) | High (full trace) | Medium |
| Self-Consistency CoT | k calls (k=5-10) | Medium | Medium | High | Low |
| ReAct | N calls (1 per step) | Medium | Medium (seconds-minutes) | High (thought + action) | Very High |
| Tree of Thoughts (BFS) | b^d calls (exponential) | High | High (minutes) | Medium (best path shown) | Medium |
| Tree of Thoughts (DFS) | b*d calls (with pruning) | Medium-High | Medium-High | Medium | Medium |

Exam Trap

When the exam presents a scenario involving tool calls or external API interactions, Chain-of-Thought alone is always the wrong answer. CoT only performs reasoning without actions. If the scenario requires executing searches, database queries, or API calls, the correct answer involves ReAct or a planning framework that supports action execution. This is one of the most common traps on the NCP-AAI.


Five Classical Planning Approaches

Beyond the three LLM-native strategies above, the NCP-AAI exam tests your knowledge of classical AI planning approaches that provide the theoretical foundation for agent planning systems.

1. Forward Planning (Progression)

Definition: Start from the current state and explore actions forward until the goal state is reached.

Current State → Action 1 → State 2 → Action 2 → ... → Goal State

Implementation:

class ForwardPlanner:
    def plan(self, start, goal):
        # Sketch: select_best_action and apply_action are domain-specific hooks
        state = start
        plan = []

        while state != goal:
            # Find action that moves toward goal
            action = self.select_best_action(state, goal)
            plan.append(action)
            state = self.apply_action(state, action)

        return plan

# Example
planner = ForwardPlanner()
plan = planner.plan(start="home", goal="office")
# Result: ["walk_to_car", "drive_to_office", "park_car", "enter_building"]

Advantages: Intuitive, easy to implement, works well when the action space is small.
Disadvantages: Can be inefficient for large search spaces; explores many irrelevant actions.

2. Backward Planning (Regression)

Definition: Start from the goal state and work backward to determine the required preconditions at each step.

Goal State ← Action N ← State N-1 ← ... ← Current State

Implementation:

class BackwardPlanner:
    def plan(self, current_state, goal):
        # Sketch: find_producer_action and get_preconditions are domain-specific hooks
        required_states = [goal]
        plan = []

        while required_states[-1] != current_state:
            needed_state = required_states[-1]
            action = self.find_producer_action(needed_state)
            plan.insert(0, action)
            preconditions = self.get_preconditions(action)
            required_states.append(preconditions)

        return plan

# Example: Software deployment
plan = backward_planner.plan(
    current_state="code_written",
    goal="app_deployed"
)
# Result: ["run_tests", "build_docker_image", "push_to_registry", "deploy_to_k8s"]

When to Use: When the goal has fewer achievable states than the start state, or when preconditions are well-defined. Also effective for multi-hop question answering where you decompose from the final question backward to sub-questions.

Exam Application -- Multi-Hop QA Example:

Question: "What is the capital of the country where the 2024 Olympics were held?"

Backward decomposition:
  Goal: Capital of country X
  ← Requires: Country X where 2024 Olympics were held
    ← Requires: Location of 2024 Olympics → Paris
      ← Derives: Country → France
        ← Derives: Capital of France → Paris

Forward execution of decomposed plan:
  Step 1: Search "2024 Olympics location" → Paris
  Step 2: Search "What country is Paris in?" → France
  Step 3: Search "Capital of France" → Paris
  Answer: Paris

This backward-then-forward pattern -- decompose the problem backward from the goal, then execute forward -- is a common NCP-AAI exam pattern. The question often asks which reasoning technique is demonstrated (backward chaining) or which execution framework supports it (ReAct for the forward execution with tool calls).
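The pattern can be sketched as two phases: decompose sub-questions backward from the goal, then execute them in reverse (forward) order. The hard-coded decomposition and lookup table below are stand-ins for what would be LLM decomposition and real search tool calls:

```python
# Backward-then-forward sketch for multi-hop QA. The decomposition list and
# the lookup table are stand-ins for LLM decomposition and search tools.
LOOKUP = {
    "2024 Olympics location": "Paris",
    "What country is Paris in?": "France",
    "Capital of France": "Paris",
}

def backward_decompose(goal: str) -> list[str]:
    # Goal-first list built by backward chaining; a real agent would
    # derive this with the LLM rather than hard-code it
    return ["Capital of France", "What country is Paris in?", "2024 Olympics location"]

def solve(goal: str) -> str:
    sub_questions = backward_decompose(goal)
    answer = None
    # Forward execution: reverse of the backward decomposition order
    for question in reversed(sub_questions):
        answer = LOOKUP[question]          # stands in for a Search tool call
    return answer

print(solve("Capital of the country where the 2024 Olympics were held"))
```

In a ReAct agent, each lookup in the forward pass would be one Thought/Action/Observation cycle.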

3. Hierarchical Task Network (HTN) Planning

Definition: Decompose high-level abstract tasks into primitive actions using predefined decomposition methods. HTN is the most common planning approach in production agentic systems and one of the most frequently tested topics on the NCP-AAI exam.

The SHOP2 algorithm (Nau et al., 2003) is the canonical HTN planner, notable for supporting partially ordered subtask decomposition -- meaning subtasks within a method do not all need a fixed execution order.

Structure:

┌─────────────────────────────────────┐
│   High-Level Goal (Abstract Task)   │
└──────────────┬──────────────────────┘
               │
        ┌──────┴──────┐
        v             v
    Subtask 1     Subtask 2
        │             │
     ┌──┴──┐       ┌──┴──┐
     v     v       v     v
   Action Action Action Action
   (Primitive -- directly executable)

Implementation:

class HTNPlanner:
    def __init__(self):
        self.methods = {
            "book_travel": [
                ["book_flight", "book_hotel", "rent_car"],      # Method 1
                ["book_train", "book_hotel"]                     # Method 2 (alternative)
            ],
            "book_flight": [
                ["search_flights", "select_flight", "complete_payment"]
            ],
            "book_hotel": [
                ["search_hotels", "reserve_room", "confirm_booking"]
            ]
        }
        self.primitive_actions = {
            "search_flights", "select_flight", "complete_payment",
            "search_hotels", "reserve_room", "confirm_booking",
            "rent_car_online", "book_train"
        }

    def decompose(self, task):
        """Recursively decompose task into primitive actions"""
        if task in self.primitive_actions:
            return [task]

        for method in self.methods.get(task, []):
            plan = []
            valid = True
            for subtask in method:
                sub_plan = self.decompose(subtask)
                if sub_plan is None:
                    valid = False
                    break
                plan.extend(sub_plan)
            if valid and self.is_valid_plan(plan):  # Hook: verify ordering/state constraints
                return plan

        return None  # No valid decomposition found

# Usage
planner = HTNPlanner()
plan = planner.decompose("book_travel")
# Result: ["search_flights", "select_flight", "complete_payment",
#          "search_hotels", "reserve_room", "confirm_booking", "rent_car_online"]

Key Concept

HTN planning is the most common planning approach for production agentic systems. It decomposes abstract goals into concrete primitive actions using predefined methods, making it ideal for well-structured domains like trip planning, software deployment, business workflows, and customer service escalation. On the NCP-AAI exam, if a question describes a complex workflow with clear hierarchical structure, HTN is likely the correct answer.

Task Decomposition Matrix

Use this matrix to determine the right decomposition strategy:

| Task Characteristic | Decomposition Approach | Example |
|---|---|---|
| Fixed sequence of steps | Sequential HTN | Software deployment pipeline |
| Steps can run in parallel | Partial-order HTN (SHOP2) | Cooking: boil water + chop vegetables simultaneously |
| Multiple valid methods | HTN with method selection | Travel: fly vs. train vs. drive |
| Dynamic environment | HTN + continual replanning | Warehouse robot navigation |
| Unknown structure | LLM-based decomposition | Novel user requests |

4. Partial-Order Planning (POP)

Definition: Plan actions without committing to a specific execution order until necessary. Actions are only ordered when one depends on the output of another, enabling parallelism.

class PartialOrderPlanner:
    def __init__(self):
        self.actions = []
        self.orderings = []     # List of (before, after) constraints
        self.causal_links = []  # (producer, condition, consumer)

    def add_action(self, action):
        self.actions.append(action)

    def add_ordering(self, before, after):
        """Enforce: 'before' must execute before 'after'"""
        self.orderings.append((before, after))

    def get_parallel_groups(self):
        """Group actions into levels; actions within a level can run in parallel"""
        remaining = set(self.actions)
        groups = []
        while remaining:
            # Ready = actions whose predecessors have all been scheduled
            ready = [a for a in remaining
                     if all(before not in remaining
                            for before, after in self.orderings if after == a)]
            if not ready:
                raise ValueError("Cycle detected in ordering constraints")
            groups.append(sorted(ready))
            remaining -= set(ready)
        return groups

# Example: Dinner preparation
planner = PartialOrderPlanner()
planner.add_action("chop_vegetables")
planner.add_action("boil_water")
planner.add_action("cook_pasta")
planner.add_action("make_sauce")
planner.add_action("serve")

# Only add necessary ordering constraints
planner.add_ordering("boil_water", "cook_pasta")
planner.add_ordering("chop_vegetables", "make_sauce")
planner.add_ordering("cook_pasta", "serve")
planner.add_ordering("make_sauce", "serve")

# Parallel groups:
# Group 1 (parallel): ["boil_water", "chop_vegetables"]
# Group 2 (parallel): ["cook_pasta", "make_sauce"]
# Group 3: ["serve"]
# Total time: 3 sequential groups instead of 5 sequential actions

Advantages: Enables parallel execution, more flexible scheduling, reduces total execution time. Disadvantages: Complex constraint management, harder to debug than sequential plans.

Why POP Matters for Agentic AI: In multi-agent systems, partial-order planning directly maps to task parallelism. If two subtasks have no ordering constraint between them, they can be assigned to different agents and executed simultaneously. This is why POP is foundational to scalable agent architectures. On the NCP-AAI exam, any question about maximizing throughput or minimizing wall-clock execution time for independent subtasks points to partial-order planning.
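That mapping can be sketched with a thread pool standing in for a pool of agents; the groups below are the dinner example's parallel levels, and `run_action` is a hypothetical stand-in for dispatching work to an agent:

```python
from concurrent.futures import ThreadPoolExecutor

# Parallel groups from the dinner example (no ordering constraints within a group)
groups = [["boil_water", "chop_vegetables"],
          ["cook_pasta", "make_sauce"],
          ["serve"]]

def run_action(action):
    """Stand-in for dispatching an action to a worker agent."""
    return f"{action}: done"

def execute_groups(groups):
    """Run each level's actions concurrently; levels run in sequence."""
    results = []
    with ThreadPoolExecutor() as pool:
        for group in groups:
            # pool.map preserves submission order, so results stay deterministic
            results.extend(pool.map(run_action, group))
    return results

print(execute_groups(groups))
```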

Causal Links and Threat Resolution: In formal POP, a causal link (A --[p]--> B) means action A produces condition p that action B requires. A "threat" occurs when a third action C could undo condition p between A and B. The planner resolves threats by either promoting C before A or demoting C after B. While the NCP-AAI does not require formal proofs, understanding that POP must protect causal dependencies from interference is important for architecture questions about concurrent agent actions.

5. Continual Planning (Interleaved Planning and Execution)

Definition: Plan, execute, observe, replan -- a continuous cycle where planning happens during execution and adapts to real-world feedback. This is the closest classical analogue to the ReAct pattern.

Plan → Execute Step 1 → Observe → Replan → Execute Step 2 → Observe → ...

Implementation:

class ContinualPlanner:
    def execute_with_replanning(self, goal):
        plan = self.create_initial_plan(goal)
        execution_log = []

        while not self.goal_achieved(goal):
            if not plan:
                return "Failed: no valid plan found"

            # Execute next action
            action = plan.pop(0)
            result = self.execute_action(action)
            execution_log.append((action, result))

            # Check for unexpected outcomes
            if result.unexpected or result.failed:
                # Replan from current state with updated beliefs
                plan = self.replan(
                    current_state=self.get_current_state(),
                    goal=goal,
                    failed_action=action,
                    history=execution_log
                )

            # Update world model with new observations
            self.update_beliefs(result)

        return "Goal achieved"

When to Use: Dynamic environments where conditions change frequently (robotics, real-time systems, live data processing). On the NCP-AAI exam, if the scenario describes a changing environment, the answer is almost never a fully upfront planning approach.

Classical Approaches Comparison

Five Classical Planning Approaches

| Approach | Best For | Key Advantage | Key Limitation |
|---|---|---|---|
| Forward Planning | Simple, small search spaces | Intuitive implementation | Inefficient for large state spaces |
| Backward Planning | Goal-directed, well-defined preconditions | Efficient precondition analysis | Requires well-defined goal states |
| HTN Planning | Complex hierarchical workflows | Most common in production agentic AI | Requires predefined decomposition methods |
| Partial-Order (POP) | Parallelizable, independent subtasks | Enables parallel execution | Complex constraint management |
| Continual Planning | Dynamic, changing environments | Adapts to real-time conditions | Higher computational overhead |

Exam Trap

On the NCP-AAI exam, watch out for scenarios where over-planning is the trap answer. If the question describes a dynamic environment with frequent changes, the answer is almost never a fully upfront planning approach (like pure forward or backward planning). Look for continual planning or ReAct-based replanning strategies instead. Conversely, if the domain is well-structured with known decomposition rules, HTN is preferred over LLM-based planning.


Advanced Planning Algorithms

A* Planning (Optimal Pathfinding)

A* finds the lowest-cost path from a start state to a goal state using a heuristic function to guide the search. It guarantees an optimal solution when the heuristic is admissible (never overestimates the true cost).

Implementation:

import heapq
import itertools

class AStarPlanner:
    def plan(self, start, goal):
        counter = itertools.count()   # Tie-breaker so equal f-scores never compare states
        frontier = [(0, next(counter), start)]  # Priority queue: (f_score, tie, state)
        came_from = {start: None}
        g_score = {start: 0}          # Actual cost from start

        while frontier:
            _, _, current = heapq.heappop(frontier)

            if current == goal:
                return self.reconstruct_path(came_from, start, goal)

            for action, next_state in self.get_successors(current):
                new_g = g_score[current] + self.action_cost(action)

                if next_state not in g_score or new_g < g_score[next_state]:
                    g_score[next_state] = new_g
                    f_score = new_g + self.heuristic(next_state, goal)
                    heapq.heappush(frontier, (f_score, next(counter), next_state))
                    came_from[next_state] = (current, action)

        return None  # No path found

    def heuristic(self, state, goal):
        """Must be admissible: never overestimates true cost"""
        # Example: Manhattan distance for grid-based planning
        return abs(state.x - goal.x) + abs(state.y - goal.y)

A* Complexity:

  • Time: O(b^d) worst case, but typically much better with a good heuristic
  • Space: O(b^d) -- stores all expanded nodes
  • Optimality: Guaranteed when h(n) is admissible

Key Properties of A* for the NCP-AAI Exam:

| Property | Description | Exam Relevance |
|---|---|---|
| Completeness | A* will always find a solution if one exists (given finite branching) | Know that A* never gives up prematurely |
| Optimality | Guaranteed optimal when h(n) is admissible | Understand admissibility = never overestimates |
| Consistency | If h(n) <= cost(n, n') + h(n'), A* is optimally efficient | Stronger than admissibility; means no node is re-expanded |
| Space complexity | O(b^d) -- stores all expanded nodes in memory | This is A*'s primary limitation for large state spaces |

When A* is the Wrong Choice: A* requires an explicit state space with well-defined transitions. For open-ended planning problems where the state space is not enumerable (e.g., creative writing, open-ended research), LLM-based planning or ToT is more appropriate. On the NCP-AAI exam, if the scenario lacks a clear state-transition model, A* is likely a distractor answer.

Use Cases for NCP-AAI: Navigation, resource allocation with cost optimization, finding optimal action sequences in well-defined state spaces, robotic motion planning.
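A self-contained grid-navigation example makes the class above concrete; everything here (`astar_grid`, the grid size, the walls) is illustrative:

```python
import heapq

def astar_grid(start, goal, walls, size=5):
    """A* on a size x size grid; Manhattan distance is an admissible heuristic."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    # Frontier entries: (f_score, g_score, position, path_so_far)
    frontier = [(h(start), 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
            nxt = (pos[0] + dx, pos[1] + dy)
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in walls:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None  # Goal unreachable

path = astar_grid((0, 0), (2, 2), walls={(1, 0), (1, 1)})
print(len(path) - 1)  # → 4 moves (optimal, detouring around the walls)
```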

Monte Carlo Tree Search (MCTS)

MCTS builds a search tree incrementally through random simulations, balancing exploration of new paths against exploitation of known good paths. It is the core search algorithm behind AlphaGo and many game-playing agents.

Four Phases:

1. SELECTION      2. EXPANSION     3. SIMULATION     4. BACKPROPAGATION
   ┌─●─┐            ┌─●─┐           ┌─●─┐              ┌─●─┐
   │   │            │   │           │   │              │   │
   ●   ●            ●   ●           ●   ●              ●   ●
   │                │   │           │   │              │   │
   ●← (UCB1)       ●   ○←NEW      ●   ○              ●   ○←UPDATE
                                       │                  │
                                       ↓ random          ↑ reward
                                    (playout)         (propagate)

Implementation:

import math
import random

class MCTSNode:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action
        self.children = []
        self.visits = 0
        self.reward = 0.0
        self.untried_actions = state.get_legal_actions()

    def ucb1_score(self, exploration_param=1.414):
        """UCB1: balance exploitation vs exploration"""
        if self.visits == 0:
            return float('inf')
        exploitation = self.reward / self.visits
        exploration = exploration_param * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )
        return exploitation + exploration

    def select_child(self):
        """Select child with highest UCB1 score"""
        return max(self.children, key=lambda c: c.ucb1_score())

    def is_fully_expanded(self):
        return len(self.untried_actions) == 0

class MCTSPlanner:
    def __init__(self, n_iterations=1000):
        self.n_iterations = n_iterations

    def plan(self, root_state):
        root = MCTSNode(root_state)

        for _ in range(self.n_iterations):
            node = root
            state = root_state.clone()

            # Phase 1: Selection (traverse tree using UCB1)
            while node.is_fully_expanded() and node.children:
                node = node.select_child()
                state.apply_action(node.action)

            # Phase 2: Expansion (add one new child)
            if node.untried_actions:
                action = random.choice(node.untried_actions)
                node.untried_actions.remove(action)
                state.apply_action(action)
                child = MCTSNode(state, parent=node, action=action)
                node.children.append(child)
                node = child

            # Phase 3: Simulation (random playout to terminal state)
            sim_state = state.clone()
            while not sim_state.is_terminal():
                action = random.choice(sim_state.get_legal_actions())
                sim_state.apply_action(action)
            reward = sim_state.get_reward()

            # Phase 4: Backpropagation (update statistics up the tree)
            while node is not None:
                node.visits += 1
                node.reward += reward
                node = node.parent

        # Return action of most-visited child (most robust choice)
        return max(root.children, key=lambda c: c.visits).action

Use Cases: Game playing, exploration tasks with uncertain outcomes, high-stakes decisions where simulation is possible, robotic planning.

LLM-Based Planning

Use LLMs directly to generate structured plans from natural language goals:

import json

def llm_plan(goal, context, llm):
    prompt = f"""
    You are a task planning AI. Break down the following goal
    into a step-by-step plan.

    Goal: {goal}
    Current context: {context}

    Provide a detailed plan in JSON format:
    {{
        "steps": [
            {{"id": 1, "action": "...", "reasoning": "...", "dependencies": []}},
            {{"id": 2, "action": "...", "reasoning": "...", "dependencies": [1]}}
        ]
    }}
    """
    response = llm.predict(prompt)
    # In production, validate the schema and retry if the LLM returns malformed JSON
    plan = json.loads(response)
    return plan["steps"]

# Example
plan = llm_plan(
    goal="Deploy a new microservice to production",
    context="Current env: staging, tests passing, Docker image built",
    llm=llm
)

Advantages: Natural language I/O, handles novel situations, minimal domain engineering. Disadvantages: Non-deterministic, can hallucinate invalid plans, expensive per call.
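Once the plan JSON is parsed, the `dependencies` fields can drive execution order. A minimal sketch with illustrative step data (Kahn-style level ordering):

```python
def execution_order(steps):
    """Order step ids so every step runs after all of its dependencies."""
    done, order = set(), []
    pending = list(steps)
    while pending:
        # Ready = steps whose dependencies have all completed
        ready = [s for s in pending if set(s["dependencies"]) <= done]
        if not ready:
            raise ValueError("Cyclic or unsatisfiable dependencies")
        for s in ready:
            order.append(s["id"])
            done.add(s["id"])
            pending.remove(s)
    return order

steps = [
    {"id": 1, "action": "build_image", "dependencies": []},
    {"id": 2, "action": "push_image", "dependencies": [1]},
    {"id": 3, "action": "run_migrations", "dependencies": []},
    {"id": 4, "action": "deploy", "dependencies": [2, 3]},
]
print(execution_order(steps))
# → [1, 3, 2, 4]
```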

Comparing Advanced Algorithms

| Algorithm | Type | Optimality | Cost | Best For |
|---|---|---|---|---|
| A* | Heuristic search | Optimal (with admissible h) | O(b^d) time and space | Known state spaces with cost optimization |
| MCTS | Simulation-based | Converges to optimal | O(n_iterations) per decision | Uncertain outcomes, game-like scenarios |
| LLM Planning | Generative | No guarantee | Per-token API cost | Novel situations, natural language goals |
| BFS/DFS | Uninformed search | Complete (BFS) / Not guaranteed (DFS) | O(b^d) / O(b*d) | Simple state spaces, proof of concept |

Integrating Planning with Memory

Planning does not happen in isolation. Production agents combine planning with memory systems to improve plan quality over time. This integration is an important NCP-AAI concept that bridges the Planning and Memory domains.

Short-Term Memory for Active Plans

During plan execution, the agent maintains a working memory of the current plan state, completed steps, pending steps, and intermediate results:

class PlanningWithMemory:
    def __init__(self, planner, executor, memory):
        self.planner = planner
        self.executor = executor
        self.memory = memory  # Working memory

    def execute_plan(self, goal):
        plan = self.planner.create_plan(goal)

        # Store plan in working memory
        self.memory.store("current_plan", plan)
        self.memory.store("completed_steps", [])
        self.memory.store("plan_status", "in_progress")

        # A while loop (not a for loop) so replanning can swap out the remaining steps
        while plan:
            # Check if plan needs revision based on new information
            if self.memory.get("needs_replan"):
                completed = self.memory.get("completed_steps")
                plan = self.planner.replan(goal, completed)
                self.memory.store("current_plan", plan)
                self.memory.store("needs_replan", False)

            step = plan.pop(0)
            result = self.executor.run(step)
            self.memory.append("completed_steps", (step, result))

            # Store observations for future planning
            self.memory.store_observation(step, result)

        self.memory.store("plan_status", "completed")

Long-Term Memory for Plan Reuse

Successful plans can be cached and retrieved for similar future tasks, dramatically reducing planning latency:

class PlanMemoryStore:
    def __init__(self, vector_store):
        self.vector_store = vector_store

    def store_successful_plan(self, goal, plan, outcome):
        """Store a successful plan for future retrieval"""
        embedding = self.embed(goal)
        self.vector_store.add(
            embedding=embedding,
            metadata={"goal": goal, "plan": plan, "outcome": outcome}
        )

    def retrieve_similar_plan(self, new_goal, threshold=0.85):
        """Find a previously successful plan for a similar goal"""
        embedding = self.embed(new_goal)
        results = self.vector_store.search(embedding, top_k=3)

        for result in results:
            if result.similarity >= threshold:
                return result.metadata["plan"]  # Reuse with adaptation

        return None  # No similar plan found; create new plan

This pattern -- plan memoization via semantic similarity -- is especially powerful for enterprise agents that handle recurring types of requests. On the NCP-AAI exam, questions about improving planning efficiency often have "plan caching" or "plan reuse" as the correct answer.
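A toy version of the similarity gate, with hand-made vectors standing in for real embeddings (the cache contents and threshold are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hand-made "embeddings" of stored goals mapped to their cached plans
plan_cache = {
    (1.0, 0.0, 0.2): ["search_flights", "select_flight", "complete_payment"],
}

def retrieve_similar_plan(query_vec, threshold=0.85):
    """Return the cached plan whose goal embedding is closest, if close enough."""
    best = max(plan_cache, key=lambda v: cosine(v, query_vec))
    return plan_cache[best] if cosine(best, query_vec) >= threshold else None

print(retrieve_similar_plan((0.9, 0.1, 0.2)))
# → ['search_flights', 'select_flight', 'complete_payment']
```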


Hierarchical Planning: Plan-and-Execute Pattern

The Plan-and-Execute pattern separates strategic planning (slow, thoughtful) from tactical execution (fast, reactive). A high-level planner creates the overall strategy, then a ReAct agent executes each step.

class HierarchicalPlanner:
    def __init__(self, planner_llm, executor_agent):
        self.planner = planner_llm
        self.executor = executor_agent

    def solve(self, goal):
        # Step 1: Generate high-level plan
        plan_prompt = f"""
        Goal: {goal}

        Create a step-by-step plan to achieve this goal.
        Each step should be a concrete, executable subgoal.

        Plan:
        """
        plan = self.planner.predict(plan_prompt)
        steps = self._parse_plan(plan)

        # Step 2: Execute each step with a ReAct agent
        # (index-based loop so replanning can replace the remaining steps)
        results = []
        i = 0
        while i < len(steps):
            step = steps[i]
            print(f"Executing Step {i+1}: {step}")
            try:
                result = self.executor.run(step)
                results.append({"step": step, "result": result, "status": "success"})
            except Exception as e:
                results.append({"step": step, "error": str(e), "status": "failed"})
                # Replan from current state
                remaining = self._replan(goal, results, steps[i+1:])
                steps = steps[:i+1] + remaining
            i += 1

        # Step 3: Synthesize final answer
        synthesis_prompt = f"""
        Goal: {goal}
        Execution Results: {results}

        Synthesize a final answer:
        """
        return self.planner.predict(synthesis_prompt)

Benefits:

  • Separates strategic planning from execution
  • Reduces cognitive load on the executor agent
  • Enables replanning if individual steps fail
  • Scales to complex multi-step workflows


Multi-Agent Planning

Centralized Planning

A single planner coordinates all agents, assigning sub-goals and resolving conflicts globally:

class CentralizedPlanner:
    def plan_for_agents(self, agents, global_goal):
        sub_goals = self.decompose_goal(global_goal, len(agents))
        plans = {}
        for agent, sub_goal in zip(agents, sub_goals):
            plans[agent.id] = self.create_plan(agent, sub_goal)
        return self.coordinate_plans(plans)  # Resolve conflicts

Pros: Globally optimal coordination, no conflicting actions. Cons: Single point of failure, does not scale to many agents.

Decentralized Planning

Each agent plans independently and coordinates through communication:

class DecentralizedAgent:
    def plan_and_coordinate(self, goal):
        my_plan = self.plan(goal)

        # Share intentions with neighbors
        for neighbor in self.neighbors:
            neighbor.receive_intention(self.id, my_plan)

        # Receive and incorporate neighbor plans
        neighbor_plans = self.receive_intentions()

        # Adjust plan to avoid conflicts
        return self.resolve_conflicts(my_plan, neighbor_plans)

Pros: Scalable, robust to individual agent failure. Cons: Suboptimal (no global view), requires communication protocols.
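A toy sketch of conflict resolution by agent priority; all names and the "lower id wins" rule are illustrative:

```python
def resolve_conflicts(my_id, my_plan, neighbor_plans):
    """Yield any resource a higher-priority neighbor also claims.
    Plans map resource -> intended action; lower agent id = higher priority."""
    resolved = dict(my_plan)
    for other_id, plan in neighbor_plans.items():
        for resource in plan:
            if resource in resolved and other_id < my_id:
                del resolved[resource]  # Concede; this sub-goal gets replanned
    return resolved

mine = {"dock_a": "load", "dock_b": "unload"}
neighbors = {1: {"dock_a": "inspect"}}   # Agent 1 also wants dock_a
print(resolve_conflicts(2, mine, neighbors))
# → {'dock_b': 'unload'}
```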

Hierarchical Multi-Level Planning

High-level planner assigns strategic goals; low-level planners handle tactical execution:

┌──────────────────────────┐
│   High-Level Planner     │  (Strategic goals)
└────────────┬─────────────┘
             │
      ┌──────┴──────┐
      v             v
  Mid-Level 1   Mid-Level 2     (Tactical plans)
      │             │
   ┌──┴──┐      ┌──┴──┐
   v     v      v     v
  Low   Low    Low   Low        (Primitive actions)

Use Case: Large-scale systems (factory automation, traffic management, enterprise workflows).


NVIDIA Platform Tools for Planning

NeMo Agent Toolkit Planning Modules

The NVIDIA NeMo Agent Toolkit provides built-in support for multiple planning strategies through configurable planning modules:

from nemo_agent import Agent, PlanningStrategy

agent = Agent(
    model="nvidia/llama-3-70b-nemo",
    planning_strategy=PlanningStrategy.REACT,
    max_planning_steps=10,
    planning_timeout=30  # seconds
)

Three Planning Strategies in NeMo Agent Toolkit:

| Strategy | Constant | Behavior | When to Use |
|---|---|---|---|
| ReAct | PlanningStrategy.REACT | Explicit reasoning + action loop with tool calls | Most production agents (default) |
| CoT | PlanningStrategy.COT | Step-by-step reasoning without action execution | Analysis, explanation, reasoning-only tasks |
| Direct | PlanningStrategy.DIRECT | No intermediate planning; single-pass response | Simple queries, low-latency requirements |

Key Concept

For the NCP-AAI exam, know the three NeMo Agent Toolkit planning strategies: REACT (reasoning + actions, most flexible), COT (reasoning only, no tool calls), and DIRECT (fastest, no planning overhead). The exam tests your ability to select the appropriate strategy based on task requirements.

NeMo Agent Toolkit ReAct Configuration

# ReAct agent with tools
from nemo_agent import Agent, Tool

search_tool = Tool(
    name="wikipedia_search",
    description="Search Wikipedia for factual information",
    endpoint="https://api.wikipedia.org/search"
)

agent = Agent(
    model="nvidia/llama-3-70b-nemo",
    planning_strategy=PlanningStrategy.REACT,
    tools=[search_tool],
    max_planning_steps=10,
    planning_timeout=30
)

result = agent.run("What year was the Eiffel Tower built and how tall is it?")
# Agent uses ReAct loop: Thought → Action (search) → Observation → Thought → Answer

NVIDIA AIQ Toolkit Agent Graph

Visualize and execute agent workflows as directed graphs with conditional edges:

agent_graph:
  nodes:
    - id: search
      type: tool_call
      tool: search_database
    - id: analyze
      type: llm_reasoning
      prompt: "Analyze search results"
    - id: respond
      type: tool_call
      tool: send_response
    - id: fallback
      type: llm_reasoning
      prompt: "Generate response from cached data"
  edges:
    - from: search
      to: analyze
      condition: "search.status == success"
    - from: search
      to: fallback
      condition: "search.status == failure"    # Error recovery
    - from: analyze
      to: respond
    - from: fallback
      to: respond

NCP-AAI Exam Focus: Understand conditional edges (branching logic based on node outputs) and how they enable error recovery through fallback paths.
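The branching behavior can be illustrated with a tiny graph interpreter. This is purely illustrative Python, not the AIQ Toolkit API; nodes are callables and each edge carries an optional condition on the node's status:

```python
def run_graph(nodes, edges, start):
    """Walk a directed graph, taking the first edge whose condition holds."""
    trace = [start]
    current = start
    while True:
        status = nodes[current]()               # Execute the node
        nxt = None
        for src, dst, cond in edges:
            if src == current and (cond is None or cond(status)):
                nxt = dst
                break
        if nxt is None:                         # No outgoing edge: done
            return trace
        trace.append(nxt)
        current = nxt

nodes = {
    "search":   lambda: "failure",              # Simulate a failed search
    "analyze":  lambda: "success",
    "fallback": lambda: "success",
    "respond":  lambda: "success",
}
edges = [
    ("search", "analyze", lambda s: s == "success"),
    ("search", "fallback", lambda s: s == "failure"),   # Error recovery path
    ("analyze", "respond", None),
    ("fallback", "respond", None),
]
print(run_graph(nodes, edges, "search"))
# → ['search', 'fallback', 'respond']
```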

LangChain PlanAndExecute Agent

from langchain.agents import PlanAndExecute, load_agent_executor, load_chat_planner

planner = load_chat_planner(llm)
executor = load_agent_executor(llm, tools, verbose=True)

agent = PlanAndExecute(planner=planner, executor=executor)
result = agent.run("Book a flight from SF to NYC and reserve a hotel")

LlamaIndex Workflow Engine

from llama_index.core import Workflow

workflow = Workflow()
workflow.add_step("research", research_agent)
workflow.add_step("plan", planning_agent)
workflow.add_step("execute", execution_agent)
workflow.add_dependency("research", "plan")
workflow.add_dependency("plan", "execute")

result = workflow.run(input="Analyze competitor products and recommend strategy")

Best Practices for Production Planning Systems

Eight Rules for Production Planning

  1. Set planning timeouts to prevent infinite search loops
  2. Cache common plans for repeated tasks (plan memoization)
  3. Implement plan validation before execution (check preconditions)
  4. Use hierarchical planning for complex multi-step tasks
  5. Enable replanning for dynamic environments (continual planning)
  6. Monitor plan execution and log decisions for debugging
  7. Balance planning time vs. execution quality -- do not over-plan simple tasks
  8. Add fallback strategies for when primary plans fail
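Rule 1 (planning timeouts) can be sketched with a worker thread; a minimal illustration — production code would also need to cancel or sandbox the abandoned planner:

```python
import threading

def call_with_timeout(fn, args=(), timeout=5.0):
    """Run fn(*args) in a thread; return None if it exceeds the timeout."""
    result = {}

    def worker():
        result["value"] = fn(*args)

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    t.join(timeout)
    if t.is_alive():
        return None   # Planner took too long; caller should fall back
    return result.get("value")

print(call_with_timeout(lambda x: x * 2, args=(21,)))
# → 42
```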

Planning Performance Optimization

class OptimizedPlanner:
    def __init__(self):
        self.plan_cache = {}
        self.timeout = 5.0  # seconds
        self.max_retries = 3

    def plan_with_optimizations(self, goal):
        # 1. Check cache first
        cache_key = self._normalize_goal(goal)
        if cache_key in self.plan_cache:
            cached_plan = self.plan_cache[cache_key]
            if self.is_still_valid(cached_plan):
                return cached_plan

        # 2. Plan with timeout (create_plan is the underlying planner;
        #    _timeout_call aborts searches that run too long)
        plan = self._timeout_call(self.create_plan, args=(goal,), timeout=self.timeout)

        # 3. Validate before caching
        if plan and self.validate_plan(plan):
            self.plan_cache[cache_key] = plan

        return plan

    def validate_plan(self, plan):
        """Check preconditions and executability"""
        for step in plan:
            if not self.check_preconditions(step):
                return False
        return True

Common Planning Pitfalls

Avoid these mistakes both in production systems and on the NCP-AAI exam:

| Pitfall | Description | Solution |
|---|---|---|
| Over-planning | Spending too much time planning vs. executing | Set timeouts; use DIRECT mode for simple tasks |
| Ignoring uncertainty | Assuming the environment is static | Use continual planning or ReAct |
| No replanning | Failing to adapt when plans fail | Add fallback strategies and error recovery |
| Invalid preconditions | Assuming preconditions that do not hold | Validate preconditions before each step |
| Brittle plans | Plans that fail on minor deviations | Build in tolerance and alternative paths |
| Infinite loops | Circular dependencies or goal conflicts | Set max iterations and detect cycles |
| Wrong strategy | Using ToT for simple tasks or CoT for tool-heavy tasks | Apply the decision framework above |

NCP-AAI Exam Preparation

Key Planning Concepts by Domain

Agent Design and Cognition (15%):

  1. CoT prompting techniques (zero-shot, few-shot, self-consistency)
  2. ReAct loop structure and execution trace
  3. ToT search algorithms (BFS, DFS, best-first)
  4. Five classical planning approaches (forward, backward, HTN, POP, continual)
  5. Goal decomposition and task hierarchies

Agent Development (15%):

  1. Implementing ReAct agents in LangChain
  2. NeMo Agent Toolkit planning modules (REACT, COT, DIRECT)
  3. Custom planning loops and plan validation
  4. Integrating planning with memory and tools
  5. Handling planning failures, retries, and fallbacks

Agent Architecture (15%):

  1. Choosing the right planning strategy for the scenario
  2. Multi-agent planning (centralized vs. decentralized)
  3. Hierarchical plan-and-execute patterns
  4. A* and MCTS for optimization problems

Common NCP-AAI Exam Traps for Planning Questions

Exam Traps to Avoid

Trap 1: CoT for tool-heavy tasks. If the scenario involves API calls, database queries, or external tools, CoT alone is always wrong. CoT cannot execute actions.

Trap 2: ReAct is always best. ReAct is the default, but not always optimal. For pure reasoning (math, logic), CoT is cheaper and faster. For problems requiring exploration of alternatives, ToT is superior.

Trap 3: Over-planning in dynamic environments. If the question describes a changing environment, do not choose a fully upfront planning approach. Continual planning or ReAct with replanning is correct.

Trap 4: Ignoring HTN for structured workflows. When the domain has clear task hierarchies (e.g., "deploy software," "book travel"), HTN is the intended answer, not generic LLM planning.

Trap 5: Confusing POP with sequential planning. If the question asks about maximizing parallelism or minimizing total execution time, partial-order planning is the answer, not forward planning.



Error Recovery and Replanning Strategies

Error recovery is a critical planning capability tested on the NCP-AAI exam. When a plan step fails, the agent must decide between several recovery strategies:

Recovery Strategy Hierarchy

Plan Step Fails
│
├── 1. Retry (transient error?)
│     └── Same action, same parameters, up to N retries
│
├── 2. Alternative action (same goal, different method)
│     └── Try a different tool/API that achieves the same sub-goal
│
├── 3. Partial replan (adjust remaining steps)
│     └── Keep completed steps, replan from current state
│
├── 4. Full replan (start over with new strategy)
│     └── Discard current plan, create entirely new approach
│
└── 5. Graceful degradation (reduce scope)
      └── Achieve a subset of the original goal, inform user of limitations

Implementation Pattern

class ResilientPlanner:
    def execute_with_recovery(self, goal, plan):
        for i, step in enumerate(plan):
            success = False

            # Level 1: Retry (transient errors)
            for attempt in range(self.max_retries):
                result = self.execute_action(step)
                if result.success:
                    success = True
                    break

            if not success:
                # Level 2: Alternative action (same sub-goal, different tool)
                alternatives = self.get_alternative_actions(step)
                for alt_action in alternatives:
                    result = self.execute_action(alt_action)
                    if result.success:
                        success = True
                        break

            if not success:
                # Level 3: Partial replan (keep completed steps)
                completed = plan[:i]
                remaining_goal = self.compute_remaining_goal(goal, completed)
                new_plan = self.planner.replan(
                    current_state=self.get_state(),
                    goal=remaining_goal,
                    constraints={"avoid": [step]}  # Don't repeat failed approach
                )
                if new_plan:
                    return self.execute_with_recovery(remaining_goal, new_plan)

                # Level 4: Full replan (discard current plan, new strategy)
                fresh_plan = self.planner.replan(
                    current_state=self.get_state(),
                    goal=goal,
                    constraints={"avoid": [step]}
                )
                if fresh_plan and fresh_plan != plan:
                    return self.execute_with_recovery(goal, fresh_plan)

                # Level 5: Graceful degradation (reduce scope)
                return self.degrade_gracefully(goal, completed)

        return "Goal achieved"

Error Recovery Exam Scenarios

The NCP-AAI frequently tests error recovery with scenarios like:

  • Database timeout: Agent switches to cached data (fallback strategy)
  • API rate limit: Agent queues requests and implements exponential backoff (retry with delay)
  • Tool unavailable: Agent uses an alternative tool that provides similar functionality (alternative action)
  • Partial results: Agent adjusts plan to work with incomplete data (graceful degradation)
  • Conflicting information: Agent adds a verification step before proceeding (plan amendment)
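The "retry with delay" strategy from the rate-limit scenario is worth knowing concretely. A minimal sketch of exponential backoff, where `TransientError` and the parameter defaults are illustrative assumptions:

```python
import time

class TransientError(Exception):
    """Raised for errors worth retrying (e.g. rate limits, timeouts)."""

def retry_with_backoff(action, max_retries=4, base_delay=1.0):
    """Retry a flaky action with exponentially increasing delays."""
    for attempt in range(max_retries):
        try:
            return action()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # Out of retries: escalate to the next recovery level
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Note that when retries are exhausted the error propagates rather than being swallowed, so the caller can escalate to the next level of the recovery hierarchy (alternative action or replanning).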

Key Concept

On the NCP-AAI exam, error recovery questions typically ask: "What planning feature enables recovery when step X fails?" The answer framework is: (1) retry for transient errors, (2) conditional branching/fallback for known failure modes, (3) replanning for unexpected failures, and (4) graceful degradation when no recovery is possible.
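That four-part answer framework can be expressed as a simple classifier. The attribute names on the error object (`transient`, `known_fallback`, `recoverable`) are hypothetical, chosen purely to illustrate the decision order:

```python
# Illustrative classifier for the four-level framework; the attribute
# names on the error object are hypothetical, not from a real library.
def classify_and_recover(error) -> str:
    """Return the recovery level the framework prescribes."""
    if getattr(error, "transient", False):
        return "retry"                       # (1) transient errors
    fallback = getattr(error, "known_fallback", None)
    if fallback:
        return f"fallback: {fallback}"       # (2) known failure modes
    if getattr(error, "recoverable", True):
        return "replan"                      # (3) unexpected failures
    return "degrade gracefully"              # (4) no recovery possible
```

The ordering matters: cheap recovery (retry) is attempted before expensive recovery (replanning), and degradation is strictly a last resort.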


Week-by-Week Approach

Week 1: Foundations

  • Learn CoT, ReAct, and ToT frameworks and their research origins
  • Implement basic CoT prompts (zero-shot and few-shot)
  • Understand when to use each strategy (decision framework)
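For the Week 1 exercise, the two basic CoT prompt styles can be sketched as string templates. The zero-shot trigger phrase "Let's think step by step" comes from the zero-shot CoT literature; the worked example and the model call itself are omitted assumptions:

```python
# Minimal zero-shot vs. few-shot CoT prompt templates (model call omitted).
question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

# Zero-shot CoT: append the reasoning trigger phrase to the bare question.
zero_shot_cot = f"{question}\nLet's think step by step."

# Few-shot CoT: prepend a worked example that demonstrates the reasoning style.
few_shot_cot = (
    "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. How many now?\n"
    "A: He bought 2 * 3 = 6 balls. 5 + 6 = 11. The answer is 11.\n"
    f"Q: {question}\nA:"
)
```

The difference tested on the exam: zero-shot CoT needs no examples (cheapest to author), while few-shot CoT demonstrates the desired reasoning format and typically yields more reliable intermediate steps.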

Week 2: Classical Planning

  • Study forward, backward, HTN, POP, and continual planning
  • Implement a simple HTN planner for a real-world domain (trip planning, software deployment)
  • Understand A* optimality guarantees and MCTS exploration-exploitation trade-off

Week 3: NVIDIA Tools and Production Patterns

  • Study NeMo Agent Toolkit planning modules (REACT, COT, DIRECT)
  • Learn AIQ Toolkit agent graph configuration with conditional edges
  • Practice hierarchical plan-and-execute patterns with LangChain

Week 4: Integration and Exam Practice

  • Study planning + memory integration patterns
  • Practice error recovery and replanning scenarios
  • Take Preporato practice tests focused on planning questions
  • Review common exam traps and distractor patterns

Week 5: Multi-Agent and Advanced Topics

  • Study centralized vs. decentralized multi-agent planning
  • Review MCTS UCB1 formula and A* heuristic properties
  • Final review with timed practice exams

Master Planning Strategies with Preporato

Excel at planning and reasoning questions on the NCP-AAI exam. Preporato's comprehensive practice bundle covers:

  • 150+ planning and reasoning questions across all strategies (ReAct, CoT, ToT, HTN, MCTS)
  • Scenario-based problems requiring you to select optimal planning approaches
  • NVIDIA tool usage questions (NeMo Agent Toolkit, AIQ Toolkit)
  • Error recovery challenges (replanning, fallback strategies, conditional branching)
  • Code-based questions on LangChain, NeMo Agent Toolkit, and custom planners

Summary

Planning strategies determine an agent's problem-solving capability. Here is the complete hierarchy of what the NCP-AAI exam tests:

LLM-Native Strategies:

  • Chain-of-Thought: Step-by-step reasoning for logic-heavy, action-free tasks (lowest cost)
  • ReAct: Interleaved reasoning and actions for dynamic, tool-based tasks (NCP-AAI favorite)
  • Tree of Thoughts: Multi-path exploration for strategic and optimization-heavy problems (highest cost)

Classical Planning Approaches:

  • Forward Planning: Intuitive start-to-goal search for small state spaces
  • Backward Planning: Goal-directed regression for well-defined preconditions
  • HTN Planning: Hierarchical decomposition for structured workflows (most common in production)
  • Partial-Order Planning: Flexible ordering for parallelizable tasks
  • Continual Planning: Interleaved planning and execution for dynamic environments
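The parallelism payoff of partial-order planning can be made concrete: given only the ordering constraints a POP commits to, topological layering (Kahn-style) yields batches of steps that can run concurrently. A minimal sketch, with illustrative trip-planning step names:

```python
from collections import defaultdict

def parallel_batches(steps, orderings):
    """Group steps into batches that may execute in parallel,
    respecting only the given (before, after) ordering constraints."""
    indegree = {s: 0 for s in steps}
    successors = defaultdict(list)
    for before, after in orderings:
        successors[before].append(after)
        indegree[after] += 1

    batches = []
    ready = [s for s in steps if indegree[s] == 0]
    while ready:
        batches.append(sorted(ready))  # everything unconstrained runs together
        next_ready = []
        for s in ready:
            for t in successors[s]:
                indegree[t] -= 1
                if indegree[t] == 0:
                    next_ready.append(t)
        ready = next_ready
    return batches

# Only "fly" is constrained; the other three steps form one parallel batch.
plan = parallel_batches(
    ["book_flight", "book_hotel", "pack", "fly"],
    [("book_flight", "fly"), ("pack", "fly")],
)
```

A fully sequential (forward) plan would need four time steps here; the partial order needs only two, which is exactly the "minimizing total execution time" signal the exam uses.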

Advanced Algorithms:

  • A*: Optimal pathfinding with admissible heuristics
  • MCTS: Simulation-based search with UCB1 exploration-exploitation balance
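The UCB1 score that MCTS uses for child selection balances exploitation (mean reward) against exploration (a visit-count bonus). A minimal sketch of the standard formula:

```python
import math

def ucb1(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1: mean_reward + c * sqrt(ln(parent_visits) / child_visits)."""
    if child_visits == 0:
        return float("inf")  # unvisited children are always selected first
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return mean_reward + exploration
```

As a child accumulates visits its exploration bonus shrinks, so selection gradually shifts from exploring under-sampled actions to exploiting high-reward ones.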

NVIDIA Tools:

  • NeMo Agent Toolkit: REACT, COT, and DIRECT planning modules
  • AIQ Toolkit: Agent graphs with conditional edges for workflow orchestration


Ready to Pass the NCP-AAI Exam?

Join thousands who passed with Preporato practice tests

Instant access · 30-day guarantee · Updated monthly