
Agent Planning Strategies: ReAct, Chain-of-Thought, and Tree of Thoughts for NCP-AAI

Preporato Team · December 10, 2025 · 15 min read · NCP-AAI

Planning—the ability to break down complex goals into executable steps—is what separates advanced agentic AI systems from simple chatbots. The NVIDIA NCP-AAI certification heavily emphasizes planning strategies, as they determine an agent's capability to solve multi-step problems, reason about consequences, and optimize action sequences. This comprehensive guide explores the three foundational planning paradigms tested on the NCP-AAI exam: Chain-of-Thought (CoT), ReAct, and Tree of Thoughts (ToT).

Why Planning Matters for Agentic AI

Planning enables agents to:

  • Decompose complex tasks into manageable sub-tasks
  • Reason about action sequences before execution
  • Anticipate obstacles and plan contingencies
  • Optimize for goals (shortest path, lowest cost, highest success rate)
  • Handle ambiguous or underspecified requests

NCP-AAI Coverage:

  • Agent Design and Cognition domain (15%): Planning algorithms and reasoning patterns
  • Agent Development domain (15%): Implementing planning mechanisms
  • Agent Architecture domain (15%): Choosing appropriate planning strategies

The Planning Challenge

Without planning, agents exhibit:

  • Myopic behavior: Short-sighted decisions without considering future consequences
  • Action thrashing: Inefficient trial-and-error without strategic thinking
  • Goal confusion: Losing track of the original objective in multi-step tasks

Example:

Task: "Book a flight to Paris for next week"

Without Planning (Bad):
Agent: "What dates work for you?"
User: "Monday to Friday"
Agent: "Let me search flights..."
Agent: "Oh, I need your departure city. What city?"
User: "San Francisco"
Agent: "Searching... Oh, I need your budget. What's your budget?"
→ Inefficient, poor user experience

With Planning (Good):
Agent: "To book your flight, I need:
  1. Departure city
  2. Travel dates
  3. Budget range
  4. Seating preference
Can you provide these details?"
→ Strategic, efficient information gathering
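The "with planning" behavior above can be sketched as an agent that checks every required slot up front and asks for all missing ones in a single turn. The `BookingRequest` class and slot names below are illustrative, not from any specific framework:

```python
# Sketch: gather all required slots up front instead of discovering them one
# at a time. Field names and the BookingRequest class are hypothetical.
from dataclasses import dataclass, field

REQUIRED_SLOTS = ["departure_city", "travel_dates", "budget", "seating_preference"]

@dataclass
class BookingRequest:
    slots: dict = field(default_factory=dict)

    def missing_slots(self) -> list:
        """Return every required slot not yet provided, so the agent can
        request all of them in one message."""
        return [s for s in REQUIRED_SLOTS if s not in self.slots]

req = BookingRequest(slots={"travel_dates": "Monday to Friday"})
print(req.missing_slots())
# → ['departure_city', 'budget', 'seating_preference']
```

A planning agent asks for these three fields in one message, rather than discovering each gap after a failed search.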


Chain-of-Thought (CoT) Prompting

Definition: Chain-of-Thought prompting elicits step-by-step reasoning from LLMs by showing examples or instructing the model to "think through" problems.

Basic CoT

Prompt Template:

Question: [Problem]

Let's think step by step:

Example:

from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

cot_template = PromptTemplate(
    input_variables=["question"],
    template="""
    Question: {question}

    Let's solve this step by step:
    1. First, I'll identify what information I need
    2. Then, I'll break down the problem
    3. Next, I'll work through each step
    4. Finally, I'll provide the answer

    Solution:
    """
)

llm = ChatOpenAI(model="gpt-4", temperature=0)
chain = cot_template | llm

result = chain.invoke({
    "question": "If a store sells 15 items per hour and is open 8 hours per day, how many items are sold in a week?"
})

# Output:
# 1. Items per day = 15 items/hour × 8 hours = 120 items/day
# 2. Days per week = 7
# 3. Items per week = 120 items/day × 7 days = 840 items
# Answer: 840 items per week

Few-Shot CoT

Provide examples of reasoning:

few_shot_cot_template = """
Example 1:
Question: A bakery makes 48 cupcakes. If they pack 6 cupcakes per box, how many boxes do they need?
Reasoning:
- Total cupcakes: 48
- Cupcakes per box: 6
- Calculation: 48 ÷ 6 = 8
Answer: 8 boxes

Example 2:
Question: If a car travels at 60 mph for 2.5 hours, how far does it travel?
Reasoning:
- Speed: 60 miles per hour
- Time: 2.5 hours
- Formula: Distance = Speed × Time
- Calculation: 60 × 2.5 = 150
Answer: 150 miles

Now solve this:
Question: {question}
Reasoning:
"""

# Model learns to mimic reasoning structure

Zero-Shot CoT

No examples needed—just add "Let's think step by step":

zero_shot_cot_prompt = """
Question: {question}

Let's think step by step:
"""

# Surprisingly effective: large accuracy gains on many reasoning benchmarks

Research Finding (Kojima et al., 2022): Adding "Let's think step by step" improved zero-shot accuracy on the MultiArith math word problems from 17.7% to 78.7% (GPT-3, text-davinci-002).

CoT Strengths and Limitations

| Aspect | Strengths | Limitations |
|---|---|---|
| Use cases | Math, logic puzzles, reasoning | Real-world actions, tool use |
| Transparency | Shows reasoning process | Reasoning can be incorrect |
| Latency | Single LLM call | Longer output = more tokens |
| Reliability | Deterministic reasoning path | Prone to reasoning errors |

NCP-AAI Insight: CoT is best for reasoning-heavy, action-light tasks (analysis, planning, explanation) but insufficient for action-heavy tasks (API calls, tool use, multi-step execution).

ReAct: Reasoning + Acting

Definition: ReAct (Reasoning and Acting) interleaves reasoning traces with action execution, allowing agents to dynamically adjust plans based on environment feedback.

The ReAct Loop

Thought: [Reasoning about what to do next]
Action: [Tool/function to execute]
Action Input: [Arguments for the tool]
Observation: [Result from executing the action]
... (repeat Thought/Action/Observation)
Thought: I now know the final answer
Final Answer: [Response to user]
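The loop above can be implemented by hand in a few lines: parse the model's `Action`/`Action Input` lines, dispatch the tool, and append the `Observation` back into the transcript. The scripted `fake_llm` below stands in for a real model, and the regex and tool names are illustrative:

```python
# Minimal hand-rolled ReAct loop showing the parsing mechanics implied by
# the format above. `fake_llm` is a scripted stand-in for a real model.
import re

def calculator(expr: str) -> str:
    return str(eval(expr))  # demo only; eval is unsafe on untrusted input

TOOLS = {"Calculator": calculator}

def fake_llm(scratchpad: str) -> str:
    # A real model would continue the transcript; here we script two turns.
    if "Observation" not in scratchpad:
        return "Thought: I should compute it.\nAction: Calculator\nAction Input: 14000000 * 2"
    return "Thought: I now know the final answer\nFinal Answer: 28000000"

def react_loop(question: str, max_steps: int = 5) -> str:
    scratchpad = f"Question: {question}\n"
    for _ in range(max_steps):
        output = fake_llm(scratchpad)
        if "Final Answer:" in output:
            return output.split("Final Answer:")[1].strip()
        action = re.search(r"Action: (\w+)", output).group(1)
        action_input = re.search(r"Action Input: (.+)", output).group(1)
        observation = TOOLS[action](action_input)
        scratchpad += f"{output}\nObservation: {observation}\n"
    return "Max steps reached"

print(react_loop("What is 14000000 * 2?"))  # → 28000000
```

Frameworks like LangChain automate exactly this parse-dispatch-append cycle, as the implementation below shows.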

ReAct Implementation

from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain_openai import ChatOpenAI

# Define tools
def search_web(query: str) -> str:
    """Search the web for information"""
    # Implementation
    return f"Search results for '{query}': ..."

def calculator(expression: str) -> str:
    """Evaluate mathematical expressions"""
    try:
        # NOTE: eval is unsafe on untrusted input; use a math parser in production
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"

tools = [
    Tool(name="Search", func=search_web, description="Search the web for current information"),
    Tool(name="Calculator", func=calculator, description="Perform mathematical calculations")
]

# ReAct prompt template
react_prompt = """
Answer the following question as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: {agent_scratchpad}
"""

# Create agent (create_react_agent expects a PromptTemplate, not a raw string)
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_react_agent(llm, tools, PromptTemplate.from_template(react_prompt))
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=10)

# Execute
result = executor.invoke({"input": "What is the population of Tokyo multiplied by 2?"})

Execution Trace:

Thought: I need to find Tokyo's population, then multiply by 2
Action: Search
Action Input: "Tokyo population 2024"
Observation: Tokyo's population is approximately 14 million

Thought: Now I need to multiply 14 million by 2
Action: Calculator
Action Input: "14000000 * 2"
Observation: 28000000

Thought: I now know the final answer
Final Answer: The population of Tokyo (14 million) multiplied by 2 is 28 million.

ReAct Variants

1. ReAct with Self-Correction

Agent can revisit decisions:

# Extended prompt
react_selfcorrect_prompt = """
... (standard ReAct format)

If an action fails or returns unexpected results, reconsider your approach:
Thought: That didn't work as expected. Let me try a different approach.
Action: [Alternative action]
...
"""

# Example execution
"""
Thought: I'll search for "Tokyo population"
Action: Search
Action Input: "Tokyo population"
Observation: Error: Too many results, be more specific

Thought: That didn't work. Let me be more specific with the year.
Action: Search
Action Input: "Tokyo population 2024"
Observation: Tokyo's population is approximately 14 million
→ Self-correction leads to success
"""

2. ReAct with Reflection

Agent reflects on reasoning quality:

react_reflection_prompt = """
... (standard ReAct)

After completing the task, reflect:
Reflection: [Evaluate the quality of your reasoning and actions]
Improvements: [What could be done better next time]
"""

# Example
"""
Final Answer: 28 million

Reflection: My approach was effective—I systematically gathered information and performed calculations. The search query could have been more precise initially.

Improvements: Next time, include the year in the initial search query to avoid ambiguity.
"""

ReAct Advantages for NCP-AAI

  1. Grounded in reality: Actions provide real feedback, preventing hallucinations
  2. Transparent reasoning: Thought traces are interpretable and debuggable
  3. Dynamic adaptation: Can adjust strategy based on observations
  4. Tool integration: Natural fit for function calling and API interactions

NCP-AAI Exam Focus: ReAct is the default planning strategy for production agents due to its balance of reasoning and action.

ReAct Limitations

| Challenge | Impact | Mitigation |
|---|---|---|
| Verbose output | High token costs | Use cheaper models for reasoning |
| Linear planning | Doesn't explore alternatives | Combine with ToT (below) |
| Error accumulation | Early mistakes compound | Add reflection/self-correction |
| Max iterations | Can time out on complex tasks | Set appropriate limits |

Tree of Thoughts (ToT): Exploring Multiple Reasoning Paths

Definition: Tree of Thoughts generates multiple reasoning paths (branches), evaluates them, and selects the most promising direction—enabling search-based planning.

ToT Concepts

1. Thought Decomposition: Break the problem into intermediate steps (thoughts).

2. Thought Generation: Generate multiple candidate thoughts at each step.

3. State Evaluation: Score how likely each thought is to lead to a solution.

4. Search Algorithm: Traverse the tree with one of:

  • Breadth-First Search (BFS): Explore all options at each level
  • Depth-First Search (DFS): Explore one path deeply before backtracking
  • Best-First Search: Prioritize most promising paths

ToT Implementation

from langchain_openai import ChatOpenAI

class TreeOfThoughts:
    def __init__(self, llm, max_depth=3, branching_factor=3):
        self.llm = llm
        self.max_depth = max_depth
        self.branching_factor = branching_factor

    def generate_thoughts(self, problem, current_state, depth):
        """Generate multiple candidate next thoughts"""
        prompt = f"""
        Problem: {problem}
        Current reasoning: {current_state}

        Generate {self.branching_factor} different possible next steps in solving this problem.
        Format each as a numbered option:
        """

        response = self.llm.predict(prompt)
        # Parse the numbered options into a list of thoughts
        thoughts = self._parse_thoughts(response)
        return thoughts

    def _parse_thoughts(self, response: str) -> list:
        """Split numbered options ("1. ...", "2. ...") into a list of thoughts."""
        lines = [line.strip() for line in response.split("\n") if line.strip()]
        return [line.lstrip("0123456789.) ") for line in lines][:self.branching_factor]

    def evaluate_thought(self, problem, thought_sequence):
        """Evaluate how promising a thought sequence is (0-10)"""
        prompt = f"""
        Problem: {problem}
        Reasoning so far: {thought_sequence}

        On a scale of 0-10, how likely is this reasoning path to lead to a correct solution?
        Consider:
        - Logical coherence
        - Progress toward the goal
        - Avoiding dead ends

        Score (0-10):
        """

        response = self.llm.predict(prompt)
        try:
            score = float(response.strip())
        except ValueError:
            score = 0.0  # non-numeric reply; treat the path as unpromising
        return score

    def search(self, problem, algorithm="bfs"):
        """Search through the tree of thoughts (only BFS is implemented here)"""
        if algorithm == "bfs":
            return self._breadth_first_search(problem)
        raise NotImplementedError(f"Search algorithm '{algorithm}' not implemented")

    def _breadth_first_search(self, problem):
        """Explore all paths level by level"""
        queue = [("", 0)]  # (thought_sequence, depth)
        best_path = None
        best_score = -1

        while queue:
            current_state, depth = queue.pop(0)

            if depth >= self.max_depth:
                # Evaluate terminal node
                score = self.evaluate_thought(problem, current_state)
                if score > best_score:
                    best_score = score
                    best_path = current_state
                continue

            # Generate next thoughts
            thoughts = self.generate_thoughts(problem, current_state, depth)

            # Add to queue
            for thought in thoughts:
                new_state = current_state + "\n" + thought
                queue.append((new_state, depth + 1))

        return best_path, best_score

    def solve(self, problem):
        """Solve problem using Tree of Thoughts"""
        best_path, score = self.search(problem, algorithm="bfs")

        # Generate final answer from best path
        final_prompt = f"""
        Problem: {problem}
        Best reasoning path found:
        {best_path}

        Based on this reasoning, provide the final answer:
        """

        answer = self.llm.predict(final_prompt)
        return answer, best_path, score

# Usage
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
tot = TreeOfThoughts(llm, max_depth=3, branching_factor=3)

problem = "How can I optimize database queries in a high-traffic web application?"
answer, reasoning, score = tot.solve(problem)

print(f"Answer: {answer}")
print(f"Reasoning path: {reasoning}")
print(f"Confidence: {score}/10")
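The class above implements only BFS, which expands every branch. A best-first variant keeps a priority queue of partial reasoning paths, expands the highest-scoring one first, and prunes the rest. The sketch below uses injected `generate_fn` and `score_fn` callables standing in for the LLM-backed `generate_thoughts` and `evaluate_thought` methods; both names are illustrative, not a LangChain API:

```python
# Best-first (beam) search over thoughts: expand the most promising partial
# path first and keep only the top `beam_width` children at each node.
import heapq

def best_first_search(problem, generate_fn, score_fn, max_depth=3, beam_width=3):
    """generate_fn(problem, path, depth) -> candidate thoughts;
    score_fn(problem, path) -> numeric promise of a partial path."""
    # heapq is a min-heap, so store negative scores for max-first ordering.
    frontier = [(-score_fn(problem, ""), "", 0)]
    best_path, best_score = "", -1.0
    while frontier:
        neg_score, path, depth = heapq.heappop(frontier)
        score = -neg_score
        if depth >= max_depth:
            if score > best_score:
                best_score, best_path = score, path
            continue
        children = generate_fn(problem, path, depth)
        # Early pruning: keep only the top `beam_width` children.
        scored = sorted(((score_fn(problem, path + "\n" + t), t) for t in children),
                        reverse=True)
        for s, thought in scored[:beam_width]:
            heapq.heappush(frontier, (-s, path + "\n" + thought, depth + 1))
    return best_path.strip(), best_score
```

With LLM-backed callables, pruning caps the number of expanded nodes per level at `beam_width`, instead of letting the frontier grow as `branching_factor^depth`.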

ToT Execution Example

Problem: "Plan a 3-day trip to New York City on a $1000 budget"

Tree Exploration:

Root: "Plan NYC trip, 3 days, $1000"
│
├── Branch 1: "Focus on free attractions (museums, parks, walking tours)"
│   ├── Branch 1.1: "Stay in budget hotel ($100/night), use subway"
│   │   └── Branch 1.1.1: "Day 1: Central Park + Met, Day 2: Brooklyn Bridge + 9/11 Memorial..." [Score: 8/10]
│   ├── Branch 1.2: "Stay in hostel ($50/night), more budget for activities"
│   └── Branch 1.3: "Airbnb in Queens ($80/night), authentic experience"
│
├── Branch 2: "Prioritize iconic paid attractions (Empire State, Statue of Liberty)"
│   ├── Branch 2.1: "Buy CityPass ($140), budget accommodation"
│   │   └── Branch 2.1.1: "Day 1: ESB + Top of Rock, Day 2: Statue of Liberty..." [Score: 7/10]
│   └── Branch 2.2: "Focus on 2-3 major attractions, skip tourist traps"
│
└── Branch 3: "Cultural experience (Broadway, dining, neighborhoods)"
    ├── Branch 3.1: "One Broadway show ($150), street food, diverse neighborhoods"
    │   └── Branch 3.1.1: "Day 1: Broadway + Times Square, Day 2: Chinatown + Little Italy..." [Score: 6/10]
    └── Branch 3.2: "Skip Broadway, focus on food tours and local experiences"

Best Path Selected: Branch 1.1.1 (Score: 8/10)
Reasoning: Maximizes experiences within budget by focusing on free attractions while maintaining comfort.

ToT Advantages

  1. Explores alternatives: Considers multiple strategies before committing
  2. Handles complexity: Effective for problems with many possible approaches
  3. Avoids local optima: Can backtrack from dead ends
  4. Self-evaluation: Explicitly assesses reasoning quality

ToT Limitations

| Challenge | Impact | Solution |
|---|---|---|
| High cost | Many LLM calls (branching_factor^depth) | Prune low-scoring branches early |
| Latency | Slow for real-time applications | Use for planning phase only |
| Evaluation difficulty | Hard to score thought quality accurately | Train a value model or use heuristics |

NCP-AAI Use Case: ToT is ideal for strategic planning, optimization problems, and creative tasks where exploring alternatives is valuable.

Master These Concepts with Practice

Our NCP-AAI practice bundle includes:

  • 7 full practice exams (455+ questions)
  • Detailed explanations for every answer
  • Domain-by-domain performance tracking

30-day money-back guarantee

Comparing Planning Strategies

| Strategy | Best For | Latency | Cost | Transparency | NCP-AAI Exam Weight |
|---|---|---|---|---|---|
| Chain-of-Thought | Reasoning, math, logic | Low (1 call) | Low | High | Medium |
| ReAct | Tool use, multi-step tasks | Medium (N calls) | Medium | High | High |
| Tree of Thoughts | Strategic planning, optimization | High (exponential) | High | Medium | Medium |

When to Use Each Strategy

Chain-of-Thought:

# Example: Pure reasoning task
"Explain why the following code is inefficient and suggest improvements."
→ Single-pass reasoning, no actions needed → CoT

ReAct:

# Example: Information retrieval + processing
"Find the current stock price of Tesla and calculate its P/E ratio."
→ Requires actions (search, retrieve data, calculate) → ReAct

Tree of Thoughts:

# Example: Strategic decision with multiple options
"Design a microservices architecture for an e-commerce platform. Consider scalability, cost, and maintainability."
→ Many valid approaches, need to explore and compare → ToT
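The three rules of thumb above can be condensed into a rough dispatch heuristic. The task-feature flags below are illustrative; real routing would classify the task with an LLM or rules of your own:

```python
# Rough strategy router for the three examples above (hypothetical heuristic).
def choose_strategy(needs_tools: bool, many_valid_approaches: bool) -> str:
    if many_valid_approaches:
        return "tree-of-thoughts"   # explore and compare alternatives
    if needs_tools:
        return "react"              # interleave reasoning with actions
    return "chain-of-thought"       # single-pass reasoning is enough

print(choose_strategy(needs_tools=False, many_valid_approaches=False))  # chain-of-thought
print(choose_strategy(needs_tools=True, many_valid_approaches=False))   # react
```

Checking the expensive condition first matters: a task with many valid approaches usually also needs tools, but ToT's exploration is what the task actually requires.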

Advanced Planning Patterns

Hierarchical Planning (Plan-and-Execute)

Break complex goals into subgoals:

from langchain_openai import ChatOpenAI

class HierarchicalPlanner:
    def __init__(self, planner_llm, executor_agent):
        self.planner = planner_llm
        self.executor = executor_agent

    def solve(self, goal):
        # Step 1: Generate high-level plan
        plan_prompt = f"""
        Goal: {goal}

        Create a step-by-step plan to achieve this goal. Each step should be a concrete subgoal.

        Plan:
        """
        plan = self.planner.predict(plan_prompt)
        steps = self._parse_plan(plan)

        # Step 2: Execute each step with ReAct agent
        results = []
        for i, step in enumerate(steps):
            print(f"Executing Step {i+1}: {step}")
            result = self.executor.invoke({"input": step})["output"]
            results.append(result)

        # Step 3: Synthesize final answer
        synthesis_prompt = f"""
        Goal: {goal}
        Plan: {plan}
        Execution Results: {results}

        Synthesize a final answer:
        """
        final_answer = self.planner.predict(synthesis_prompt)
        return final_answer

    def _parse_plan(self, plan: str) -> list:
        """Split the plan text into one step per non-empty line."""
        return [line.strip() for line in plan.split("\n") if line.strip()]

# Usage
planner_llm = ChatOpenAI(model="gpt-4", temperature=0)
executor_agent = create_react_agent(...)  # ReAct agent with tools

hierarchical_agent = HierarchicalPlanner(planner_llm, executor_agent)
result = hierarchical_agent.solve("Analyze competitor pricing and recommend our pricing strategy")

# Output:
# Step 1: Identify main competitors → Executor uses Search tool
# Step 2: Gather pricing data for each competitor → Executor uses Scraper tool
# Step 3: Analyze pricing patterns → Executor uses Analytics tool
# Step 4: Recommend strategy based on analysis → Executor synthesizes

Benefits:

  • Separates strategic planning (slow, thoughtful) from execution (fast, reactive)
  • Reduces cognitive load on executor agent
  • Enables re-planning if steps fail
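The re-planning benefit can be sketched as a control loop: when a step fails, ask the planner for a revised plan instead of aborting. `plan_fn` and `execute_fn` are illustrative stand-ins for the planner and executor roles in `HierarchicalPlanner`:

```python
# Sketch of plan-execute-replan control flow (hypothetical callables).
def execute_with_replanning(goal, plan_fn, execute_fn, max_replans=2):
    """plan_fn(goal, failed_step) -> list of steps;
    execute_fn(step) -> (success: bool, result)."""
    plan = plan_fn(goal, failed_step=None)
    for _ in range(max_replans + 1):
        results = []
        for step in plan:
            ok, result = execute_fn(step)
            if not ok:
                # Re-plan around the failing step instead of aborting.
                plan = plan_fn(goal, failed_step=step)
                break
            results.append(result)
        else:
            return results  # every step succeeded
    raise RuntimeError("Exceeded re-planning budget")
```

Bounding `max_replans` matters for the same reason as `max_iterations` in ReAct: an agent that keeps re-planning a genuinely impossible goal never terminates.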

Monte Carlo Tree Search (MCTS)

Combine ToT with simulation:

class MCTSPlanner:
    def __init__(self, llm, simulations=10):
        self.llm = llm
        self.simulations = simulations

    def get_possible_actions(self, state):
        """Ask the LLM to propose candidate next actions (one per line)."""
        prompt = f"Current state: {state}\nList 3 possible next actions, one per line:"
        response = self.llm.predict(prompt)
        return [line.strip() for line in response.split("\n") if line.strip()]

    def select_action(self, state):
        """Select action with highest win rate from simulations"""
        actions = self.get_possible_actions(state)
        action_scores = {}

        for action in actions:
            wins = 0
            for _ in range(self.simulations):
                # Simulate outcome
                outcome = self.simulate(state, action)
                if outcome == "success":
                    wins += 1

            action_scores[action] = wins / self.simulations

        # Select action with highest success rate
        best_action = max(action_scores, key=action_scores.get)
        return best_action

    def simulate(self, state, action):
        """Simulate executing action from state"""
        prompt = f"""
        Current state: {state}
        Action taken: {action}

        Simulate the outcome: Will this likely succeed or fail?
        Answer with "success" or "failure" and brief reasoning.
        """
        response = self.llm.predict(prompt)
        return "success" if "success" in response.lower() else "failure"

Use Case: Games, robotic planning, high-stakes decisions where simulation is possible.

NCP-AAI Exam Preparation

Key Planning Concepts

Agent Design and Cognition (15%):

  1. CoT prompting techniques (zero-shot, few-shot)
  2. ReAct loop structure and execution
  3. ToT search algorithms (BFS, DFS, best-first)
  4. Hierarchical planning patterns

Agent Development (15%):

  1. Implementing ReAct agents in LangChain
  2. Custom planning loops
  3. Integrating planning with memory and tools
  4. Handling planning failures and retries

Practice Questions

Q1: A customer service agent needs to resolve a support ticket that may require multiple API calls. Which planning strategy is most appropriate?

A) Chain-of-Thought (single reasoning pass)
B) ReAct (interleaved reasoning and actions)
C) Tree of Thoughts (explore multiple resolutions)
D) No planning (direct response)

Answer: B - ReAct allows dynamic adaptation based on API responses, essential for multi-step troubleshooting.

Q2: An agent is designing a system architecture with many valid approaches. Time constraints allow exploring only 3-4 options thoroughly. Best strategy?

A) CoT with single solution
B) ReAct with first viable option
C) ToT with limited branching (3 branches, depth 2)
D) Random selection

Answer: C - ToT enables systematic exploration of alternatives within time constraints.

Q3: Which planning strategy has the LOWEST token cost for a 5-step task?

A) Chain-of-Thought (single pass)
B) ReAct (5 Thought/Action/Observation cycles)
C) Tree of Thoughts (branching factor 3, depth 5)
D) All have similar costs

Answer: A - CoT generates all reasoning in one pass; ReAct requires multiple LLM calls; ToT is exponentially expensive.
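The rough call counts behind Q3's answer can be worked out directly. The ToT figure assumes one generation call per expanded node; real implementations also pay evaluation calls, so treat these as lower bounds:

```python
# Approximate LLM-call counts for a 5-step task (illustrative accounting).
def llm_calls(strategy: str, steps: int = 5, branching: int = 3) -> int:
    if strategy == "cot":
        return 1       # all reasoning in one pass
    if strategy == "react":
        return steps   # one call per Thought/Action/Observation cycle
    # ToT expands every node at every level: b + b^2 + ... + b^depth
    return sum(branching ** d for d in range(1, steps + 1))

print(llm_calls("cot"), llm_calls("react"), llm_calls("tot"))
# → 1 5 363
```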

Master Planning Strategies with Preporato

Excel at planning and reasoning questions on the NCP-AAI exam. Preporato's comprehensive practice bundle covers:

✅ 150+ planning and reasoning questions across all strategies
✅ Hands-on implementations of CoT, ReAct, and ToT agents
✅ Scenario-based problems to choose optimal planning approaches
✅ Performance analysis comparing strategy latency and costs
✅ Code templates for LangChain, AutoGen, and custom planners

Exclusive Offer: Use code PLAN25 for 20% off—limited time only!


Summary

Planning strategies determine an agent's problem-solving capability:

  • Chain-of-Thought: Elicit step-by-step reasoning for logic-heavy tasks
  • ReAct: Interleave reasoning and actions for dynamic, tool-based tasks (NCP-AAI favorite)
  • Tree of Thoughts: Explore multiple reasoning paths for strategic, optimization problems
  • Advanced patterns: Hierarchical planning, MCTS for complex scenarios

Key Takeaway: Choose planning strategies based on task requirements—ReAct for most production agents, ToT for strategic planning, CoT for pure reasoning.

Ready to ace NCP-AAI planning questions? Practice with Preporato today! 🚀

Ready to Pass the NCP-AAI Exam?

Join thousands who passed with Preporato practice tests

Instant access · 30-day guarantee · Updated monthly