Planning -- the ability to break down complex goals into executable steps -- is what separates advanced agentic AI systems from simple chatbots. The NVIDIA NCP-AAI certification heavily emphasizes planning strategies, as they determine an agent's capability to solve multi-step problems, reason about consequences, and optimize action sequences. This comprehensive guide covers every planning paradigm tested on the NCP-AAI exam: the three foundational reasoning strategies (Chain-of-Thought, ReAct, and Tree of Thoughts), five classical planning approaches (forward, backward, HTN, partial-order, and continual), advanced algorithms (A* and MCTS), NVIDIA NeMo Agent Toolkit planning modules, multi-agent planning patterns, and common exam traps you need to avoid.
Without planning, agents exhibit three critical failure modes:
Myopic behavior: Short-sighted decisions without considering future consequences
Action thrashing: Inefficient trial-and-error without strategic thinking
Goal confusion: Losing track of the original objective in multi-step tasks
Example -- Flight Booking Without vs. With Planning:
Task: "Book a flight to Paris for next week"
Without Planning (Bad):
Agent: "What dates work for you?"
User: "Monday to Friday"
Agent: "Let me search flights..."
Agent: "Oh, I need your departure city. What city?"
User: "San Francisco"
Agent: "Searching... Oh, I need your budget. What's your budget?"
→ Inefficient, poor user experience (3 round trips)
With Planning (Good):
Agent: "To book your flight, I need:
1. Departure city
2. Travel dates
3. Budget range
4. Seating preference
Can you provide these details?"
→ Strategic, efficient information gathering (1 round trip)
Core Planning Capabilities Tested on NCP-AAI
The exam tests your understanding of five core planning capabilities:
Task decomposition: Breaking complex requests into ordered subtasks
Multi-step orchestration: Sequencing actions with correct dependencies
Conditional branching: Adapting plans based on runtime conditions
Error recovery: Replanning when tasks fail (fallback strategies)
Goal optimization: Finding efficient paths to objectives under constraints
Chain-of-Thought (CoT): Step-by-Step Reasoning
Definition: Chain-of-Thought prompting elicits step-by-step reasoning from LLMs by showing examples or instructing the model to "think through" problems. Introduced by Wei et al. (2022) in "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," CoT demonstrated that prompting a 540B-parameter model (PaLM) with just eight chain-of-thought exemplars achieved state-of-the-art accuracy on the GSM8K benchmark of math word problems.
How CoT Works
CoT decomposes complex reasoning into explicit intermediate steps, making the model's thought process transparent and verifiable. Rather than jumping from question to answer, the model generates a reasoning trace that walks through each logical step.
Basic CoT (Few-Shot)
Provide examples of step-by-step reasoning to teach the model the format:
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
few_shot_cot_template = """
Example 1:
Question: A bakery makes 48 cupcakes. If they pack 6 cupcakes per box, how many boxes do they need?
Reasoning:
- Total cupcakes: 48
- Cupcakes per box: 6
- Calculation: 48 / 6 = 8
Answer: 8 boxes
Example 2:
Question: If a car travels at 60 mph for 2.5 hours, how far does it travel?
Reasoning:
- Speed: 60 miles per hour
- Time: 2.5 hours
- Formula: Distance = Speed x Time
- Calculation: 60 x 2.5 = 150
Answer: 150 miles
Now solve this:
Question: {question}
Reasoning:
"""
llm = ChatOpenAI(model="gpt-4", temperature=0)
chain = PromptTemplate(input_variables=["question"], template=few_shot_cot_template) | llm
result = chain.invoke({
"question": "If a store sells 15 items per hour and is open 8 hours per day, how many items are sold in a week?"
})
# Output:
# - Items per hour: 15
# - Hours per day: 8
# - Items per day: 15 x 8 = 120
# - Days per week: 7
# - Items per week: 120 x 7 = 840
# Answer: 840 items per week
Zero-Shot CoT
No examples needed -- just append "Let's think step by step" to the prompt. Kojima et al. (2022) showed in "Large Language Models are Zero-Shot Reasoners" that this simple addition dramatically improves reasoning performance without any exemplars:
zero_shot_cot_prompt = """
Question: {question}
Let's think step by step:
"""# Results from Kojima et al. (2022) using text-davinci-002:# MultiArith: 17.7% → 78.7% accuracy# GSM8K: 10.4% → 40.7% accuracy# Similar improvements observed with PaLM 540B
The versatility of this single prompt across diverse reasoning tasks -- arithmetic, symbolic, commonsense, and logical -- hints at untapped zero-shot cognitive capabilities in large language models.
Self-Consistency CoT
Generate multiple CoT reasoning paths and select the answer that appears most frequently (majority vote). This reduces the impact of any single flawed reasoning chain:
from collections import Counter

class SelfConsistencyCoT:
    def __init__(self, llm, num_samples=5):
        self.llm = llm
        self.num_samples = num_samples

    def solve(self, question):
        answers = []
        for _ in range(self.num_samples):
            # Generate with temperature > 0 for diverse reasoning paths
            response = self.llm.predict(
                f"Question: {question}\nLet's think step by step:",
                temperature=0.7
            )
            answer = self.extract_answer(response)
            answers.append(answer)
        # Majority vote: the most frequent answer wins
        most_common = Counter(answers).most_common(1)[0][0]
        return most_common

# Typically improves accuracy by 5-15% over single-pass CoT
CoT Strengths and Limitations
| Aspect | Strengths | Limitations |
|---|---|---|
| Use Cases | Math, logic puzzles, multi-step reasoning | Real-world actions, tool use |
| Transparency | Shows full reasoning process | Reasoning can be confidently wrong |
| Latency | Single LLM call (lowest cost) | Longer output = more tokens |
| Reliability | Deterministic reasoning path | Prone to compounding errors |
| Grounding | Internal knowledge only | Cannot access external information |
Key Concept
CoT is best for reasoning-heavy, action-light tasks (analysis, planning, explanation) but insufficient for action-heavy tasks (API calls, tool use, multi-step execution). On the NCP-AAI exam, if a scenario involves tool calls or external interactions, CoT alone is the wrong answer -- look for ReAct instead.
CoT Variants Summary
The NCP-AAI exam may present scenarios where you need to choose between CoT variants. Here is a quick reference:
| Variant | Description | When to Use | Cost |
|---|---|---|---|
| Few-Shot CoT | Provide 2-5 worked examples before the question | When you have good exemplars and need reliable formatting | 1 call (longer prompt) |
| Zero-Shot CoT | Append "Let's think step by step" with no examples | Quick reasoning boost without crafting examples | 1 call (short prompt) |
| Self-Consistency CoT | Generate k diverse reasoning paths, majority vote | When reliability matters more than cost | k calls (k=5-10 typical) |
| Manual CoT | Hand-craft optimal reasoning chains for specific domains | Domain-specific applications with known optimal reasoning | 1 call (curated prompt) |
Key Research Results to Remember:
Wei et al. (2022): Few-shot CoT with PaLM 540B achieved state-of-the-art on GSM8K math benchmarks, demonstrating that reasoning emerges in sufficiently large models with appropriate prompting.
Kojima et al. (2022): Zero-shot CoT ("Let's think step by step") improved MultiArith accuracy from 17.7% to 78.7% with InstructGPT, showing that no exemplars are needed to unlock reasoning capabilities.
Wang et al. (2022): Self-Consistency improved CoT accuracy by 5-15% across benchmarks by sampling multiple reasoning paths and taking the majority vote, addressing the fragility of single-chain reasoning.
When CoT Fails: Understanding Limitations
CoT has well-documented failure modes that the NCP-AAI exam may test:
Faithful but wrong reasoning: The model can generate a logical-looking chain of reasoning that arrives at the wrong answer. The steps appear sound, but a subtle error early in the chain propagates forward.
Overthinking simple problems: For straightforward factual lookups or pattern matching, CoT adds unnecessary tokens and latency without improving accuracy. Use DIRECT mode for these.
No external grounding: CoT operates entirely on the model's internal knowledge. If the information is outdated, incomplete, or hallucinated, the entire reasoning chain is built on a faulty foundation. This is precisely why ReAct was developed.
Length sensitivity: Very long reasoning chains (10+ steps) can lose coherence, with the model forgetting earlier constraints or introducing contradictions. For truly complex problems, hierarchical decomposition (HTN) may be more appropriate.
ReAct: Reasoning + Acting
Definition: ReAct (Reasoning and Acting) interleaves reasoning traces with action execution, allowing agents to dynamically adjust plans based on environment feedback. Introduced by Yao et al. (2022) in "ReAct: Synergizing Reasoning and Acting in Language Models," the framework addresses a key limitation of CoT: while chain-of-thought uses only internal representations, ReAct grounds reasoning in real-world observations by interleaving thought with action.
Why ReAct Matters
The core insight of ReAct is that reasoning and acting are complementary:
Reasoning traces help the model induce, track, and update action plans, as well as handle exceptions
Actions allow the model to interface with external sources (knowledge bases, APIs, environments) to gather additional information
Yao et al. evaluated ReAct on four benchmarks -- HotPotQA, Fever, ALFWorld, and WebShop -- and found that ReAct outperforms vanilla action generation while being competitive with CoT. The best results came from combining ReAct with CoT, using both internal knowledge and externally obtained information.
The ReAct Loop
Thought: [Reasoning about what to do next]
Action: [Tool/function to execute]
Action Input: [Arguments for the tool]
Observation: [Result from executing the action]
... (repeat Thought/Action/Observation)
Thought: I now know the final answer
Final Answer: [Response to user]
ReAct Implementation
from langchain.agents import create_react_agent, AgentExecutor
from langchain.tools import Tool
from langchain_openai import ChatOpenAI
# Define tools
def search_web(query: str) -> str:
    """Search the web for information"""
    return f"Search results for '{query}': ..."

def calculator(expression: str) -> str:
    """Evaluate mathematical expressions"""
    try:
        # NOTE: eval is unsafe on untrusted input; use a math parser in production
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"
tools = [
Tool(name="Search", func=search_web, description="Search the web for current information"),
Tool(name="Calculator", func=calculator, description="Perform mathematical calculations")
]
# ReAct prompt template
react_prompt = """
Answer the following question as best you can. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
Question: {input}
Thought: {agent_scratchpad}
"""# Create agent
llm = ChatOpenAI(model="gpt-4", temperature=0)
agent = create_react_agent(llm, tools, react_prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True, max_iterations=10)
# Execute
result = executor.invoke({"input": "What is the population of Tokyo multiplied by 2?"})
Execution Trace:
Thought: I need to find Tokyo's population, then multiply by 2
Action: Search
Action Input: "Tokyo population 2024"
Observation: Tokyo's population is approximately 14 million
Thought: Now I need to multiply 14 million by 2
Action: Calculator
Action Input: "14000000 * 2"
Observation: 28000000
Thought: I now know the final answer
Final Answer: The population of Tokyo (14 million) multiplied by 2 is 28 million.
ReAct Variants
1. ReAct with Self-Correction
The agent detects failures and adjusts its approach:
react_selfcorrect_prompt = """
... (standard ReAct format)
If an action fails or returns unexpected results, reconsider your approach:
Thought: That didn't work as expected. Let me try a different approach.
Action: [Alternative action]
...
"""# Example execution trace:# Thought: I'll search for "Tokyo population"# Action: Search# Action Input: "Tokyo population"# Observation: Error: Too many results, be more specific## Thought: That didn't work. Let me be more specific with the year.# Action: Search# Action Input: "Tokyo population 2024 census"# Observation: Tokyo's population is approximately 14 million# → Self-correction leads to success
2. ReAct with Reflection
The agent evaluates its own reasoning quality after task completion:
react_reflection_prompt = """
... (standard ReAct)
After completing the task, reflect:
Reflection: [Evaluate the quality of your reasoning and actions]
Improvements: [What could be done better next time]
"""# Example:# Final Answer: 28 million## Reflection: My approach was effective -- I systematically gathered# information and performed calculations. The search query could# have been more precise initially.## Improvements: Include the year in initial search queries to# avoid ambiguity. Consider verifying with a second source.
3. ReAct + CoT Hybrid
Combine internal CoT reasoning with external ReAct actions for best results. In the original paper, Yao et al. found that combining ReAct with CoT outperformed either approach alone on HotPotQA and Fever benchmarks. The hybrid approach works by first attempting CoT internal reasoning, then switching to ReAct when the model detects that its internal knowledge is insufficient or outdated:
class ReActCoTHybrid:
    def __init__(self, llm, tools):
        self.llm = llm
        self.tools = tools  # used to construct self.react_agent (setup omitted)

    def solve(self, question):
        # Step 1: Attempt CoT internal reasoning
        cot_response = self.llm.predict(f"""
        Question: {question}
        First, reason about what you already know (internal knowledge).
        Rate your confidence in your knowledge on a scale of 1-10.
        Internal reasoning:
        """)
        confidence = self.extract_confidence(cot_response)
        if confidence >= 8:
            # High confidence: use CoT answer directly (saves tool calls)
            return self.llm.predict(f"""
            Based on this reasoning: {cot_response}
            Provide the final answer:
            """)
        else:
            # Low confidence: switch to ReAct for external verification
            return self.react_agent.run(f"""
            Question: {question}
            Initial reasoning (may be incomplete): {cot_response}
            Verify and complete this answer using available tools.
            """)

# This pattern:
# - Saves tool calls when internal knowledge is sufficient
# - Falls back to grounded actions when knowledge gaps exist
# - Combines the speed of CoT with the accuracy of ReAct
This hybrid pattern is particularly valuable in production because it reduces API costs. Many agent queries can be answered with internal knowledge alone (saving tool call latency and cost), while complex or factual queries automatically escalate to tool-augmented reasoning. On the NCP-AAI exam, questions about optimizing agent costs while maintaining accuracy often point to this hybrid approach.
ReAct Advantages for NCP-AAI
Grounded in reality: Actions provide real feedback, preventing hallucinations
Transparent reasoning: Thought traces are interpretable and debuggable
Dynamic adaptation: Can adjust strategy based on observations
Tool integration: Natural fit for function calling and API interactions
Explicit and traceable: Unlike black-box planning, every decision is logged
NCP-AAI Exam Focus: ReAct is the default planning strategy for production agents due to its balance of reasoning and action. It is the most heavily tested planning framework on the exam.
Exam Trap
A common NCP-AAI mistake is assuming ReAct is always the best choice. While ReAct is the default for production agents, it has key limitations: linear planning (no alternative exploration), error accumulation (early mistakes compound), and high token costs from verbose output. Know when to combine ReAct with other strategies or when ToT or HTN is the better fit.
ReAct Limitations
| Challenge | Impact | Mitigation |
|---|---|---|
| Verbose output | High token costs | Use cheaper models for reasoning steps |
| Linear planning | Does not explore alternatives | Combine with ToT for branching |
| Error accumulation | Early mistakes compound | Add reflection/self-correction |
| Max iterations | Can timeout on complex tasks | Set appropriate limits, add fallbacks |
| Single path | Commits to first viable approach | Use ToT when comparison is needed |
Tree of Thoughts (ToT): Exploring Multiple Reasoning Paths
Definition: Tree of Thoughts generates multiple reasoning paths (branches), evaluates them, and selects the most promising direction -- enabling search-based planning. Introduced by Yao et al. (2023) in "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (NeurIPS 2023), ToT generalizes CoT by allowing LMs to explore coherent units of text ("thoughts") as intermediate problem-solving steps, with the ability to look ahead, evaluate, and backtrack.
Key Result
On the Game of 24 benchmark, GPT-4 with standard CoT prompting solved only 4% of tasks, while ToT achieved a 74% success rate -- a dramatic improvement demonstrating the power of deliberate exploration over linear reasoning.
ToT Concepts
1. Thought Decomposition -- Break the problem into intermediate steps (thoughts), where each thought is a coherent unit of reasoning.
2. Thought Generation -- Generate multiple candidate thoughts at each step using the LLM.
3. State Evaluation -- Evaluate how promising each thought path is (how likely it leads to a correct solution). The LLM itself can serve as the evaluator, scoring each path on a numeric scale.
4. Search Algorithm -- Navigate the tree of possibilities:
Breadth-First Search (BFS): Explore all options at each level before going deeper
Depth-First Search (DFS): Explore one path deeply before backtracking
Best-First Search: Prioritize the most promising paths using evaluation scores
ToT Implementation
from langchain_openai import ChatOpenAI
class TreeOfThoughts:
    def __init__(self, llm, max_depth=3, branching_factor=3):
        self.llm = llm
        self.max_depth = max_depth
        self.branching_factor = branching_factor

    def generate_thoughts(self, problem, current_state, depth):
        """Generate multiple candidate next thoughts"""
        prompt = f"""
        Problem: {problem}
        Current reasoning: {current_state}
        Generate {self.branching_factor} different possible next steps
        in solving this problem. Format each as a numbered option:
        """
        response = self.llm.predict(prompt)
        thoughts = self._parse_thoughts(response)
        return thoughts

    def evaluate_thought(self, problem, thought_sequence):
        """Evaluate how promising a thought sequence is (0-10)"""
        prompt = f"""
        Problem: {problem}
        Reasoning so far: {thought_sequence}
        On a scale of 0-10, how likely is this reasoning path
        to lead to a correct solution?
        Consider:
        - Logical coherence
        - Progress toward the goal
        - Avoiding dead ends
        Score (0-10):
        """
        response = self.llm.predict(prompt)
        score = float(response.strip())
        return score

    def _breadth_first_search(self, problem):
        """Explore all paths level by level"""
        queue = [("", 0)]  # (thought_sequence, depth)
        best_path = None
        best_score = -1
        while queue:
            current_state, depth = queue.pop(0)
            if depth >= self.max_depth:
                score = self.evaluate_thought(problem, current_state)
                if score > best_score:
                    best_score = score
                    best_path = current_state
                continue
            thoughts = self.generate_thoughts(problem, current_state, depth)
            for thought in thoughts:
                new_state = current_state + "\n" + thought
                queue.append((new_state, depth + 1))
        return best_path, best_score

    def _depth_first_search(self, problem, current_state="", depth=0, threshold=3):
        """Explore one path deeply, backtrack if score is too low"""
        if depth >= self.max_depth:
            score = self.evaluate_thought(problem, current_state)
            return current_state, score
        thoughts = self.generate_thoughts(problem, current_state, depth)
        best_path = None
        best_score = -1
        for thought in thoughts:
            new_state = current_state + "\n" + thought
            # Prune: skip paths that score below threshold
            mid_score = self.evaluate_thought(problem, new_state)
            if mid_score < threshold:
                continue  # Backtrack
            path, score = self._depth_first_search(
                problem, new_state, depth + 1, threshold
            )
            if score > best_score:
                best_score = score
                best_path = path
        return best_path, best_score

    def solve(self, problem, algorithm="bfs"):
        """Solve problem using Tree of Thoughts"""
        if algorithm == "bfs":
            best_path, score = self._breadth_first_search(problem)
        else:
            best_path, score = self._depth_first_search(problem)
        # Generate final answer from best path
        final_prompt = f"""
        Problem: {problem}
        Best reasoning path found:
        {best_path}
        Based on this reasoning, provide the final answer:
        """
        answer = self.llm.predict(final_prompt)
        return answer, best_path, score
# Usage
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
tot = TreeOfThoughts(llm, max_depth=3, branching_factor=3)
problem = "Design a microservices architecture for an e-commerce platform."
answer, reasoning, score = tot.solve(problem)
ToT Execution Example
Problem: "Plan a 3-day trip to New York City on a $1000 budget"
Root: "Plan NYC trip, 3 days, $1000"
│
├── Branch 1: "Focus on free attractions (museums, parks, walking tours)"
│ ├── Branch 1.1: "Budget hotel ($100/night), subway pass"
│ │ └── Branch 1.1.1: "Day 1: Central Park + Met, Day 2: Brooklyn
│ │ Bridge + 9/11 Memorial, Day 3: High Line + Chelsea Market"
│ │ [Score: 8/10]
│ ├── Branch 1.2: "Hostel ($50/night), more budget for activities"
│ └── Branch 1.3: "Airbnb in Queens ($80/night), authentic experience"
│
├── Branch 2: "Prioritize iconic paid attractions"
│ ├── Branch 2.1: "Buy CityPass ($140), budget accommodation"
│ │ └── Branch 2.1.1: "Day 1: ESB + Top of Rock, Day 2:
│ │ Statue of Liberty..." [Score: 7/10]
│ └── Branch 2.2: "Focus on 2-3 major attractions, skip tourist traps"
│
└── Branch 3: "Cultural experience (Broadway, dining, neighborhoods)"
├── Branch 3.1: "One Broadway show ($150), street food"
│ └── Branch 3.1.1: "Day 1: Broadway + Times Square, Day 2:
│ Chinatown + Little Italy..." [Score: 6/10]
└── Branch 3.2: "Skip Broadway, focus on food tours"
Best Path Selected: Branch 1.1.1 (Score: 8/10)
Reasoning: Maximizes experiences within budget by focusing on free
attractions while maintaining comfort with a budget hotel.
ToT Cost Analysis
The computational cost of ToT grows exponentially with the tree depth and branching factor.
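To make that growth concrete, here is a rough back-of-envelope count (an illustrative sketch, not a figure from the ToT paper; it assumes one generation call and one evaluation call per tree node with no pruning):

```python
def tot_llm_calls(branching_factor: int, max_depth: int) -> int:
    """Upper bound on LLM calls for exhaustive BFS over a thought tree.

    Assumes one generation call and one evaluation call per node,
    so the total is roughly 2 * (b + b^2 + ... + b^d).
    """
    nodes = sum(branching_factor ** d for d in range(1, max_depth + 1))
    return 2 * nodes

for b, d in [(2, 2), (3, 3), (3, 4), (5, 3)]:
    print(f"b={b}, d={d}: ~{tot_llm_calls(b, d)} LLM calls")
# b=3, d=3 already costs ~78 calls; b=3, d=4 jumps to ~240
```

This is why aggressive pruning (as in the DFS variant) or shallow trees are essential in practice.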
ToT Advantages and Limitations
Advantages:
Explores alternatives: Considers multiple strategies before committing
Handles complexity: Effective for problems with many valid approaches
Avoids local optima: Can backtrack from dead ends
Self-evaluation: Explicitly assesses reasoning quality at each step
Limitations:
| Challenge | Impact | Solution |
|---|---|---|
| High cost | Many LLM calls (exponential) | Prune low-scoring branches early |
| Latency | Slow for real-time applications | Use for planning phase only |
| Evaluation difficulty | Hard to score thought quality | Train value model or use heuristics |
| Overkill for simple tasks | Unnecessary complexity | Use CoT or ReAct when one path suffices |
When to Use ReAct vs. CoT vs. ToT: Decision Framework
Choosing the right planning strategy is one of the most frequently tested skills on the NCP-AAI exam. Use this decision framework:
Strategy Selection Decision Tree
Does the task require external tool calls or API interactions?
├── YES → Does it require comparing multiple approaches?
│ ├── YES → ReAct + ToT hybrid
│ └── NO → ReAct (default for production agents)
└── NO → Is there a single correct answer (math, logic)?
├── YES → Chain-of-Thought
└── NO → Are there multiple valid approaches to evaluate?
├── YES → Tree of Thoughts
└── NO → Chain-of-Thought
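The decision tree above can be encoded as a small self-test helper (an illustrative sketch; the boolean parameters mirror the branches of the tree and the function name is hypothetical, not part of any official API):

```python
def select_planning_strategy(needs_tools: bool,
                             compares_alternatives: bool,
                             single_correct_answer: bool) -> str:
    """Map the decision-tree branches to a planning strategy."""
    if needs_tools:
        # Tool/API interaction rules out CoT alone
        return "ReAct + ToT hybrid" if compares_alternatives else "ReAct"
    if single_correct_answer:
        return "Chain-of-Thought"
    return "Tree of Thoughts" if compares_alternatives else "Chain-of-Thought"

# Customer-service agent that calls ticketing APIs:
print(select_planning_strategy(True, False, False))   # ReAct
# Math word problem:
print(select_planning_strategy(False, False, True))   # Chain-of-Thought
# Architecture design with trade-offs:
print(select_planning_strategy(False, True, False))   # Tree of Thoughts
```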
Quick Reference Matrix
Planning Strategy Selection Guide
| Scenario | Best Strategy | Why | Exam Frequency |
|---|---|---|---|
| Customer service resolving tickets with API calls | ReAct | Requires tool use + adaptive reasoning | Very High |
| Solving a math word problem | Chain-of-Thought | Pure reasoning, no actions needed | High |
| Designing system architecture (multiple valid approaches) | Tree of Thoughts | Must explore and compare alternatives | Medium |
| Flight booking with search + comparison + booking | ReAct | Sequential tool calls with reasoning | Very High |
| Sudoku or constraint-satisfaction puzzle | Tree of Thoughts | Requires branching and backtracking | Medium |
| Code review and explanation | Chain-of-Thought | Reasoning-only, no external actions | Medium |
| Multi-hop question answering with knowledge retrieval | ReAct | Needs external knowledge + reasoning | High |
| Strategic business planning with trade-offs | Tree of Thoughts | Multiple valid strategies to evaluate | Low |
Cost and Latency Comparison
Planning Strategies: Performance Comparison
| Strategy | LLM Calls | Token Cost | Latency | Transparency | NCP-AAI Weight |
|---|---|---|---|---|---|
| Chain-of-Thought | 1 call | Low | Low (seconds) | High (full trace) | Medium |
| Self-Consistency CoT | k calls (k=5-10) | Medium | Medium | High | Low |
| ReAct | N calls (1 per step) | Medium | Medium (seconds-minutes) | High (thought + action) | Very High |
| Tree of Thoughts (BFS) | b^d calls (exponential) | High | High (minutes) | Medium (best path shown) | Medium |
| Tree of Thoughts (DFS) | b*d calls (with pruning) | Medium-High | Medium-High | Medium | Medium |
Exam Trap
When the exam presents a scenario involving tool calls or external API interactions, Chain-of-Thought alone is always the wrong answer. CoT only performs reasoning without actions. If the scenario requires executing searches, database queries, or API calls, the correct answer involves ReAct or a planning framework that supports action execution. This is one of the most common traps on the NCP-AAI.
Five Classical Planning Approaches
Beyond the three LLM-native strategies above, the NCP-AAI exam tests your knowledge of classical AI planning approaches that provide the theoretical foundation for agent planning systems.
1. Forward Planning (Progressive Search)
Definition: Start from the current state and explore actions forward until the goal state is reached.
Current State → Action 1 → State 2 → Action 2 → ... → Goal State
Implementation:
class ForwardPlanner:
    def plan(self, start, goal):
        state = start
        plan = []
        while state != goal:
            # Find action that moves toward goal
            action = self.select_best_action(state, goal)
            plan.append(action)
            state = self.apply_action(state, action)
        return plan

# Example
planner = ForwardPlanner()
plan = planner.plan(start="home", goal="office")
# Result: ["walk_to_car", "drive_to_office", "park_car", "enter_building"]
Advantages: Intuitive, easy to implement, works well when the action space is small.
Disadvantages: Can be inefficient for large search spaces; explores many irrelevant actions.
2. Backward Planning (Regression)
Definition: Start from the goal state and work backward to determine the required preconditions at each step.
Goal State ← Action N ← State N-1 ← ... ← Current State
When to Use: When the goal has fewer achievable states than the start state, or when preconditions are well-defined. Also effective for multi-hop question answering where you decompose from the final question backward to sub-questions.
Exam Application -- Multi-Hop QA Example:
Question: "What is the capital of the country where the 2024 Olympics were held?"
Backward decomposition:
Goal: Capital of country X
← Requires: Country X where 2024 Olympics were held
← Requires: Location of 2024 Olympics → Paris
← Derives: Country → France
← Derives: Capital of France → Paris
Forward execution of decomposed plan:
Step 1: Search "2024 Olympics location" → Paris
Step 2: Search "What country is Paris in?" → France
Step 3: Search "Capital of France" → Paris
Answer: Paris
This backward-then-forward pattern -- decompose the problem backward from the goal, then execute forward -- is a common NCP-AAI exam pattern. The question often asks which reasoning technique is demonstrated (backward chaining) or which execution framework supports it (ReAct for the forward execution with tool calls).
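The backward-then-forward pattern can be sketched in a few lines. This is an illustrative toy (it assumes a single known precondition per goal, and the goal strings are hypothetical), not a production planner:

```python
def backward_chain(goal: str, preconditions: dict) -> list:
    """Work backward from the goal, collecting required sub-goals,
    then reverse the chain to obtain a forward execution order."""
    chain = [goal]
    while chain[-1] in preconditions:
        chain.append(preconditions[chain[-1]])
    return list(reversed(chain))  # forward execution order

# Hypothetical decomposition for the Olympics question above
preconditions = {
    "capital of country X": "country X that hosted the 2024 Olympics",
    "country X that hosted the 2024 Olympics": "location of the 2024 Olympics",
}
plan = backward_chain("capital of country X", preconditions)
# Executes forward: find the location, derive the country, look up the capital
```

Each step in the resulting forward plan would then be executed as a tool call (e.g., by a ReAct agent).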
3. Hierarchical Task Network (HTN) Planning
Definition: Decompose high-level abstract tasks into primitive actions using predefined decomposition methods. HTN is the most common planning approach in production agentic systems and one of the most frequently tested topics on the NCP-AAI exam.
The SHOP2 algorithm (Nau et al., 2003) is the canonical HTN planner, notable for supporting partially ordered subtask decomposition -- meaning subtasks within a method do not all need a fixed execution order.
Structure:
┌─────────────────────────────────────┐
│ High-Level Goal (Abstract Task) │
└──────────────┬──────────────────────┘
│
┌──────┴──────┐
v v
Subtask 1 Subtask 2
│ │
┌──┴──┐ ┌──┴──┐
v v v v
Action Action Action Action
(Primitive -- directly executable)
HTN decomposes abstract goals into concrete primitive actions using predefined methods, making it ideal for well-structured domains like trip planning, software deployment, business workflows, and customer service escalation. On the NCP-AAI exam, if a question describes a complex workflow with clear hierarchical structure, HTN is likely the correct answer.
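A minimal HTN decomposer can be sketched in a few lines (an illustrative toy with hypothetical task and method names; production HTN planners such as SHOP2 also track world state, preconditions, and method applicability):

```python
# Methods map an abstract task to an ordered list of subtasks;
# any task without a method is treated as a primitive action.
methods = {
    "book_trip": ["book_flight", "book_hotel", "plan_itinerary"],
    "book_flight": ["search_flights", "select_flight", "pay_flight"],
    "book_hotel": ["search_hotels", "select_hotel", "pay_hotel"],
}

def htn_decompose(task: str) -> list:
    """Recursively expand abstract tasks into primitive actions."""
    if task not in methods:
        return [task]  # primitive -- directly executable
    plan = []
    for subtask in methods[task]:
        plan.extend(htn_decompose(subtask))
    return plan

print(htn_decompose("book_trip"))
# ['search_flights', 'select_flight', 'pay_flight',
#  'search_hotels', 'select_hotel', 'pay_hotel', 'plan_itinerary']
```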
Task Decomposition Matrix
Use this matrix to determine the right decomposition strategy:
| Task Characteristic | Decomposition Approach | Example |
|---|---|---|
| Fixed sequence of steps | Sequential HTN | Software deployment pipeline |
| Steps can run in parallel | Partial-order HTN (SHOP2) | Cooking: boil water + chop vegetables simultaneously |
| Multiple valid methods | HTN with method selection | Travel: fly vs. train vs. drive |
| Dynamic environment | HTN + continual replanning | Warehouse robot navigation |
| Unknown structure | LLM-based decomposition | Novel user requests |
4. Partial-Order Planning (POP)
Definition: Plan actions without committing to a specific execution order until necessary. Actions are only ordered when one depends on the output of another, enabling parallelism.
class PartialOrderPlanner:
    def __init__(self):
        self.actions = []
        self.orderings = []     # List of (before, after) constraints
        self.causal_links = []  # (producer, condition, consumer)

    def add_action(self, action):
        self.actions.append(action)

    def add_ordering(self, before, after):
        """Enforce: 'before' must execute before 'after'"""
        self.orderings.append((before, after))

    def get_parallel_groups(self):
        """Group actions into levels; actions within a level can run in parallel"""
        remaining = set(self.actions)
        groups = []
        while remaining:
            # Ready = actions whose predecessors have all been scheduled
            ready = [a for a in self.actions if a in remaining and
                     all(b not in remaining for b, x in self.orderings if x == a)]
            groups.append(ready)
            remaining -= set(ready)
        return groups

# Example: Dinner preparation
planner = PartialOrderPlanner()
planner.add_action("chop_vegetables")
planner.add_action("boil_water")
planner.add_action("cook_pasta")
planner.add_action("make_sauce")
planner.add_action("serve")

# Only add necessary ordering constraints
planner.add_ordering("boil_water", "cook_pasta")
planner.add_ordering("chop_vegetables", "make_sauce")
planner.add_ordering("cook_pasta", "serve")
planner.add_ordering("make_sauce", "serve")

# Parallel groups:
# Group 1 (parallel): ["chop_vegetables", "boil_water"]
# Group 2 (parallel): ["cook_pasta", "make_sauce"]
# Group 3: ["serve"]
# Total time: 3 sequential groups instead of 5 sequential actions
Advantages: Enables parallel execution, more flexible scheduling, reduces total execution time.
Disadvantages: Complex constraint management, harder to debug than sequential plans.
Why POP Matters for Agentic AI: In multi-agent systems, partial-order planning directly maps to task parallelism. If two subtasks have no ordering constraint between them, they can be assigned to different agents and executed simultaneously. This is why POP is foundational to scalable agent architectures. On the NCP-AAI exam, any question about maximizing throughput or minimizing wall-clock execution time for independent subtasks points to partial-order planning.
Causal Links and Threat Resolution: In formal POP, a causal link (A --[p]--> B) means action A produces condition p that action B requires. A "threat" occurs when a third action C could undo condition p between A and B. The planner resolves threats by either promoting C before A or demoting C after B. While the NCP-AAI does not require formal proofs, understanding that POP must protect causal dependencies from interference is important for architecture questions about concurrent agent actions.
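Threat detection itself is mechanical and can be sketched directly (an illustrative toy with hypothetical action names; a real POP planner would then resolve each detected threat by promotion or demotion):

```python
def find_threats(causal_links, deletes):
    """Return (threat_action, link) pairs where an action could undo the
    condition a causal link protects.

    causal_links: list of (producer, condition, consumer) tuples
    deletes: dict mapping each action to the set of conditions it negates
    """
    threats = []
    for action, negated in deletes.items():
        for producer, condition, consumer in causal_links:
            # A third action that negates the protected condition is a threat
            if condition in negated and action not in (producer, consumer):
                threats.append((action, (producer, condition, consumer)))
    return threats

# Hypothetical example: "unplug_kettle" negates "water_hot",
# which "boil_water" produces for "brew_tea"
links = [("boil_water", "water_hot", "brew_tea")]
deletes = {"unplug_kettle": {"water_hot"}}
print(find_threats(links, deletes))
# [('unplug_kettle', ('boil_water', 'water_hot', 'brew_tea'))]
```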
5. Continual Planning (Interleaved Planning and Execution)
Definition: Plan, execute, observe, replan -- a continuous cycle where planning happens during execution and adapts to real-world feedback. This is the closest classical analogue to the ReAct pattern.
class ContinualPlanner:
    def execute_with_replanning(self, goal):
        plan = self.create_initial_plan(goal)
        execution_log = []
        while not self.goal_achieved(goal):
            if not plan:
                return "Failed: no valid plan found"
            # Execute next action
            action = plan.pop(0)
            result = self.execute_action(action)
            execution_log.append((action, result))
            # Check for unexpected outcomes
            if result.unexpected or result.failed:
                # Replan from current state with updated beliefs
                plan = self.replan(
                    current_state=self.get_current_state(),
                    goal=goal,
                    failed_action=action,
                    history=execution_log
                )
            # Update world model with new observations
            self.update_beliefs(result)
        return "Goal achieved"
When to Use: Dynamic environments where conditions change frequently (robotics, real-time systems, live data processing). On the NCP-AAI exam, if the scenario describes a changing environment, the answer is almost never a fully upfront planning approach.
Classical Approaches Comparison
Five Classical Planning Approaches
| Approach | Best For | Key Advantage | Key Limitation |
|---|---|---|---|
| Forward Planning | Simple, small search spaces | Intuitive implementation | Inefficient for large state spaces |
| Backward Planning | Goal-directed, well-defined preconditions | Efficient precondition analysis | Requires well-defined goal states |
| HTN Planning | Complex hierarchical workflows | Most common in production agentic AI | Requires predefined decomposition methods |
| Partial-Order (POP) | Parallelizable, independent subtasks | Enables parallel execution | Complex constraint management |
| Continual Planning | Dynamic, changing environments | Adapts to real-time conditions | Higher computational overhead |
Exam Trap
On the NCP-AAI exam, watch out for scenarios where over-planning is the trap answer. If the question describes a dynamic environment with frequent changes, the answer is almost never a fully upfront planning approach (like pure forward or backward planning). Look for continual planning or ReAct-based replanning strategies instead. Conversely, if the domain is well-structured with known decomposition rules, HTN is preferred over LLM-based planning.
Advanced Planning Algorithms
A* Planning (Optimal Pathfinding)
A* finds the lowest-cost path from a start state to a goal state using a heuristic function to guide the search. It guarantees an optimal solution when the heuristic is admissible (never overestimates the true cost).
A* Evaluation Function
f(n) = g(n) + h(n)
where g(n) is the actual cost from the start to node n, and h(n) is the heuristic estimate of the remaining cost from n to the goal.
Implementation:
import heapq

class AStarPlanner:
    def plan(self, start, goal):
        frontier = [(0, start)]   # Priority queue: (f_score, state)
        came_from = {start: None}
        g_score = {start: 0}      # Actual cost from start
        while frontier:
            current_f, current = heapq.heappop(frontier)
            if current == goal:
                return self.reconstruct_path(came_from, start, goal)
            for action, next_state in self.get_successors(current):
                new_g = g_score[current] + self.action_cost(action)
                if next_state not in g_score or new_g < g_score[next_state]:
                    g_score[next_state] = new_g
                    f_score = new_g + self.heuristic(next_state, goal)
                    heapq.heappush(frontier, (f_score, next_state))
                    came_from[next_state] = (current, action)
        return None  # No path found

    def heuristic(self, state, goal):
        """Must be admissible: never overestimates true cost"""
        # Example: Manhattan distance for grid-based planning
        return abs(state.x - goal.x) + abs(state.y - goal.y)
A* Complexity:
Time: O(b^d) worst case, but typically much better with a good heuristic
Space: O(b^d) -- stores all expanded nodes
Optimality: Guaranteed when h(n) is admissible
Key Properties of A* for the NCP-AAI Exam:
| Property | Description | Exam Relevance |
|---|---|---|
| Completeness | A* will always find a solution if one exists (given finite branching) | Know that A* never gives up prematurely |
| Optimality | Guaranteed optimal when h(n) is admissible | Understand admissibility = never overestimates |
| Consistency | If h(n) <= cost(n, n') + h(n'), A* is optimally efficient | Stronger than admissibility; means no node is re-expanded |
| Space complexity | O(b^d) -- stores all expanded nodes in memory | This is A*'s primary limitation for large state spaces |
When A* is the Wrong Choice: A* requires an explicit state space with well-defined transitions. For open-ended planning problems where the state space is not enumerable (e.g., creative writing, open-ended research), LLM-based planning or ToT is more appropriate. On the NCP-AAI exam, if the scenario lacks a clear state-transition model, A* is likely a distractor answer.
Use Cases for NCP-AAI: Navigation, resource allocation with cost optimization, finding optimal action sequences in well-defined state spaces, robotic motion planning.
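To see the algorithm behave end to end, here is a self-contained grid variant of the planner above (a hypothetical 3x3 example, not exam material), using the same Manhattan-distance heuristic:

```python
import heapq

def astar_grid(start, goal, walls, size):
    """A* on a 4-connected grid; each move costs 1."""
    def h(p):  # Manhattan distance: admissible on a 4-connected grid
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), start)]
    g_score = {start: 0}
    came_from = {start: None}
    while frontier:
        _, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:   # walk back to start
                path.append(current)
                current = came_from[current]
            return path[::-1]
        x, y = current
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size and nxt not in walls:
                new_g = g_score[current] + 1
                if nxt not in g_score or new_g < g_score[nxt]:
                    g_score[nxt] = new_g
                    came_from[nxt] = current
                    heapq.heappush(frontier, (new_g + h(nxt), nxt))
    return None  # No path found

path = astar_grid((0, 0), (2, 2), walls={(1, 0), (1, 1)}, size=3)
# Optimal path has 5 nodes (4 moves): around the wall via (1, 2)
```

Because the heuristic is admissible, the returned path is guaranteed to be cost-optimal.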
Monte Carlo Tree Search (MCTS)
MCTS builds a search tree incrementally through random simulations, balancing exploration of new paths against exploitation of known good paths. It is the algorithm behind AlphaGo and many game-playing agents.
Use Cases: Game playing, exploration tasks with uncertain outcomes, high-stakes decisions where simulation is possible, robotic planning.
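The selection step of MCTS typically uses the UCB1 formula the exam references: UCB1(i) = mean_i + c * sqrt(ln(N) / n_i), where N is the parent's total visit count and n_i the child's. A minimal sketch of child selection (the dict-based node representation is an assumption for illustration, not a specific library's API):

```python
import math

def ucb1_select(children, c=1.41):
    """Pick the child maximizing UCB1 = mean_value + c * sqrt(ln(N) / n_i).
    children: list of dicts with 'value' (total reward) and 'visits'."""
    total_visits = sum(ch["visits"] for ch in children)

    def ucb1(ch):
        if ch["visits"] == 0:
            return float("inf")  # unvisited children are explored first
        mean = ch["value"] / ch["visits"]
        return mean + c * math.sqrt(math.log(total_visits) / ch["visits"])

    return max(children, key=ucb1)

children = [
    {"name": "a", "value": 9.0, "visits": 10},  # high mean, well explored
    {"name": "b", "value": 1.0, "visits": 2},   # low mean, lightly explored
    {"name": "c", "value": 0.0, "visits": 0},   # never simulated
]
best = ucb1_select(children)
# → "c": unvisited children get an infinite exploration bonus
```

The constant c tunes the exploration-exploitation balance: larger values favor lightly visited children, smaller values favor children with high average reward.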
LLM-Based Planning
Use LLMs directly to generate structured plans from natural language goals:
import json
def llm_plan(goal, context, llm):
prompt = f"""
You are a task planning AI. Break down the following goal
into a step-by-step plan.
Goal: {goal}
Current context: {context}
Provide a detailed plan in JSON format:
{{
"steps": [
{{"id": 1, "action": "...", "reasoning": "...", "dependencies": []}},
{{"id": 2, "action": "...", "reasoning": "...", "dependencies": [1]}}
]
}}
"""
response = llm.predict(prompt)
plan = json.loads(response)
return plan["steps"]
# Example
plan = llm_plan(
goal="Deploy a new microservice to production",
context="Current env: staging, tests passing, Docker image built",
llm=llm
)
Advantages: Natural language I/O, handles novel situations, minimal domain engineering.
Disadvantages: Non-deterministic, can hallucinate invalid plans, expensive per call.
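Because the output is non-deterministic and may hallucinate, production systems validate the plan before executing it. A minimal sketch against the JSON schema used in the prompt above (the specific checks are illustrative, not exhaustive):

```python
import json

def parse_and_validate_plan(response_text):
    """Parse LLM output and check structural validity of the plan.
    Returns the steps list, or raises ValueError to trigger a retry/replan."""
    try:
        plan = json.loads(response_text)
    except json.JSONDecodeError as e:
        raise ValueError(f"Plan is not valid JSON: {e}")
    steps = plan.get("steps")
    if not isinstance(steps, list) or not steps:
        raise ValueError("Plan must contain a non-empty 'steps' list")
    ids = {step.get("id") for step in steps}
    for step in steps:
        if "id" not in step or "action" not in step:
            raise ValueError("Each step needs an 'id' and an 'action'")
        for dep in step.get("dependencies", []):
            if dep == step["id"]:
                raise ValueError(f"Step {step['id']} depends on itself")
            if dep not in ids:
                raise ValueError(f"Step {step['id']} depends on unknown step {dep}")
    return steps

steps = parse_and_validate_plan(
    '{"steps": [{"id": 1, "action": "build", "dependencies": []},'
    ' {"id": 2, "action": "deploy", "dependencies": [1]}]}'
)
# → two valid steps; a hallucinated dependency like [99] would raise ValueError
```

On validation failure, the caller can re-prompt the LLM with the error message, which often repairs the plan in one retry.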
Comparing Advanced Algorithms
| Algorithm | Type | Optimality | Cost | Best For |
|---|---|---|---|---|
| A* | Heuristic search | Optimal (with admissible h) | O(b^d) time and space | Known state spaces with cost optimization |
| MCTS | Simulation-based | Converges to optimal | O(n_iterations) per decision | Uncertain outcomes, game-like scenarios |
| LLM Planning | Generative | No guarantee | Per-token API cost | Novel situations, natural language goals |
| BFS/DFS | Uninformed search | Complete (BFS) / Not guaranteed (DFS) | O(b^d) / O(b*d) | Simple state spaces, proof of concept |
Integrating Planning with Memory
Planning does not happen in isolation. Production agents combine planning with memory systems to improve plan quality over time. This integration is an important NCP-AAI concept that bridges the Planning and Memory domains.
Short-Term Memory for Active Plans
During plan execution, the agent maintains a working memory of the current plan state, completed steps, pending steps, and intermediate results:
class PlanningWithMemory:
    def __init__(self, planner, executor, memory):
        self.planner = planner
        self.executor = executor
        self.memory = memory  # Working memory

    def execute_plan(self, goal):
        plan = self.planner.create_plan(goal)
        # Store plan in working memory
        self.memory.store("current_plan", plan)
        self.memory.store("completed_steps", [])
        self.memory.store("plan_status", "in_progress")
        # While loop (not a for loop) so replanning can swap in a new plan
        while plan:
            # Check if plan needs revision based on new information
            if self.memory.get("needs_replan"):
                completed = self.memory.get("completed_steps")
                plan = self.planner.replan(goal, completed)
                self.memory.store("current_plan", plan)
                self.memory.store("needs_replan", False)
            step = plan.pop(0)
            result = self.executor.run(step)
            self.memory.append("completed_steps", (step, result))
            # Store observations for future planning
            self.memory.store_observation(step, result)
        self.memory.store("plan_status", "completed")
Long-Term Memory for Plan Reuse
Successful plans can be cached and retrieved for similar future tasks, dramatically reducing planning latency:
class PlanMemoryStore:
    def __init__(self, vector_store):
        self.vector_store = vector_store

    def store_successful_plan(self, goal, plan, outcome):
        """Store a successful plan for future retrieval"""
        embedding = self.embed(goal)
        self.vector_store.add(
            embedding=embedding,
            metadata={"goal": goal, "plan": plan, "outcome": outcome}
        )

    def retrieve_similar_plan(self, new_goal, threshold=0.85):
        """Find a previously successful plan for a similar goal"""
        embedding = self.embed(new_goal)
        results = self.vector_store.search(embedding, top_k=3)
        for result in results:
            if result.similarity >= threshold:
                return result.metadata["plan"]  # Reuse with adaptation
        return None  # No similar plan found; create a new plan
This pattern -- plan memoization via semantic similarity -- is especially powerful for enterprise agents that handle recurring types of requests. On the NCP-AAI exam, questions about improving planning efficiency often have "plan caching" or "plan reuse" as the correct answer.
Hierarchical Planning: Plan-and-Execute Pattern
The Plan-and-Execute pattern separates strategic planning (slow, thoughtful) from tactical execution (fast, reactive). A high-level planner creates the overall strategy, then a ReAct agent executes each step.
class HierarchicalPlanner:
    def __init__(self, planner_llm, executor_agent):
        self.planner = planner_llm
        self.executor = executor_agent

    def solve(self, goal):
        # Step 1: Generate high-level plan
        plan_prompt = f"""
        Goal: {goal}
        Create a step-by-step plan to achieve this goal.
        Each step should be a concrete, executable subgoal.
        Plan:
        """
        plan = self.planner.predict(plan_prompt)
        steps = self._parse_plan(plan)

        # Step 2: Execute each step with ReAct agent
        # (index-based loop so replanning can rewrite the remaining steps)
        results = []
        i = 0
        while i < len(steps):
            step = steps[i]
            print(f"Executing Step {i+1}: {step}")
            try:
                result = self.executor.run(step)
                results.append({"step": step, "result": result, "status": "success"})
            except Exception as e:
                results.append({"step": step, "error": str(e), "status": "failed"})
                # Replan from current state
                remaining = self._replan(goal, results, steps[i+1:])
                steps = steps[:i+1] + remaining
            i += 1

        # Step 3: Synthesize final answer
        synthesis_prompt = f"""
        Goal: {goal}
        Execution Results: {results}
        Synthesize a final answer:
        """
        return self.planner.predict(synthesis_prompt)
For the NCP-AAI exam, know the three NeMo Agent Toolkit planning strategies: REACT (reasoning + actions, most flexible), COT (reasoning only, no tool calls), and DIRECT (fastest, no planning overhead). The exam tests your ability to select the appropriate strategy based on task requirements.
NeMo Agent Toolkit ReAct Configuration
# ReAct agent with tools
from nemo_agent import Agent, PlanningStrategy, Tool
search_tool = Tool(
name="wikipedia_search",
description="Search Wikipedia for factual information",
endpoint="https://api.wikipedia.org/search"
)
agent = Agent(
model="nvidia/llama-3-70b-nemo",
planning_strategy=PlanningStrategy.REACT,
tools=[search_tool],
max_planning_steps=10,
planning_timeout=30
)
result = agent.run("What year was the Eiffel Tower built and how tall is it?")
# Agent uses ReAct loop: Thought → Action (search) → Observation → Thought → Answer
NVIDIA AIQ Toolkit Agent Graph
Visualize and execute agent workflows as directed graphs with conditional edges:
NCP-AAI Exam Focus: Understand conditional edges (branching logic based on node outputs) and how they enable error recovery through fallback paths.
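The AIQ Toolkit API itself is not reproduced here, but the concept of conditional edges with a fallback path can be sketched with a generic graph executor (all node and edge names below are hypothetical):

```python
def run_graph(nodes, edges, start, state):
    """Execute a directed graph of callables. edges maps a node name to a
    routing function of that node's output, returning the next node (or None).
    A conditional edge routes to a fallback node when a step fails."""
    current = start
    while current is not None:
        output = nodes[current](state)
        state[current] = output
        current = edges[current](output)
    return state

nodes = {
    "fetch":    lambda s: {"ok": False, "error": "timeout"},  # simulate a failure
    "fallback": lambda s: {"ok": True, "data": "cached"},
    "answer":   lambda s: ("answered from " + s["fallback"]["data"])
                          if "fallback" in s else "answered from live data",
}
edges = {
    "fetch":    lambda out: "answer" if out["ok"] else "fallback",  # conditional edge
    "fallback": lambda out: "answer",
    "answer":   lambda out: None,  # terminal node
}
result = run_graph(nodes, edges, start="fetch", state={})
# → result["answer"] == "answered from cached"
```

The conditional edge on "fetch" is exactly the error-recovery mechanism the exam asks about: a failed node output routes execution to a fallback path instead of terminating the workflow.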
LangChain PlanAndExecute Agent
from langchain_experimental.plan_and_execute import PlanAndExecute, load_agent_executor, load_chat_planner
planner = load_chat_planner(llm)
executor = load_agent_executor(llm, tools, verbose=True)
agent = PlanAndExecute(planner=planner, executor=executor)
result = agent.run("Book a flight from SF to NYC and reserve a hotel")
LlamaIndex Workflow Engine
from llama_index.core import Workflow
workflow = Workflow()
workflow.add_step("research", research_agent)
workflow.add_step("plan", planning_agent)
workflow.add_step("execute", execution_agent)
workflow.add_dependency("research", "plan")
workflow.add_dependency("plan", "execute")
result = workflow.run(input="Analyze competitor products and recommend strategy")
Best Practices for Production Planning Systems
Eight Rules for Production Planning
Set planning timeouts to prevent infinite search loops
Cache common plans for repeated tasks (plan memoization)
Implement plan validation before execution (check preconditions)
Use hierarchical planning for complex multi-step tasks
Enable replanning for dynamic environments (continual planning)
Monitor plan execution and log decisions for debugging
Balance planning time vs. execution quality -- do not over-plan simple tasks
Add fallback strategies for when primary plans fail
Planning Performance Optimization
class OptimizedPlanner:
    def __init__(self):
        self.plan_cache = {}
        self.timeout = 5.0  # seconds
        self.max_retries = 3

    def plan_with_optimizations(self, goal):
        # 1. Check cache first
        cache_key = self._normalize_goal(goal)
        if cache_key in self.plan_cache:
            cached_plan = self.plan_cache[cache_key]
            if self.is_still_valid(cached_plan):
                return cached_plan
        # 2. Plan with timeout
        plan = self._timeout_call(self.plan, args=(goal,), timeout=self.timeout)
        # 3. Validate before caching
        if plan and self.validate_plan(plan):
            self.plan_cache[cache_key] = plan
        return plan

    def validate_plan(self, plan):
        """Check preconditions and executability"""
        for step in plan:
            if not self.check_preconditions(step):
                return False
        return True
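The `_timeout_call` helper used above is not defined in the snippet. One possible standard-library sketch (a simplification: Python threads cannot be forcibly killed, so the planner function should be side-effect free or cooperatively cancellable):

```python
import concurrent.futures

def timeout_call(fn, args=(), timeout=5.0):
    """Run fn(*args) with a wall-clock timeout; return None on expiry."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fn, *args)
        try:
            return future.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            # Note: the worker thread keeps running in the background until
            # fn returns; only the caller stops waiting for it.
            return None
```

Returning None on timeout lets the caller fall back to a cached plan or a simpler planning strategy instead of blocking indefinitely.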
Common Planning Pitfalls
Avoid these mistakes both in production systems and on the NCP-AAI exam:
| Pitfall | Description | Solution |
|---|---|---|
| Over-planning | Spending too much time planning vs. executing | Set timeouts; use DIRECT mode for simple tasks |
| Ignoring uncertainty | Assuming the environment is static | Use continual planning or ReAct |
| No replanning | Failing to adapt when plans fail | Add fallback strategies and error recovery |
| Invalid preconditions | Assuming preconditions that do not hold | Validate preconditions before each step |
| Brittle plans | Plans that fail on minor deviations | Build in tolerance and alternative paths |
| Infinite loops | Circular dependencies or goal conflicts | Set max iterations and detect cycles |
| Wrong strategy | Using ToT for simple tasks or CoT for tool-heavy tasks | Match the strategy to the task using the decision framework |
Handling planning failures, retries, and fallbacks
Agent Architecture (15%):
Choosing the right planning strategy for the scenario
Multi-agent planning (centralized vs. decentralized)
Hierarchical plan-and-execute patterns
A* and MCTS for optimization problems
Common NCP-AAI Exam Traps for Planning Questions
Exam Traps to Avoid
Trap 1: CoT for tool-heavy tasks. If the scenario involves API calls, database queries, or external tools, CoT alone is always wrong. CoT cannot execute actions.
Trap 2: ReAct is always best. ReAct is the default, but not always optimal. For pure reasoning (math, logic), CoT is cheaper and faster. For problems requiring exploration of alternatives, ToT is superior.
Trap 3: Over-planning in dynamic environments. If the question describes a changing environment, do not choose a fully upfront planning approach. Continual planning or ReAct with replanning is correct.
Trap 4: Ignoring HTN for structured workflows. When the domain has clear task hierarchies (e.g., "deploy software," "book travel"), HTN is the intended answer, not generic LLM planning.
Trap 5: Confusing POP with sequential planning. If the question asks about maximizing parallelism or minimizing total execution time, partial-order planning is the answer, not forward planning.
Practice Questions
Hands-On Practice Scenarios
Scenario 1: Travel Agent Planner
Scenario 2: Dynamic Warehouse Robot
Scenario 3: Multi-Agent Task Force
Scenario 4: Error Recovery in Customer Support
Error Recovery and Replanning Strategies
Error recovery is a critical planning capability tested on the NCP-AAI exam. When a plan step fails, the agent must decide between several recovery strategies:
Recovery Strategy Hierarchy
Plan Step Fails
│
├── 1. Retry (transient error?)
│ └── Same action, same parameters, up to N retries
│
├── 2. Alternative action (same goal, different method)
│ └── Try a different tool/API that achieves the same sub-goal
│
├── 3. Partial replan (adjust remaining steps)
│ └── Keep completed steps, replan from current state
│
├── 4. Full replan (start over with new strategy)
│ └── Discard current plan, create entirely new approach
│
└── 5. Graceful degradation (reduce scope)
└── Achieve a subset of the original goal, inform user of limitations
Implementation Pattern
class ResilientPlanner:
    def execute_with_recovery(self, goal, plan):
        for i, step in enumerate(plan):
            success = False
            # Level 1: Retry
            for attempt in range(self.max_retries):
                result = self.execute_action(step)
                if result.success:
                    success = True
                    break
            if not success:
                # Level 2: Alternative action
                alternatives = self.get_alternative_actions(step)
                for alt_action in alternatives:
                    result = self.execute_action(alt_action)
                    if result.success:
                        success = True
                        break
            if not success:
                # Level 3: Partial replan
                completed = plan[:i]
                remaining_goal = self.compute_remaining_goal(goal, completed)
                new_plan = self.planner.replan(
                    current_state=self.get_state(),
                    goal=remaining_goal,
                    constraints={"avoid": [step]}  # Don't repeat failed approach
                )
                if new_plan:
                    return self.execute_with_recovery(remaining_goal, new_plan)
                # Level 5: Graceful degradation
                return self.degrade_gracefully(goal, completed)
        return "Goal achieved"
Error Recovery Exam Scenarios
The NCP-AAI frequently tests error recovery with scenarios like:
Database timeout: Agent switches to cached data (fallback strategy)
API rate limit: Agent queues requests and implements exponential backoff (retry with delay)
Tool unavailable: Agent uses an alternative tool that provides similar functionality (alternative action)
Partial results: Agent adjusts plan to work with incomplete data (graceful degradation)
Conflicting information: Agent adds a verification step before proceeding (plan amendment)
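The rate-limit scenario (retry with exponential backoff and jitter) can be sketched as follows; `TransientError` is a stand-in for whatever exception your API client raises on a 429 or a timeout:

```python
import random
import time

class TransientError(Exception):
    """e.g. an HTTP 429 rate limit or a network timeout"""

def retry_with_backoff(action, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry a transient-failure-prone action with exponential backoff.
    Delays grow base_delay, 2x, 4x, ... capped at max_delay, plus jitter
    so many agents hitting the same API do not retry in lockstep."""
    for attempt in range(max_retries):
        try:
            return action()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # retries exhausted; escalate to replanning
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, delay * 0.1))  # jitter
```

If the final retry also fails, the exception propagates so the planner can move down the recovery hierarchy (alternative action, partial replan, or graceful degradation).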
Key Concept
On the NCP-AAI exam, error recovery questions typically ask: "What planning feature enables recovery when step X fails?" The answer framework is: (1) retry for transient errors, (2) conditional branching/fallback for known failure modes, (3) replanning for unexpected failures, and (4) graceful degradation when no recovery is possible.
Recommended Study Path for Planning Topics
Week-by-Week Approach
Week 1: Foundations
Learn CoT, ReAct, and ToT frameworks and their research origins
Implement basic CoT prompts (zero-shot and few-shot)
Understand when to use each strategy (decision framework)
Week 2: Classical Planning
Study forward, backward, HTN, POP, and continual planning
Implement a simple HTN planner for a real-world domain (trip planning, software deployment)
Understand A* optimality guarantees and MCTS exploration-exploitation trade-off
Week 3: NVIDIA Tools and Production Patterns
Study NeMo Agent Toolkit planning modules (REACT, COT, DIRECT)
Learn AIQ Toolkit agent graph configuration with conditional edges
Practice hierarchical plan-and-execute patterns with LangChain
Week 4: Integration and Exam Practice
Study planning + memory integration patterns
Practice error recovery and replanning scenarios
Take Preporato practice tests focused on planning questions
Review common exam traps and distractor patterns
Week 5: Multi-Agent and Advanced Topics
Study centralized vs. decentralized multi-agent planning
Review MCTS UCB1 formula and A* heuristic properties
Practice code-based questions on LangChain, NeMo Agent Toolkit, and custom planners
Summary
Planning strategies determine an agent's problem-solving capability. Here is the complete hierarchy of what the NCP-AAI exam tests:
LLM-Native Strategies:
Chain-of-Thought: Step-by-step reasoning for logic-heavy, action-free tasks (lowest cost)
ReAct: Interleaved reasoning and actions for dynamic, tool-based tasks (NCP-AAI favorite)
Tree of Thoughts: Multi-path exploration for strategic, optimization problems (highest cost)
Classical Planning Approaches:
Forward Planning: Intuitive start-to-goal search for small state spaces
Backward Planning: Goal-directed regression for well-defined preconditions
HTN Planning: Hierarchical decomposition for structured workflows (most common in production)
Partial-Order Planning: Flexible ordering for parallelizable tasks
Continual Planning: Interleaved planning and execution for dynamic environments
Advanced Algorithms:
A*: Optimal pathfinding with admissible heuristics
MCTS: Simulation-based search with UCB1 exploration-exploitation balance
NVIDIA Tools:
NeMo Agent Toolkit: REACT, COT, and DIRECT planning modules
AIQ Toolkit: Agent graphs with conditional edges for workflow orchestration
Key Takeaways Checklist
Chain-of-Thought: step-by-step reasoning for logic-heavy tasks (Wei et al., 2022)
Zero-Shot CoT: just add "Let us think step by step" (Kojima et al., 2022)
ReAct: interleave reasoning and actions for tool-based tasks (Yao et al., 2022)
Tree of Thoughts: explore multiple paths for strategic planning (Yao et al., 2023)
Five classical approaches: forward, backward, HTN, partial-order, continual
A* planning: optimal pathfinding with f(n) = g(n) + h(n)
MCTS: simulation-based search with UCB1 exploration-exploitation
HTN planning: most common in production agentic AI systems
NeMo Agent Toolkit: REACT, COT, and DIRECT planning strategies
Decision framework: use task requirements to select the right strategy
Common exam traps: CoT for tool tasks, ReAct always best, over-planning in dynamic environments
Ready to Pass the NCP-AAI Exam?
Join thousands who passed with Preporato practice tests