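What is LangGraph?
LangGraph is a workflow orchestration framework designed for LLM applications. Its core principles are:
- Break complex tasks down into states and transitions
- Manage the state transition logic
- Handle the exceptions that arise during task execution
Think of shopping: browse → add to cart → checkout → payment. LangGraph helps us manage this kind of workflow effectively.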
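To make the shopping flow concrete before turning to LangGraph itself, here is a minimal plain-Python sketch of states and allowed transitions. It does not use the LangGraph API; the `TRANSITIONS` table and the `advance` and `walk` helpers are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Set

# Hypothetical transition table for the shopping flow described above.
TRANSITIONS: Dict[str, Set[str]] = {
    "browse": {"add_to_cart"},
    "add_to_cart": {"browse", "checkout"},
    "checkout": {"payment"},
    "payment": set(),  # terminal step
}


def advance(current: str, target: str) -> str:
    """Return the new step if the move is allowed, otherwise raise."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


def walk(start: str, steps: List[str]) -> List[str]:
    """Apply a sequence of moves and return the full path taken."""
    path = [start]
    for target in steps:
        path.append(advance(path[-1], target))
    return path
```

Calling `walk("browse", ["add_to_cart", "checkout", "payment"])` returns the full four-step path, while `advance("browse", "payment")` raises a `ValueError` because that move skips the cart and checkout.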
Core Concepts
1. States
States are like checkpoints in task execution:
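```python
from typing import List, TypedDict

from langgraph.graph import StateGraph


class ShoppingState(TypedDict):
    current_step: str       # current state
    cart_items: List[str]   # cart items
    total_amount: float     # total amount
    user_input: str         # user input


class ShoppingGraph(StateGraph):
    def __init__(self):
        # StateGraph is constructed with the state schema it manages
        super().__init__(ShoppingState)

        # Define states: each node is one step of the shopping flow
        self.add_node("browse", self.browse_products)
        self.add_node("add_to_cart", self.add_to_cart)
        self.add_node("checkout", self.checkout)
        self.add_node("payment", self.payment)

    # Placeholder handlers; each returns a partial state update
    def browse_products(self, state: ShoppingState) -> dict:
        return {"current_step": "browse"}

    def add_to_cart(self, state: ShoppingState) -> dict:
        return {"current_step": "add_to_cart"}

    def checkout(self, state: ShoppingState) -> dict:
        return {"current_step": "checkout"}

    def payment(self, state: ShoppingState) -> dict:
        return {"current_step": "payment"}
```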
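As a sketch of what a node handler might do with this state, the function below adds an item to the cart and updates the total. The `PRICES` catalog and the extra `item` argument are assumptions made for illustration; they are not part of the original example.

```python
from typing import Dict, List, TypedDict


class ShoppingState(TypedDict):
    current_step: str
    cart_items: List[str]
    total_amount: float
    user_input: str


# Hypothetical price catalog, used only for this illustration.
PRICES: Dict[str, float] = {"book": 12.5, "pen": 1.5}


def add_to_cart(state: ShoppingState, item: str) -> ShoppingState:
    """Return a new state with the item added and the total updated."""
    if item not in PRICES:
        raise KeyError(f"unknown item: {item}")
    # Build a fresh dict so the previous state stays untouched
    return {
        **state,
        "current_step": "add_to_cart",
        "cart_items": state["cart_items"] + [item],
        "total_amount": state["total_amount"] + PRICES[item],
    }
```

Returning a new dictionary instead of mutating the input keeps the previous checkpoint intact, which makes a later rollback simpler.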
2. State Transitions
State transitions define the "roadmap" of the task flow:
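```python
class ShoppingController:
    def __init__(self, graph: ShoppingGraph):
        self.graph = graph

    def define_transitions(self):
        # Add transition rules.
        # Note: these edges are unconditional; a predicate such as
        # should_move_to_cart only takes effect once it is wired in
        # as a conditional edge.
        self.graph.add_edge("browse", "add_to_cart")
        self.graph.add_edge("add_to_cart", "browse")
        self.graph.add_edge("add_to_cart", "checkout")
        self.graph.add_edge("checkout", "payment")

    def should_move_to_cart(self, state: ShoppingState) -> bool:
        """Determine if we should transition to the cart state."""
        return "add to cart" in state["user_input"].lower()
```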
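When a node has more than one possible successor, a routing function can choose between them at run time. The sketch below is one way to write such a function in plain Python; the name `route_after_cart` and the keyword matching are assumptions for illustration.

```python
from typing import Dict


def route_after_cart(state: Dict[str, str]) -> str:
    """Pick the next node name from the user's latest input."""
    text = state["user_input"].lower()
    if "checkout" in text or "check out" in text:
        return "checkout"
    # Default: keep shopping
    return "browse"
```

A function of this shape is what LangGraph's `add_conditional_edges` expects as its routing callable: it receives the current state and returns the name of the next node.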
3. State Persistence
To keep the system reliable, we need to persist state information:
```python
import json
from typing import Optional

import redis


class StateManager:
    def __init__(self):
        self.redis_client = redis.Redis()

    def save_state(self, session_id: str, state: dict) -> None:
        """Save state to Redis."""
        self.redis_client.set(
            f"shopping_state:{session_id}",
            json.dumps(state),
            ex=3600,  # 1 hour expiration
        )

    def load_state(self, session_id: str) -> Optional[dict]:
        """Load state from Redis; returns None if missing or expired."""
        state_data = self.redis_client.get(f"shopping_state:{session_id}")
        return json.loads(state_data) if state_data else None
```
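For local development or unit tests where no Redis server is available, an in-memory store with the same two methods can stand in. The class below is a hypothetical sketch, not part of the original design; it keeps the JSON round trip so that serialization problems still surface early.

```python
import json
import time
from typing import Dict, Optional, Tuple


class InMemoryStateManager:
    """Hypothetical stand-in for a Redis-backed store, for local tests."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl_seconds = ttl_seconds
        # key -> (expiry timestamp, JSON payload)
        self._store: Dict[str, Tuple[float, str]] = {}

    def save_state(self, session_id: str, state: dict) -> None:
        """Serialize the state and remember when it should expire."""
        key = f"shopping_state:{session_id}"
        expires_at = time.monotonic() + self.ttl_seconds
        self._store[key] = (expires_at, json.dumps(state))

    def load_state(self, session_id: str) -> Optional[dict]:
        """Return the saved state, or None if missing or expired."""
        key = f"shopping_state:{session_id}"
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, payload = entry
        if time.monotonic() >= expires_at:
            # Expired: drop the entry, mirroring TTL-based eviction
            del self._store[key]
            return None
        return json.loads(payload)
```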
4. Error Recovery Mechanism
Any step can fail, and we need to handle these cases gracefully:
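```python
import asyncio


class ErrorHandler:
    def __init__(self):
        self.max_retries = 3

    async def with_retry(self, func, state: dict):
        """Function execution with retry mechanism."""
        retries = 0
        while retries < self.max_retries:
            try:
                return await func(state)
            except Exception as e:
                retries += 1
                if retries == self.max_retries:
                    return self.handle_final_error(e, state)
                await self.handle_retry(e, state, retries)

    async def handle_retry(self, error, state: dict, retries: int):
        """Placeholder: wait before the next attempt (exponential backoff)."""
        await asyncio.sleep(2 ** retries)

    def handle_final_error(self, error, state: dict):
        """Handle final error."""
        # Save error state
        state["error"] = str(error)
        # Roll back to last stable state
        return self.rollback_to_last_stable_state(state)

    def rollback_to_last_stable_state(self, state: dict):
        """Placeholder: return the state as-is until a real rollback exists."""
        return state
```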
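The retry idea can be tried out in isolation with a deliberately flaky step. The sketch below is a standalone variant, not the class above: `with_retry` re-raises after the last attempt instead of rolling back, and `flaky_step`, `calls`, and the short `base_delay` are assumptions chosen so the example runs quickly.

```python
import asyncio
from typing import Awaitable, Callable


async def with_retry(
    func: Callable[[dict], Awaitable[dict]],
    state: dict,
    max_retries: int = 3,
    base_delay: float = 0.01,
) -> dict:
    """Call func(state), retrying with exponential backoff on failure."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(1, max_retries + 1):
        try:
            return await func(state)
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts: let the caller decide what to do
            # Delay doubles after each failed attempt
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


async def flaky_step(state: dict) -> dict:
    """Hypothetical node that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {**state, "done": True}


result = asyncio.run(with_retry(flaky_step, {"step": "payment"}))
```

Here the third attempt succeeds, so `result` carries the original state plus `"done": True`.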
Real-World Example: Intelligent Customer Service System
Let's look at a practical example, an intelligent customer service system:
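```python
import json
from typing import List, Optional, TypedDict

from langgraph.graph import END, START, StateGraph


class CustomerServiceState(TypedDict):
    conversation_history: List[str]
    current_intent: Optional[dict]  # parsed intent JSON, None until analyzed
    user_info: dict
    resolved: bool


class CustomerServiceGraph(StateGraph):
    def __init__(self, llm=None):
        super().__init__(CustomerServiceState)
        # llm: any client exposing an async generate(prompt=...) method
        self.llm = llm

        # Initialize states
        self.add_node("greeting", self.greet_customer)
        self.add_node("understand_intent", self.analyze_intent)
        self.add_node("handle_query", self.process_query)
        self.add_node("confirm_resolution", self.check_resolution)

        # Linear flow for this example
        self.add_edge(START, "greeting")
        self.add_edge("greeting", "understand_intent")
        self.add_edge("understand_intent", "handle_query")
        self.add_edge("handle_query", "confirm_resolution")
        self.add_edge("confirm_resolution", END)

    async def greet_customer(self, state: CustomerServiceState):
        """Greet customer."""
        response = await self.llm.generate(
            prompt=f"""
            Conversation history: {state['conversation_history']}
            Task: Generate appropriate greeting
            Requirements:
            1. Maintain professional friendliness
            2. Acknowledge returning customers
            3. Ask how to help
            """
        )
        state['conversation_history'].append(f"Assistant: {response}")
        return state

    async def analyze_intent(self, state: CustomerServiceState):
        """Understand user intent."""
        response = await self.llm.generate(
            prompt=f"""
            Conversation history: {state['conversation_history']}
            Task: Analyze user intent
            Output format:
            {{
                "intent": "refund/inquiry/complaint/other",
                "confidence": 0.95,
                "details": "specific description"
            }}
            """
        )
        state['current_intent'] = json.loads(response)
        return state

    async def process_query(self, state: CustomerServiceState):
        """Placeholder: handle the query according to the detected intent."""
        return state

    async def check_resolution(self, state: CustomerServiceState):
        """Placeholder: confirm with the customer that the issue is resolved."""
        return state
```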
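Because the intent comes back as free text from a model, the `json.loads` call can fail or return an unexpected shape. The helper below is a sketch of one defensive way to parse that reply; `parse_intent`, `ALLOWED_INTENTS`, and the fallback values are assumptions for illustration, not part of the original system.

```python
import json

# Intent labels taken from the prompt's output format above.
ALLOWED_INTENTS = {"refund", "inquiry", "complaint", "other"}


def parse_intent(response: str) -> dict:
    """Parse the model's JSON reply, falling back to 'other' on bad output."""
    fallback = {"intent": "other", "confidence": 0.0, "details": ""}
    try:
        data = json.loads(response)
    except json.JSONDecodeError:
        return fallback
    if not isinstance(data, dict):
        return fallback
    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        return fallback
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        # Non-numeric confidence: treat it as unknown
        confidence = 0.0
    return {
        "intent": intent,
        "confidence": confidence,
        "details": str(data.get("details", "")),
    }
```

With a helper like this, a malformed reply degrades to the `"other"` intent instead of raising inside the node.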
Usage
```python
# Initialize system
graph = CustomerServiceGraph()
app = graph.compile()  # a StateGraph must be compiled before it can run
state_manager = StateManager()
error_handler = ErrorHandler()


async def handle_customer_query(user_id: str, message: str):
    # Load or create state
    state = state_manager.load_state(user_id) or {
        "conversation_history": [],
        "current_intent": None,
        "user_info": {},
        "resolved": False,
    }

    # Add user message
    state["conversation_history"].append(f"User: {message}")

    # Execute state machine flow
    try:
        result = await app.ainvoke(state)
    except Exception:
        # Retry through the error handler if the first run fails
        result = await error_handler.with_retry(app.ainvoke, state)

    # Save state
    state_manager.save_state(user_id, result)
    return result["conversation_history"][-1]
```
Best Practices
- State design principles
  - Keep states simple and clear
  - Store only necessary information
  - Consider serialization requirements
- Transition logic optimization
  - Use conditional transitions
  - Avoid infinite loops
  - Set a maximum step limit
- Error handling strategy
  - Implement graceful degradation
  - Log detailed information
  - Provide a rollback mechanism
- Performance optimization
  - Use asynchronous operations
  - Implement state caching
  - Control state size
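The advice to avoid infinite loops and set a maximum step limit can be sketched in plain Python as follows. The names `run_flow` and `StepLimitExceeded` and the two rule tables are hypothetical. LangGraph offers a comparable guard through the `recursion_limit` entry of the run config.

```python
from typing import Callable, Dict, List


class StepLimitExceeded(RuntimeError):
    """Raised when a flow does not finish within the allowed steps."""


def run_flow(
    start: str,
    next_step: Callable[[str], str],
    end: str = "payment",
    max_steps: int = 10,
) -> List[str]:
    """Follow next_step from start until end, taking at most max_steps moves."""
    path = [start]
    while path[-1] != end:
        steps_taken = len(path) - 1
        if steps_taken >= max_steps:
            raise StepLimitExceeded(f"no exit after {max_steps} steps: {path}")
        path.append(next_step(path[-1]))
    return path


# A rule set that terminates, and one that loops forever without the guard.
GOOD_RULES: Dict[str, str] = {
    "browse": "add_to_cart",
    "add_to_cart": "checkout",
    "checkout": "payment",
}
LOOPING_RULES: Dict[str, str] = {"browse": "add_to_cart", "add_to_cart": "browse"}

finished = run_flow("browse", GOOD_RULES.__getitem__)
```

Running the same function with `LOOPING_RULES.__getitem__` raises `StepLimitExceeded` instead of spinning between the browse and cart steps indefinitely.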
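Common Pitfalls and Solutions
- State explosion
  - Problem: too many states make maintenance difficult
  - Solution: merge similar states and compose existing states instead of creating new ones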
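- Deadlock situations
  - Problem: circular state transitions cause the task to hang
  - Solution: add a timeout mechanism and forced exit conditions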
- State consistency
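For the deadlock pitfall listed above, a timeout with a forced exit might look like the following sketch built on `asyncio.wait_for`. The helper `run_with_timeout`, the `stuck_step` node, and the `"error": "timeout"` marker are assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_timeout(
    step: Callable[[dict], Awaitable[dict]],
    state: dict,
    timeout_seconds: float,
) -> dict:
    """Run one step, forcing an exit if it exceeds the time budget."""
    try:
        return await asyncio.wait_for(step(state), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Forced exit: record the failure instead of hanging forever
        return {**state, "error": "timeout"}


async def stuck_step(state: dict) -> dict:
    """Hypothetical step that takes far longer than its budget."""
    await asyncio.sleep(1.0)
    return state


result = asyncio.run(run_with_timeout(stuck_step, {"step": "checkout"}, 0.05))
```

The stuck step is cancelled once the budget runs out, and the caller receives the original state with an error marker that downstream logic can route on.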
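Summary
LangGraph state machines provide a powerful solution for managing complex AI agent task flows:
- Clear task flow management
- Reliable state persistence
- Comprehensive error handling
- Flexible extensibility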