In AI engineering, people love asking which model is smarter. The better question is usually simpler: how much code did you have to write to make that model useful?
This week, I rebuilt the same shopping assistant twice for my ReAct project from a Udemy course. The first version uses LangChain abstractions. The second version uses raw Ollama SDK calls with manual tool schemas and provider-native message formatting. The model behavior is almost identical. The developer experience is not.
Before comparing lines of code, it helps to define what this agent is actually doing. The loop both versions follow is the ReAct cycle, short for Reason + Act. The model reasons about the user request, decides whether it needs a tool, acts by calling that tool, reads the result, and then reasons again. This cycle repeats until the model has enough verified information to answer. In plain terms, the agent thinks, does, checks, and thinks again.
That ReAct cycle is why both implementations feel similar at a high level. In both, we start with system and user context, ask the model for the next step, detect tool calls, execute one tool call, append the observation back to the conversation, and continue iterating. If no tool call is returned, we exit with the final answer. The algorithm is the same. What changes is how much scaffolding you manage manually.
Here is the shared ReAct loop shape from both versions:
# LangChain (core loop shape)
for iteration in range(1, MAX_ITERATIONS + 1):
    ai_message = llm_with_tools.invoke(messages)
    tool_calls = ai_message.tool_calls
    if not tool_calls:
        return ai_message.content
    tool_call = tool_calls[0]
    tool_name = tool_call.get("name")
    tool_args = tool_call.get("args", {})
    observation = tools_dict[tool_name].invoke(tool_args)
    messages.append(ai_message)
    messages.append(ToolMessage(content=str(observation), tool_call_id=tool_call.get("id")))
# Raw SDK (core loop shape)
for iteration in range(1, MAX_ITERATIONS + 1):
    response = ollama_chat_traced(messages=messages)
    ai_message = response.message
    tool_calls = ai_message.tool_calls
    if not tool_calls:
        return ai_message.content
    tool_call = tool_calls[0]
    tool_name = tool_call.function.name
    tool_args = tool_call.function.arguments
    observation = tools_dict[tool_name](**tool_args)
    messages.append(ai_message)
    messages.append({"role": "tool", "content": str(observation)})
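To make that shared shape concrete without either SDK installed, here is a runnable sketch with a stubbed model standing in for the LLM. `fake_model`, the hard-coded tool, and the message shapes are all illustrative, not part of LangChain or Ollama.

```python
# A runnable sketch of the shared ReAct loop. The "model" is a stub that
# asks for one tool call, then answers from the observation.
MAX_ITERATIONS = 5

def get_product_price(product: str) -> float:
    prices = {"laptop": 1299.99, "headphones": 149.95}
    return prices.get(product, 0)

tools_dict = {"get_product_price": get_product_price}

def fake_model(messages):
    # First turn: request a tool call. Second turn: answer from the result.
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "",
                "tool_calls": [{"name": "get_product_price",
                                "args": {"product": "laptop"}}]}
    observation = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"role": "assistant",
            "content": f"The laptop costs {observation}.",
            "tool_calls": []}

def run_agent(question: str) -> str:
    messages = [{"role": "system", "content": "You are a shopping assistant."},
                {"role": "user", "content": question}]
    for _ in range(MAX_ITERATIONS):
        ai_message = fake_model(messages)
        tool_calls = ai_message["tool_calls"]
        if not tool_calls:                      # no tool needed: final answer
            return ai_message["content"]
        call = tool_calls[0]
        observation = tools_dict[call["name"]](**call["args"])
        messages.append(ai_message)             # keep the assistant turn
        messages.append({"role": "tool", "content": str(observation)})
    return "Max iterations reached."

print(run_agent("How much is the laptop?"))  # → The laptop costs 1299.99.
```

Swapping `fake_model` for a real model call is the only change either real implementation makes to this skeleton.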
The first major difference appears at tool definition time. In LangChain, using @tool means the framework can generate the JSON schema from the function name, type hints, and docstring. In the raw SDK version, you write and maintain that schema yourself. It is not difficult, but it is repetitive, and repetition is where production bugs hide. Rename a parameter in Python and forget to update the schema dictionary, and now you have drift.
Code comparison for tool definitions and schema handling:
# LangChain tools
from langchain.tools import tool
@tool
def get_product_price(product: str) -> float:
    """Look up the price of a product in the catalog."""
    prices = {"laptop": 1299.99, "headphones": 149.95, "keyboard": 89.50}
    return prices.get(product, 0)

@tool
def apply_discount(price: float, discount_tier: str) -> float:
    """Apply a discount tier to a price and return the final price.
    Available tiers: bronze, silver, gold."""
    discount_percentages = {"bronze": 5, "silver": 12, "gold": 23}
    discount = discount_percentages.get(discount_tier, 0)
    return round(price * (1 - discount / 100), 2)
# Raw SDK tools + manual JSON schema
@traceable(run_type="tool")
def get_product_price(product: str) -> float:
    prices = {"laptop": 1299.99, "headphones": 149.95, "keyboard": 89.50}
    return prices.get(product, 0)

@traceable(run_type="tool")
def apply_discount(price: float, discount_tier: str) -> float:
    discount_percentages = {"bronze": 5, "silver": 12, "gold": 23}
    discount = discount_percentages.get(discount_tier, 0)
    return round(price * (1 - discount / 100), 2)

tools_for_llm = [
    {
        "type": "function",
        "function": {
            "name": "get_product_price",
            "description": "Look up the price of a product in the catalog.",
            "parameters": {
                "type": "object",
                "properties": {
                    "product": {
                        "type": "string",
                        "description": "The product name, e.g. 'laptop', 'headphones', 'keyboard'",
                    }
                },
                "required": ["product"],
            },
        },
    }
]
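The drift risk can be engineered away even in the raw path. Here is a minimal sketch of deriving the JSON schema from the function signature itself, roughly what @tool does for you under the hood; `build_tool_schema` and its type map are illustrative helpers, not part of any SDK.

```python
# Derive an OpenAI/Ollama-style tool schema from a function's signature,
# so renaming a parameter in Python cannot drift from the schema dict.
# build_tool_schema is an illustrative helper, not a library function.
import inspect

PY_TO_JSON = {str: "string", float: "number", int: "integer", bool: "boolean"}

def build_tool_schema(fn):
    sig = inspect.signature(fn)
    properties = {}
    required = []
    for name, param in sig.parameters.items():
        # Fall back to "string" for unannotated or exotic parameter types.
        properties[name] = {"type": PY_TO_JSON.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": (fn.__doc__ or "").strip(),
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def get_product_price(product: str) -> float:
    """Look up the price of a product in the catalog."""
    ...

schema = build_tool_schema(get_product_price)
```

The schema now has a single source of truth: the function itself. Rename `product` and the generated schema follows automatically.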
The second difference is message portability. LangChain’s message classes (SystemMessage, HumanMessage, ToolMessage) give you a framework-level format that stays stable even when providers differ underneath. In the raw Ollama approach, you construct role/content dictionaries directly in provider style. That is fine while you stay in one ecosystem, but the migration tax appears the minute you switch providers with slightly different expectations around message or tool payload shape.
Code comparison for messages:
# LangChain messages
messages = [
    SystemMessage(content="You are a helpful shopping assistant..."),
    HumanMessage(content=question),
]
messages.append(ai_message)
messages.append(ToolMessage(content=str(observation), tool_call_id=tool_call_id))
# Raw SDK (Ollama-style) messages
messages = [
    {"role": "system", "content": "You are a helpful shopping assistant..."},
    {"role": "user", "content": question},
]
messages.append(ai_message)
messages.append({"role": "tool", "content": str(observation)})
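To see what the migration tax looks like in code, here is a hedged sketch of the adapter layer a framework gives you for free: one internal message shape, converted to provider-style dicts at the boundary. The `Msg` dataclass is a stand-in for framework message classes, not an actual LangChain type.

```python
# A tiny message-portability adapter: one internal shape, converted to
# provider-style role/content dicts at call time. Msg is illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Msg:
    role: str                         # "system" | "user" | "assistant" | "tool"
    content: str
    tool_call_id: Optional[str] = None

def to_provider_dict(msg: Msg) -> dict:
    d = {"role": msg.role, "content": msg.content}
    # Some providers require the tool_call_id on tool results; others ignore it.
    if msg.role == "tool" and msg.tool_call_id is not None:
        d["tool_call_id"] = msg.tool_call_id
    return d

messages = [
    Msg("system", "You are a helpful shopping assistant..."),
    Msg("user", "How much is the laptop?"),
]
payload = [to_provider_dict(m) for m in messages]
```

Switching providers then means changing `to_provider_dict`, not every call site in the loop.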
The third difference is invocation and execution safety. In LangChain, bind_tools(...).invoke(...) wraps the interaction in framework conventions that are easier to validate, trace, and debug at scale. In the raw path, you call ollama.chat(...) directly and execute tools with argument unpacking (**tool_args). This is flexible and fast for experimentation, but it places more responsibility on your own glue code and error handling discipline.
Code comparison for model call and invocation:
# LangChain call path
llm = init_chat_model(MODEL, model_provider="openai", temperature=0)
llm_with_tools = llm.bind_tools(tools)
ai_message = llm_with_tools.invoke(messages)
# Raw SDK call path
@traceable(name="Ollama Chat", run_type="llm")
def ollama_chat_traced(messages):
    return ollama.chat(model=MODEL, tools=tools_for_llm, messages=messages)
response = ollama_chat_traced(messages=messages)
ai_message = response.message
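A minimal sketch of the error-handling discipline the raw path demands: validate the tool name and guard the `**` unpacking, because the model can hallucinate a tool or a parameter at any time. `execute_tool` is an illustrative helper, not part of the Ollama SDK.

```python
# Defensive tool dispatch for a handmade agent loop: never let a
# hallucinated tool name or argument crash the loop; feed the error
# back to the model as an observation instead.
def execute_tool(tools_dict: dict, tool_name: str, tool_args: dict):
    if tool_name not in tools_dict:
        return f"Error: unknown tool '{tool_name}'"
    try:
        return tools_dict[tool_name](**dict(tool_args))
    except TypeError as exc:
        # Wrong or missing arguments, e.g. a hallucinated parameter name.
        return f"Error calling {tool_name}: {exc}"

def get_product_price(product: str) -> float:
    prices = {"laptop": 1299.99}
    return prices.get(product, 0)

tools = {"get_product_price": get_product_price}

print(execute_tool(tools, "get_product_price", {"product": "laptop"}))  # → 1299.99
print(execute_tool(tools, "get_product_price", {"item": "laptop"}))     # error string
```

Returning the error string as the tool observation gives the model a chance to self-correct on the next iteration instead of aborting the run.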
Then there is tool-call structure itself. LangChain normalizes tool-call data into a consistent dictionary-like interface. Ollama’s SDK returns typed objects with nested attributes. Neither is “wrong,” but one is portability-first and the other is provider-native. If your roadmap includes multiple model providers, that distinction is operational, not academic.
Code comparison for tool-call parsing:
# LangChain
tool_call = tool_calls[0]
tool_name = tool_call.get("name")
tool_args = tool_call.get("args", {})
tool_call_id = tool_call.get("id")
# Raw SDK (Ollama typed object)
tool_call = tool_calls[0]
tool_name = tool_call.function.name
tool_args = tool_call.function.arguments
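If your roadmap does include multiple providers, a small normalizer keeps the loop body provider-agnostic. This is a sketch under the assumption that the two shapes are exactly as shown above; `Function` and `ToolCall` here are stand-ins for Ollama's typed objects, not imports from its SDK.

```python
# Normalize both tool-call shapes to one tuple so the loop never
# branches on the provider. Function/ToolCall are illustrative stand-ins.
from dataclasses import dataclass

@dataclass
class Function:
    name: str
    arguments: dict

@dataclass
class ToolCall:
    function: Function

def normalize_tool_call(tc):
    """Return (name, args, call_id) regardless of provider shape."""
    if isinstance(tc, dict):                   # LangChain-style dict
        return tc.get("name"), tc.get("args", {}), tc.get("id")
    # Ollama-style typed object; no call id is exposed on this path.
    return tc.function.name, tc.function.arguments, None

lc_call = {"name": "get_product_price", "args": {"product": "laptop"}, "id": "call_1"}
ol_call = ToolCall(Function("get_product_price", {"product": "laptop"}))
```

Both `lc_call` and `ol_call` come out as the same `(name, args, call_id)` tuple, which is essentially the normalization LangChain performs for you.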
A related nuance is tool-result correlation. In some ecosystems, especially OpenAI-style tool calling, matching a tool result back to the exact tool call ID is required for correctness. In local sequential flows, you may get away without explicit IDs because calls are handled in order. LangChain smooths this cross-provider mismatch for you. In handmade loops, you own that compatibility layer.
Code comparison for appending tool results:
# LangChain tool result correlation
messages.append(
    ToolMessage(content=str(observation), tool_call_id=tool_call_id)
)
# Raw SDK sequential append
messages.append(
    {"role": "tool", "content": str(observation)}
)
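Here is a sketch of what correct correlation looks like when a provider returns multiple tool calls in a single turn; the message shapes are OpenAI-style and illustrative.

```python
# When one assistant turn requests several tool calls, each result must
# carry the id of the call it answers. Shapes here are illustrative.
tool_calls = [
    {"id": "call_1", "name": "get_product_price", "args": {"product": "laptop"}},
    {"id": "call_2", "name": "get_product_price", "args": {"product": "keyboard"}},
]
prices = {"laptop": 1299.99, "keyboard": 89.50}

results = [
    {"role": "tool",
     "tool_call_id": call["id"],  # ties this result back to its call
     "content": str(prices[call["args"]["product"]])}
    for call in tool_calls
]
```

With ids attached, the order of the appended results no longer matters; without them, a reordered append silently pairs the wrong observation with the wrong call, which is exactly the class of bug the sequential local flow never surfaces.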
Observability follows the same pattern. In raw implementations, you often manually decorate and trace each boundary you care about. In framework-first implementations, tracing hooks are easier to apply consistently across the loop. If you have ever debugged a failing agent run at 1:30 AM, you already know this is not a small quality-of-life detail.
Code comparison for tracing style:
# LangChain loop tracing
@traceable(name="LangChain Agent Loop")
def run_agent(question: str):
    ...

# Raw SDK tracing split across tools + chat + loop
@traceable(run_type="tool")
def get_product_price(...):
    ...

@traceable(run_type="tool")
def apply_discount(...):
    ...

@traceable(name="Ollama Chat", run_type="llm")
def ollama_chat_traced(messages):
    ...

@traceable(name="Ollama Agent Loop")
def run_agent(question: str):
    ...
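If you are not running a tracing backend at all, a minimal stand-in decorator makes the pattern concrete: wrap each boundary and record name, run type, and duration. This is an illustrative fallback with the same call shape, not the real @traceable from LangSmith.

```python
# A minimal stand-in for @traceable: records one entry per traced call.
# Illustrative only; the real decorator ships runs to a tracing backend.
import functools
import time

TRACE_LOG = []

def traceable(name=None, run_type=None):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                TRACE_LOG.append({
                    "name": name or fn.__name__,
                    "run_type": run_type,
                    "seconds": time.perf_counter() - start,
                })
        return wrapper
    return decorator

@traceable(run_type="tool")
def get_product_price(product: str) -> float:
    return {"laptop": 1299.99}.get(product, 0)

get_product_price("laptop")
# TRACE_LOG now holds one entry for the tool call.
```

Even this toy version shows why the raw path is noisier: every boundary you want visibility into needs its own decoration, while a framework-first loop tends to get traced as a unit.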
The verdict is not that raw SDK code is bad. It is actually excellent for learning internals and keeping full control over every moving part. But for teams shipping features under deadlines, LangChain buys back time by removing repetitive plumbing and reducing cross-provider friction.
So yes, both versions run the same ReAct cycle. They produce the same category of result. But one feels like driving, and the other feels like building the car while traffic is already moving.
And sometimes, to be fair, building the car is exactly the point.
References
Udemy course by Eden Marco: LangChain- Agentic AI Engineering with LangChain & LangGraph
LangChain Docs — Tools
https://docs.langchain.com/oss/python/langchain/tools
LangChain Docs — Messages
https://docs.langchain.com/oss/python/langchain/messages
Ollama Docs — Tool Calling
https://docs.ollama.com/capabilities/tool-calling
Optional reference:
OpenAI Function Calling Guide
https://platform.openai.com/docs/guides/function-calling?api-mode=responses
