Building a Stock Market AI Copilot From Scratch: LangChain, Real Data APIs, and a Clickable App

Posted on February 26, 2026 by Mark Harrell

You know that moment when you see a stock drop 8% in a week and you're just sitting there asking yourself — “okay but why?”

That question alone is what most market tools completely fail to answer well. You get charts. You get price tables. You get a news feed that may or may not be related. Then you're expected to piece it all together yourself, usually in a Slack message or a quick note to your team.

That workflow is manual, messy, and impossible to scale.

What if you could just ask a question — like a human question — and get back a short, clean market brief that actually uses real data? Not a hallucinated number. Not a vague ChatGPT response. A structured note, backed by actual prices, fundamentals, and headlines, all generated automatically.

That's exactly what this project builds. A Market Brief Copilot. It's a minimum viable product (MVP) that accepts a natural-language question about a stock, fetches real data through tool-backed APIs, and returns a brief you could paste straight into a Slack channel or a weekly memo.

Let's get into how it works — step by step, in plain English.

What Problem Are We Actually Solving?

Before writing a single line of code, it's worth understanding the actual pain point.

In any product where people care about markets — fintech apps, wealthtech platforms, analyst teams, investment newsletters — there's always someone who needs to answer a simple question like:

“Why did AAPL drop so much last month?”
“How risky was MSFT's ride over the last 90 days?”
“What's been in the news around this ticker lately?”

Right now, answering that requires pulling price data from one place, checking a fundamentals dashboard somewhere else, scanning a news feed, and writing a short summary by hand. It works, but it takes time. And every person formats it differently.

Dashboards can't handle open-ended questions. Pure AI responses can't be trusted unless the numbers come from real data. This project sits right in between — a copilot that calls real data tools, then uses an AI model to write the brief from those tool outputs, not from memory.

That's the core idea. The AI doesn't guess. It reads.

Tools You'll Need Before Starting

This project assumes you have a working Python setup and you've built at least one script that calls an external API. You don't need to be an expert, but basic Python comfort helps a lot.

You'll need:

Python 3.10 or newer
An EODHD API key (this is where the stock data comes from)
An OpenAI API key (for the language model that writes the brief)
A virtual environment set up locally (venv or conda both work fine)

The main libraries used:

EODHD — the data layer. Prices, fundamentals, and news all come from here
OpenAI — the language model that turns tool outputs into a readable brief
LangChain + LangGraph — these handle the “agent” logic, meaning the model can decide which tools to call and when
Streamlit — a quick way to wrap everything into a clickable web app without building a full frontend

How the App Is Structured: Two Files, Two Jobs

One of the cleanest decisions in this project is the separation between the brain and the face of the app.

copilot.py — The Engine

This file does all the actual work:

Defines the data-fetching tools (prices, fundamentals, news, risk)
Sets up the AI agent with rules about how to behave
Exposes one callable function — run_query() — that accepts a question and returns a markdown brief plus a structured data dictionary

If you ever want to swap Streamlit for a different frontend or wrap this in a proper API, you keep this file and replace everything else.

app.py — The Interface

This file only handles what the user sees:

A sidebar with inputs (ticker, query, optional settings)
A two-column layout: brief on the left, raw metrics on the right
No data logic whatsoever — it just calls run_query() and displays what comes back

This split matters more than it sounds. When all your logic lives inside a Streamlit file, you can't reuse it anywhere else. Keeping them separate means the core product behavior lives in one testable, portable place.

Building the Engine: copilot.py

Setting Up Imports and API Keys

Start with a clean set of imports. Nothing excessive — just what the project genuinely needs.

import json
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional, Tuple

import numpy as np
import pandas as pd
import requests
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

eodhd_api_key = 'YOUR EODHD API KEY'
openai_api_key = 'YOUR OPENAI API KEY'

In a real deployed product, those API keys would live in environment variables. For now, putting them at the top of the file keeps things simple while you're building.
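
As a minimal sketch of the environment-variable approach: the helper name load_api_key and the variable names EODHD_API_KEY and OPENAI_API_KEY are illustrative assumptions, not something the project prescribes.

```python
import os


def load_api_key(env_name: str) -> str:
    """Read an API key from the environment and fail early if it is missing."""
    value = os.environ.get(env_name, "").strip()
    if not value:
        # Failing at startup is easier to debug than an auth error mid tool call.
        raise RuntimeError(f"Missing environment variable: {env_name}")
    return value


# Hypothetical variable names; use whatever matches your deployment.
# eodhd_api_key = load_api_key("EODHD_API_KEY")
# openai_api_key = load_api_key("OPENAI_API_KEY")
```

The two assignments are left commented out so the snippet runs even when the variables are not set yet.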

Helper Functions That Keep Everything Sane

Three small utility functions live at the top of the file. They're not glamorous, but they prevent a surprising number of bugs.

normalize_ticker()

People type stock tickers in all kinds of ways — aapl, AAPL, AAPL.US, with trailing spaces, lowercase, whatever. EODHD expects a specific format. This function cleans that up before any API call happens.

def normalize_ticker(t: str) -> str:
    t = (t or "").strip().upper()
    if not t:
        return t
    if "." in t:
        return t
    return f"{t}.US"
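
A few quick checks make the normalization rules concrete. The function is repeated here only so the snippet runs on its own, and the sample inputs are illustrative.

```python
def normalize_ticker(t: str) -> str:
    # Same logic as above, repeated so this snippet is self-contained.
    t = (t or "").strip().upper()
    if not t:
        return t
    if "." in t:
        return t
    return f"{t}.US"


samples = ["aapl", "  msft ", "AAPL.US", "vod.lse", ""]
for s in samples:
    print(repr(s), "->", repr(normalize_ticker(s)))
```

Anything that already contains a dot is only upper-cased, a bare symbol is assumed to be US-listed, and an empty input stays empty, so callers still need to handle that case before hitting the API.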

_safe_json_loads()

When the agent returns tool outputs, the data might come back as a Python dictionary or as a JSON string depending on context. This helper handles both cases without throwing errors.

def _safe_json_loads(x: Any) -> Optional[Any]:
    if x is None:
        return None
    if isinstance(x, (dict, list)):
        return x
    if not isinstance(x, str):
        return None
    try:
        return json.loads(x)
    except Exception:
        return None

get_eod_prices_raw()

Every tool that needs price data uses this function instead of copy-pasting API call logic everywhere. It fetches OHLCV (open, high, low, close, volume) data from EODHD, cleans it up, and returns a neat pandas DataFrame sorted by date.

def get_eod_prices_raw(ticker: str, start: str, end: str) -> pd.DataFrame:
    url = f"https://eodhd.com/api/eod/{ticker}"
    params = {"from": start, "to": end, "api_token": eodhd_api_key, "fmt": "json"}
    r = requests.get(url, params=params, timeout=30)  # avoid hanging forever on a stalled connection
    data = r.json()
    if not isinstance(data, list) or not data:
        return pd.DataFrame(columns=["date", "open", "high", "low", "close", "volume"])
    df = pd.DataFrame(data)
    keep = [c for c in ["date", "open", "high", "low", "close", "volume"] if c in df.columns]
    df = df[keep].copy()
    df["ticker"] = ticker
    df["date"] = pd.to_datetime(df["date"], errors="coerce")
    df = df.dropna(subset=["date", "close"]).sort_values("date").reset_index(drop=True)
    return df

The Data Tools: What the AI Can Actually Call

This is where it gets interesting. Before the AI writes anything, it needs to fetch facts. These tools are the only way it's allowed to do that — no guessing, no making up numbers.

Think of these tools as a small internal API. The agent can see what tools exist, decide which ones are needed for a given question, call them, and use the outputs to build a brief.

Each tool has a narrow, specific job. That's intentional. Narrow tools produce predictable outputs, which makes the brief consistent and the UI easy to build.

Tool 1: last_n_days_prices

The most common starting point for any market question: “what happened recently?”

This tool pulls enough daily price bars to cover the last N trading days, then returns a compact summary — start and end date, first and last close, total return, and how many trading days were included.

@tool
def last_n_days_prices(ticker: str, n: int = 60) -> Dict[str, Any]:
    """
    Quick return window over last N trading days.
    Returns a compact summary. No raw rows.
    """
    ticker = normalize_ticker(ticker)
    today = datetime.utcnow().date()
    end = today.isoformat()
    # Roughly two calendar days per trading day, so larger n values are not cut short.
    start = (today - timedelta(days=max(240, int(n) * 2))).isoformat()
    df = get_eod_prices_raw(ticker, start, end)
    if df.empty:
        return {"ticker": ticker, "error": "no_price_data"}
    df = df.tail(int(n)).reset_index(drop=True)
    if df.empty:
        return {"ticker": ticker, "error": "no_price_data"}
    first_close = float(df.loc[0, "close"])
    last_close = float(df.loc[len(df) - 1, "close"])
    total_return = float((last_close / first_close) - 1.0)
    return {
        "ticker": ticker,
        "n": int(len(df)),  # trading days actually included, which can be fewer than requested
        "start_date": str(df.loc[0, "date"].date()),
        "end_date": str(df.loc[len(df) - 1, "date"].date()),
        "first_close": first_close,
        "last_close": last_close,
        "total_return": total_return,
    }

Notice it doesn't return raw rows. That keeps the agent from dumping a giant price table into its response, which would be messy and expensive in terms of tokens.
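
The return figure itself is plain arithmetic on the first and last close in the window. A small worked example, using the two closes that appear in the Demo 1 output later in this post (the function name total_return is just for illustration):

```python
def total_return(first_close: float, last_close: float) -> float:
    """Simple price return over a window: last / first - 1."""
    if first_close <= 0:
        raise ValueError("first_close must be positive")
    return last_close / first_close - 1.0


# Closes taken from the Demo 1 output: 269.00 -> 248.04
r = total_return(269.00, 248.04)
print(f"{r:.2%}")  # -7.79%
```

This is a price-only return. Dividends are not included, which is why the demo briefs list that as a caveat.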

Tool 2: fundamentals_snapshot

For a good market brief, you usually want at least a rough valuation anchor — something like “is this stock cheap or expensive right now?” without turning the whole thing into a fundamentals report.

This tool fetches EODHD's fundamentals data and pulls a small set of fields that actually matter for a quick brief:

P/E and P/B ratios
Market capitalization
Sector and industry
Beta (how volatile the stock is relative to the market)
Dividend yield and profit margin

@tool
def fundamentals_snapshot(ticker: str) -> Dict[str, Any]:
    """
    Lightweight fundamentals snapshot.
    Returns a flat dict.
    """
    ticker = normalize_ticker(ticker)
    url = f"https://eodhd.com/api/fundamentals/{ticker}"
    params = {"api_token": eodhd_api_key, "fmt": "json"}
    r = requests.get(url, params=params, timeout=30)  # avoid hanging forever on a stalled connection
    data = r.json()
    if not isinstance(data, dict) or not data:
        return {"ticker": ticker, "error": "no_data"}
    highlights = data.get("Highlights", {}) or {}
    general = data.get("General", {}) or {}
    valuation = data.get("Valuation", {}) or {}
    technicals = data.get("Technicals", {}) or {}
    return {
        "ticker": ticker,
        "name": general.get("Name"),
        "sector": general.get("Sector"),
        "industry": general.get("Industry"),
        "market_cap": highlights.get("MarketCapitalization"),
        "pe": valuation.get("TrailingPE"),
        "pb": valuation.get("PriceBookMRQ"),
        "profit_margin": highlights.get("ProfitMargin"),
        "dividend_yield": highlights.get("DividendYield"),
        "beta": technicals.get("Beta"),
    }

Missing fields just return None. The tool never guesses or fills in fake values.

Tool 3: latest_news

A price move without context is just a number. This tool pulls the latest headlines for a ticker from EODHD's financial news API, sorts them by date, and returns a clean list with date, title, link, and source.

@tool
def latest_news(ticker: str, limit: int = 5) -> List[Dict[str, Any]]:
    """
    Latest headlines for a ticker.
    Returns a compact list of dicts.
    """
    ticker = normalize_ticker(ticker)
    url = "https://eodhd.com/api/news"
    params = {"s": ticker, "limit": int(limit), "offset": 0, "api_token": eodhd_api_key, "fmt": "json"}
    r = requests.get(url, params=params, timeout=30)  # avoid hanging forever on a stalled connection
    data = r.json()
    if not isinstance(data, list) or not data:
        return []
    df = pd.DataFrame(data)
    keep = [c for c in ["date", "title", "link", "source"] if c in df.columns]
    df = df[keep].copy()
    if "date" in df.columns:
        df["date"] = pd.to_datetime(df["date"], errors="coerce")
        df = df.sort_values("date", ascending=False)
    out = df.head(int(limit)).reset_index(drop=True).to_dict(orient="records")
    for row in out:
        dt = row.get("date")
        if isinstance(dt, (pd.Timestamp, datetime)):
            row["date"] = dt.isoformat()
    return out

No sentiment analysis, no scoring — just the raw headlines in a predictable shape. The AI model does the interpretation when it writes the brief.

Tool 4: risk_metrics

Sometimes the real question isn't “what was the return?” It's “how rough was it getting there?”

That's what this tool covers. Give it a date range, and it computes:

Annualized volatility — how noisy the daily price swings were, scaled up to a yearly figure
Max drawdown — the biggest peak-to-trough decline inside the window
Total return — so the return and risk numbers are always on the same time frame

@tool
def risk_metrics(ticker: str, start: str, end: str) -> Dict[str, Any]:
    """
    Risk metrics from daily close prices over a window.
    volatility_ann: annualized vol from daily returns
    max_drawdown: max drawdown over the window
    """
    ticker = normalize_ticker(ticker)
    df = get_eod_prices_raw(ticker, start, end)
    if df.empty:
        return {"ticker": ticker, "error": "no_price_data"}
    df = df.sort_values("date").reset_index(drop=True)
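    # Drop the undefined first return instead of filling it with 0, which would understate volatility.
    rets = df["close"].pct_change().dropna()
    vol_ann = float(rets.std(ddof=0) * np.sqrt(252)) if len(rets) > 0 else 0.0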
    cummax = df["close"].cummax()
    dd = (df["close"] / cummax) - 1.0
    max_dd = float(dd.min())
    first_close = float(df.loc[0, "close"])
    last_close = float(df.loc[len(df) - 1, "close"])
    total_return = float((last_close / first_close) - 1.0)
    return {
        "ticker": ticker,
        "start_date": str(df.loc[0, "date"].date()),
        "end_date": str(df.loc[len(df) - 1, "date"].date()),
        "n": int(len(df)),
        "total_return": total_return,
        "volatility_ann": vol_ann,
        "max_drawdown": max_dd,
    }

This tool should only run when someone actually asks for risk metrics. It's an extra API call, and if you let the agent run it by default, you'll slow down every single brief unnecessarily.
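
To make the two risk numbers less abstract, here is a dependency-free sketch of the same arithmetic on a tiny made-up price series. The function name risk_from_closes is hypothetical, and this version computes returns only between consecutive closes.

```python
import math
from typing import List, Tuple


def risk_from_closes(closes: List[float]) -> Tuple[float, float]:
    """Return (annualized volatility, max drawdown) from daily closes."""
    if len(closes) < 2:
        return 0.0, 0.0
    # Daily simple returns: r_t = close_t / close_{t-1} - 1
    rets = [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]
    mean = sum(rets) / len(rets)
    # Population standard deviation (ddof=0), scaled by sqrt(252) trading days.
    var = sum((r - mean) ** 2 for r in rets) / len(rets)
    vol_ann = math.sqrt(var) * math.sqrt(252)
    # Max drawdown: worst close relative to the running peak so far.
    peak = closes[0]
    max_dd = 0.0
    for c in closes:
        peak = max(peak, c)
        max_dd = min(max_dd, c / peak - 1.0)
    return vol_ann, max_dd


vol, mdd = risk_from_closes([100.0, 120.0, 90.0, 110.0])
print(f"max drawdown: {mdd:.2%}")  # -25.00% (the fall from 120 to 90)
print(f"annualized vol: {vol:.2%}")
```

The series ends up 10% above where it started, yet the drawdown is 25%. That gap between end-to-end return and worst decline is exactly what this tool is meant to expose.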

Tool 5: eod_prices (The Escape Hatch)

Most of the time, the four tools above are all you need. But the moment you want to build custom indicators — rolling averages, ATR, pattern detection — you need raw price bars.

@tool
def eod_prices(ticker: str, start: str, end: str) -> List[Dict[str, Any]]:
    """
    Raw OHLCV rows. Use only for custom calculations that cannot be done
    with the compact tools.
    """
    ticker = normalize_ticker(ticker)
    df = get_eod_prices_raw(ticker, start, end)
    return json.loads(df.to_json(orient="records"))

Keep it as a last resort. It returns full daily rows, which means more data being passed to the model. More data means slower runs and higher costs.

Sanity Testing Your Tools

Before setting up the agent, test each tool directly. This is not optional. If your tools return garbage, your briefs will too.

print("\n--- last_n_days_prices ---")
out_price = last_n_days_prices.invoke({"ticker": "AAPL.US", "n": 60})
print(out_price)

print("\n--- fundamentals_snapshot ---")
out_fund = fundamentals_snapshot.invoke({"ticker": "AAPL.US"})
print(out_fund)

print("\n--- latest_news ---")
out_news = latest_news.invoke({"ticker": "AAPL.US", "limit": 5})
print(f"news rows: {len(out_news)}")
print(out_news[:2])

print("\n--- risk_metrics ---")
end = datetime.utcnow().date()
start = (end - timedelta(days=180)).isoformat()
end = end.isoformat()
out_risk = risk_metrics.invoke({"ticker": "AAPL.US", "start": start, "end": end})
print(out_risk)

When this runs correctly, you'll see something like:

last_n_days_prices returning a 60-day window with a clean return figure
fundamentals_snapshot showing P/E, P/B, market cap, beta, sector
latest_news returning 5 structured headline objects
risk_metrics returning volatility and max drawdown over the window

If any field shows None or an error key, that's fine — the tools are designed to communicate failures cleanly instead of crashing.
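
If you would rather not eyeball the printed dictionaries, a small checker can classify each result. This is a sketch: check_tool_output and its status labels are hypothetical helpers, and the sample dictionaries are made up to match the shapes the tools return.

```python
from typing import Any, Dict, List


def check_tool_output(out: Any, required: List[str]) -> Dict[str, Any]:
    """Classify a tool result as ok, error, or incomplete."""
    if not isinstance(out, dict):
        return {"status": "error", "detail": "not_a_dict"}
    if "error" in out:
        return {"status": "error", "detail": out["error"]}
    # A key that is absent or set to None counts as missing.
    missing = [k for k in required if out.get(k) is None]
    if missing:
        return {"status": "incomplete", "detail": missing}
    return {"status": "ok", "detail": None}


# Made-up results in the same shape as the tool outputs.
good = {"ticker": "AAPL.US", "total_return": -0.0779}
failed = {"ticker": "XXXX.US", "error": "no_price_data"}
partial = {"ticker": "AAPL.US", "pe": None, "pb": 49.44}

print(check_tool_output(good, ["ticker", "total_return"]))
print(check_tool_output(failed, ["ticker", "total_return"]))
print(check_tool_output(partial, ["pe", "pb"]))
```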

Creating the Agent

Here's where everything becomes a copilot instead of just a collection of functions.

The agent setup has two parts: the system prompt (which acts as a behavioral contract) and the agent itself (which handles deciding when to call which tool).

The System Prompt

system_prompt = (
    "You are a market brief copilot embedded in a product.\n"
    "Rules:\n"
    "1) Use tools for facts. Never invent numbers.\n"
    "2) Do not dump raw price rows or long news lists.\n"
    "3) If the user didn't ask for something, don't compute it.\n"
    "4) Output in clean Markdown with sections.\n"
    "5) Keep it short and useful, like an internal dashboard note.\n"
    "Tool guidance:\n"
    "- Use last_n_days_prices for return windows.\n"
    "- Use fundamentals_snapshot for PE/PB/market cap/sector/beta.\n"
    "- Use latest_news for headlines.\n"
    "- Use risk_metrics only if asked for vol/drawdown.\n"
    "- Use eod_prices only if absolutely required for custom calcs.\n"
)

This prompt is basically a job description for the agent. Without clear rules like “never invent numbers” and “only compute what was asked,” the model will drift. It'll start guessing returns. It'll add sections nobody asked for. Explicit instructions prevent that.

Building and Wiring the Agent

def _build_agent() -> Any:
    llm = ChatOpenAI(
        model='gpt-4o-mini',
        temperature=0,
        api_key=openai_api_key,
    )
    tools = [last_n_days_prices, fundamentals_snapshot, latest_news, risk_metrics, eod_prices]
    return create_react_agent(model=llm, tools=tools)

AGENT = _build_agent()

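temperature=0 keeps the model from getting creative with the numbers. You want output that is as repeatable as possible here: the same question should produce the same brief structure every time, although hosted models can still vary slightly between runs even at temperature 0.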

Extracting Tool Outputs for the UI

When the agent runs, it calls tools and gets back data. We want to reuse that data in the UI without making duplicate API calls.

This function walks through the LangGraph message list and extracts any tool outputs it finds:

def _extract_artifacts(messages: List[Any]) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    for m in messages:
        name = getattr(m, "name", None)
        content = getattr(m, "content", None)
        if not name:
            continue
        payload = _safe_json_loads(content)
        if payload is None:
            continue
        if name.endswith("last_n_days_prices"):
            out["price"] = payload
        elif name.endswith("fundamentals_snapshot"):
            out["valuation"] = payload
        elif name.endswith("risk_metrics"):
            out["risk"] = payload
        elif name.endswith("latest_news"):
            out["headlines"] = payload
    return out

This is what makes the right-hand panel in the UI possible. The agent already fetched everything it needed to write the brief. We just grab those outputs and pass them to the display layer — no second round of API calls.
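
The extraction logic can be tried without running the agent at all. The sketch below is a simplified variant (exact name matching through a lookup table instead of endswith) and uses stand-in objects that mimic only the name and content attributes. It assumes tool results arrive as JSON strings.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict, List

# Tool name -> key used by the UI (same mapping as _extract_artifacts above).
ARTIFACT_KEYS = {
    "last_n_days_prices": "price",
    "fundamentals_snapshot": "valuation",
    "risk_metrics": "risk",
    "latest_news": "headlines",
}


def extract_artifacts(messages: List[Any]) -> Dict[str, Any]:
    out: Dict[str, Any] = {}
    for m in messages:
        name = getattr(m, "name", None)
        content = getattr(m, "content", None)
        if not name or not isinstance(content, str):
            continue  # not a tool message
        try:
            payload = json.loads(content)
        except ValueError:
            continue  # content is not JSON
        key = ARTIFACT_KEYS.get(name)
        if key:
            out[key] = payload
    return out


# Stand-ins for the agent's message list; only two attributes are mimicked.
fake_messages = [
    SimpleNamespace(name=None, content="User question"),
    SimpleNamespace(
        name="last_n_days_prices",
        content=json.dumps({"ticker": "AAPL.US", "total_return": -0.0779}),
    ),
    SimpleNamespace(name="latest_news", content=json.dumps([{"title": "Example headline"}])),
    SimpleNamespace(name=None, content="### Snapshot ..."),
]
print(extract_artifacts(fake_messages))
```

Messages without a name, such as the user question and the final brief, are skipped, so only tool outputs reach the metrics panel.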

The Callable Backend Function

This is the piece that ties everything together. One input, two outputs.

def run_query(
    query: str,
    default_ticker: str = "AAPL.US",
    default_n_days: int = 60,
    force_fundamentals: bool = True,
    force_risk: bool = False,
    force_news: bool = True,
    news_limit: int = 5,
) -> Tuple[str, Dict[str, Any]]:
    q = (query or "").strip()
    if not q:
        q = f"For {default_ticker}, compute total return over the last {default_n_days} trading days."
    
    constraints = [
        "Constraints:",
        "1) Use tools for facts. Never invent numbers.",
        "2) Do not dump raw price rows or long news lists.",
        "3) Output in clean Markdown with sections: Snapshot, Metrics, What it might mean, Caveats.",
        "4) Keep it short and useful.",
        f"5) If the query does not specify a window, assume last {int(default_n_days)} trading days.",
        f"6) If the query does not specify a ticker, assume {normalize_ticker(default_ticker)}.",
    ]
    
    if force_fundamentals:
        constraints.append("7) You must include fundamentals (PE, PB, market cap, sector, beta).")
    if force_risk:
        constraints.append("8) You must include risk metrics (annualized vol, max drawdown).")
        constraints.append("   Use the same start_date and end_date as the return window.")
    if force_news:
        constraints.append(f"9) You must include headlines. Pull exactly {int(news_limit)}.")
    
    user_prompt = "User query:\n" + q + "\n\n" + "\n".join(constraints)
    
    response = AGENT.invoke(
        {"messages": [("system", system_prompt), ("user", user_prompt)]}
    )
    
    messages = response.get("messages", [])
    final_msg = messages[-1] if messages else None
    brief_md = getattr(final_msg, "content", "") or ""
    artifacts = _extract_artifacts(messages)
    
    return brief_md, artifacts

The function builds a structured prompt from the user's question plus any enforcement constraints. Then it runs the agent, extracts the final brief text, pulls the tool artifacts, and returns both.

brief_md goes on the left side of the app. artifacts goes on the right.
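
The prompt-building step is easy to test in isolation if it is pulled into its own function. The sketch below is a variation rather than the code above: build_user_prompt is a hypothetical name, and it numbers the constraints as they are added so the list has no gaps when a flag is off.

```python
from typing import List


def build_user_prompt(
    query: str,
    default_ticker: str = "AAPL.US",
    default_n_days: int = 60,
    force_fundamentals: bool = True,
    force_risk: bool = False,
    force_news: bool = True,
    news_limit: int = 5,
) -> str:
    """Assemble the user message; optional rules are numbered as they are added."""
    rules: List[str] = [
        "Use tools for facts. Never invent numbers.",
        "Do not dump raw price rows or long news lists.",
        "Output in clean Markdown with sections: Snapshot, Metrics, What it might mean, Caveats.",
        "Keep it short and useful.",
        f"If the query does not specify a window, assume last {int(default_n_days)} trading days.",
        f"If the query does not specify a ticker, assume {default_ticker}.",
    ]
    if force_fundamentals:
        rules.append("You must include fundamentals (PE, PB, market cap, sector, beta).")
    if force_risk:
        rules.append(
            "You must include risk metrics (annualized vol, max drawdown). "
            "Use the same start_date and end_date as the return window."
        )
    if force_news:
        rules.append(f"You must include headlines. Pull exactly {int(news_limit)}.")
    numbered = [f"{i}) {rule}" for i, rule in enumerate(rules, start=1)]
    return "User query:\n" + query.strip() + "\n\nConstraints:\n" + "\n".join(numbered)


print(
    build_user_prompt(
        "How risky was MSFT over the last 90 days?",
        force_fundamentals=False,
        force_risk=True,
        force_news=False,
    )
)
```

With fundamentals and headlines switched off, the risk rule becomes item 7 instead of leaving a gap in the numbering.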

Three Real Examples: What the Output Actually Looks Like

Demo 1 — The Standard Brief

Query: “For AAPL.US, compute total return over the last 60 trading days. Fetch PE and PB. Pull 5 latest headlines. Brief interpretation.”

Output:

### Snapshot
- Window: last 60 trading days (2025-10-28 to 2026-01-23)
- Price path: 269.00 → 248.04
- Total return: -7.79%

### Metrics
- Sector: Technology
- Market cap: $3.665T
- P/E: 33.20
- P/B: 49.44
- Beta: 1.09

### What it might mean
The 60-day horizon shows a ~7.8% decline in a high-valuation tech name.
Headlines point to macro headwinds — tariff concerns, shutdown fears,
and broader market rotation out of mega-caps.

### Caveats
- Data reflect the latest available snapshot
- P/B is elevated; interpret as market pricing in future growth expectations
- This is not investment advice

This is clean. The return is pinned to a specific date range. The valuation fields are there for context but don't dominate. The interpretation is short and honest.

Demo 2 — Risk-Focused Brief

Query: “For MSFT.US, last 90 trading days. Compute annualized volatility and max drawdown. Keep it short. No headlines.”

Output:

## Market Brief - MSFT.US

### Snapshot
- Window: 2025-09-16 to 2026-01-23 (90 trading days)
- Start price: 509.04
- End price: 465.95
- Total return: -8.46%

### Metrics
- Annualized volatility: 19.30%
- Max drawdown: -18.07% (peak-to-trough during the window)

### What it might mean
The -18% drawdown is significantly larger than the end-to-end return of -8.5%.
This tells you the ride was much rougher than the final number suggests.
Elevated vol at ~19% annualized backs that up.

### Caveats
- Returns reflect price only; dividends not included
- Data as of 2026-01-23

This one is for situations where the final return number doesn't tell the whole story. A -8.46% return sounds bad enough. A -18% drawdown somewhere in the middle is a lot harder to stomach. That difference matters for anyone actually holding the position.

Demo 3 — News-Only Context

Query: “For AAPL.US, pull 7 latest headlines. Summarize what changed in 6–8 lines. Reference themes, not every headline. Don't compute returns unless needed.”

Output:

## AAPL.US – Thematic Quick Read

### What changed (themes)
- Macro risk is back in focus: futures drift lower ahead of the Fed meeting
- Policy risk remains a thread: tariffs and government shutdown headlines persist
- Allocation debates: small-cap vs mega-cap discussions signal rotation concerns
- Growth vs value framing is broadening across institutional commentary
- Earnings season is adding collateral volatility to macro noise
- AI narrative is gaining traction with new founder-level product launches

No return metrics, no risk stats — because the question didn't ask for them. This is the “give me context fast” version of the copilot. It clusters headlines into themes instead of listing seven individual items, which is much more useful.

Building the Streamlit Interface

At this stage, the engine is done. The Streamlit app just needs to collect inputs, call run_query(), and display what comes back.

Why Query-First Design

The biggest design choice in the UI is making the query the primary input. Not the ticker. Not the date range. The question.

People don't naturally think “I want to configure a 60-day fundamentals run.” They think “what happened to AAPL over the last two months?” The sidebar should match that mental model.

The ticker and window exist as defaults — fallbacks for when the query doesn't specify them. The optional parameters (force fundamentals, force risk, force headlines) are enforcement knobs for teams that want consistency across every brief.

The App Skeleton

import streamlit as st
import pandas as pd
from copilot import run_query

st.set_page_config(page_title="Market Brief Copilot", layout="wide")
st.title("Market Brief Copilot")
st.caption("LangChain + EODHD. Minimal internal-style brief, with tool-backed metrics.")

layout="wide" is important. You'll have a brief on the left and metrics on the right, and a cramped layout would make that unreadable.

The Inputs Sidebar

with st.sidebar:
    st.header("Inputs")
    query = st.text_area("Query", value="For AAPL.US, compute total return over the last 60 trading days. Fetch PE and PB. Pull 5 latest headlines.")
    default_ticker = st.text_input("Default ticker (used only if query doesn't mention one)", value="AAPL.US")
    default_n_days = st.slider("Default trading days window", min_value=20, max_value=180, value=60)
    
    st.divider()
    
    with st.expander("Optional parameters (force include)"):
        include_fund = st.checkbox("Fundamentals (PE, PB, etc.)", value=True)
        include_risk = st.checkbox("Risk metrics (volatility, drawdown)", value=False)
        include_news = st.checkbox("Headlines", value=False)
        news_limit = st.slider("Headline count", min_value=3, max_value=10, value=5)
    
    run_btn = st.button("Generate brief", type="primary")

The expander for optional parameters is collapsed by default — it's not something a normal user needs to touch every time. But it's there when a team wants to enforce a consistent format.

Rendering the Metrics Panel

The right-hand side of the app shows the tool-backed numbers in a scannable layout. This is where the “auditable” part of the copilot lives — you can verify every number the brief mentions.

def _render_metrics(artifacts: dict):
    cols = st.columns(3)
    price = artifacts.get("price")
    valuation = artifacts.get("valuation")
    risk = artifacts.get("risk")
    headlines = artifacts.get("headlines")
    
    with cols[0]:
        st.subheader("Price window")
        if isinstance(price, dict) and "error" not in price:
            st.metric("Total return", f"{price.get('total_return', 0.0):.2%}")
            st.caption(f"{price.get('start_date')} to {price.get('end_date')}")
            st.write(pd.DataFrame([price]).T)
        elif isinstance(price, dict) and "error" in price:
            st.warning(price["error"])
        else:
            st.info("No price tool output (not requested or tool not used).")
    
    with cols[1]:
        st.subheader("Fundamentals")
        if isinstance(valuation, dict) and "error" not in valuation:
            df = pd.DataFrame([valuation])
            keep = ["ticker", "name", "sector", "market_cap", "pe", "pb", "beta", "dividend_yield", "profit_margin"]
            keep = [c for c in keep if c in df.columns]
            st.write(df[keep].T)
        elif isinstance(valuation, dict) and "error" in valuation:
            st.warning(valuation["error"])
        else:
            st.info("No fundamentals tool output (not requested or tool not used).")
    
    with cols[2]:
        st.subheader("Risk")
        if isinstance(risk, dict) and "error" not in risk:
            st.metric("Volatility (ann.)", f"{risk.get('volatility_ann', 0.0):.2%}")
            st.metric("Max drawdown", f"{risk.get('max_drawdown', 0.0):.2%}")
            st.caption(f"{risk.get('start_date')} to {risk.get('end_date')}")
            st.write(pd.DataFrame([risk]).T)
        elif isinstance(risk, dict) and "error" in risk:
            st.warning(risk["error"])
        else:
            st.info("No risk tool output (not requested or tool not used).")
    
    st.subheader("Headlines")
    if isinstance(headlines, list) and len(headlines) > 0:
        for h in headlines:
            title = h.get("title", "Untitled")
            link = h.get("link")
            src = h.get("source")
            dt = h.get("date")
            line = f"- {title}"
            if src:
                line += f" ({src})"
            if dt:
                line += f" · {dt}"
            if link:
                st.markdown(f"{line}\n{link}")
            else:
                st.markdown(line)
    else:
        st.info("No headlines tool output (not requested or tool not used).")

Every column has a fallback message for when a tool wasn't called. That way the UI never shows blank sections without explanation.

Connecting It All Together

The main loop is clean. Button click triggers run_query(), outputs get split into two columns.

if run_btn:
    with st.spinner("Running tools and generating brief..."):
        brief_md, artifacts = run_query(
            query=query,
            default_ticker=default_ticker,
            default_n_days=default_n_days,
            force_fundamentals=include_fund,
            force_risk=include_risk,
            force_news=include_news,
            news_limit=news_limit,
        )
    
    left, right = st.columns([1.2, 1])
    
    with left:
        st.subheader("Market brief")
        st.markdown(brief_md)
    
    with right:
        st.subheader("Tool-backed metrics")
        _render_metrics(artifacts)

else:
    st.info("Set inputs on the left and click **Generate brief**.")

That's the entire app. No data logic. No API calls. Just inputs, one function call, and rendering.

Things That Will Break (And How to Handle Them)

Messy Ticker Input

People will type aapl, AAPL, AAPL.US, and everything in between. The normalize_ticker() function handles most of this, but edge cases will still show up. Build a clear error message when the tool returns an error key — don't let the app silently return a blank brief.

Missing Data

Some tickers don't have news coverage in EODHD. Some fundamentals fields will be None for smaller companies. Some price windows will return empty if the ticker is very new or delisted. In these cases the price tools return {"error": "no_price_data"}, fundamentals_snapshot returns {"error": "no_data"}, and latest_news returns an empty list. The UI should surface a warning instead of crashing.
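
Network failures are a related case: a timeout or a non-JSON error page would raise inside a tool rather than produce an error key. One possible guard is sketched below. The name fetch_json_safely and the error label are assumptions, and the HTTP getter is passed in (requests.get in the real app) so the example can run against a stub.

```python
from typing import Any, Callable, Dict


def fetch_json_safely(get: Callable[..., Any], url: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Call an HTTP getter and always return a dict with either 'data' or 'error'."""
    try:
        resp = get(url, params=params, timeout=10)
        resp.raise_for_status()
        return {"data": resp.json()}
    except Exception as exc:  # network failure, bad status, or non-JSON body
        return {"error": f"request_failed: {type(exc).__name__}"}


class StubResponse:
    """Stand-in for a response object, used only for this demonstration."""

    def __init__(self, payload: Any, fail: bool = False) -> None:
        self._payload = payload
        self._fail = fail

    def raise_for_status(self) -> None:
        if self._fail:
            raise RuntimeError("simulated HTTP error")

    def json(self) -> Any:
        return self._payload


def stub_get_ok(url: str, params: Dict[str, Any], timeout: float) -> StubResponse:
    return StubResponse([{"date": "2026-01-23", "close": 248.04}])


def stub_get_fail(url: str, params: Dict[str, Any], timeout: float) -> StubResponse:
    return StubResponse(None, fail=True)


print(fetch_json_safely(stub_get_ok, "https://example.invalid/eod", {}))
print(fetch_json_safely(stub_get_fail, "https://example.invalid/eod", {}))
```

Passing the getter in as an argument is only a convenience for testing; inside copilot.py the same try/except could wrap the requests.get call directly.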

Output Drift

If you let the agent run without a strict prompt, it will eventually start doing extra work — adding sections nobody asked for, pulling tools “just in case,” or adjusting the format based on what it thinks looks nice. The fix is keeping the system prompt strict and the toolset small. Boring, but it works.

Token Costs From Raw Data

The eod_prices tool is the easiest way to accidentally make the app slow and expensive. If the agent starts calling it for every request, you'll notice it quickly. Keep it as a fallback for custom calculation needs only.

Where to Take This Next

Once you have the MVP running, a few extensions fit naturally into this pattern:

Multi-ticker comparison — run the same query for two or three tickers and return a side-by-side summary. The core pattern stays the same; you just run the agent once per ticker and combine the outputs.

Scheduled briefs — set up a daily or weekly job that runs a standard query for a watchlist and pushes the results to Slack or email. The run_query() function is already portable enough to drop into any scheduler.

Caching — cache tool results by (ticker, window) so repeated queries on the same stock don't hit the APIs every time. This makes demos much faster.

API wrapper — once you're happy with the behavior, put run_query() behind a FastAPI endpoint. Streamlit stays as the demo shell, and any other frontend can call the same backend.
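
The caching extension listed above can be sketched with a small in-memory cache. Everything here is hypothetical scaffolding: the TTLCache class, the 15-minute default lifetime, and the fake fetch function standing in for a real tool call.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


class TTLCache:
    """Tiny in-memory cache keyed by (ticker, window) with a time-to-live."""

    def __init__(self, ttl_seconds: float = 900.0) -> None:
        self.ttl = ttl_seconds
        self._store: Dict[Tuple[str, int], Tuple[float, Any]] = {}

    def get_or_fetch(self, ticker: str, window: int, fetch: Callable[[str, int], Any]) -> Any:
        key = (ticker, window)
        now = time.monotonic()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh, skip the API call
        value = fetch(ticker, window)
        self._store[key] = (now, value)
        return value


calls: List[Tuple[str, int]] = []


def fake_fetch(ticker: str, window: int) -> Dict[str, Any]:
    # Stand-in for a real tool call; records each invocation.
    calls.append((ticker, window))
    return {"ticker": ticker, "n": window}


cache = TTLCache(ttl_seconds=60.0)
cache.get_or_fetch("AAPL.US", 60, fake_fetch)
cache.get_or_fetch("AAPL.US", 60, fake_fetch)  # served from cache
cache.get_or_fetch("AAPL.US", 90, fake_fetch)  # different window, new fetch
print(len(calls))  # 2
```

A cache like this lives only as long as the process. For a deployed app you would likely want a shared store and a lifetime short enough that end-of-day prices do not go stale.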

What This Project Actually Teaches You

The market brief is almost a side effect. The real skill you're building here is knowing how to design a system where an AI model can only say true things.

That's not a small thing. Most AI-powered tools fail in production not because the model is bad, but because there's no guardrail preventing it from making things up. The pattern in this project — narrow tools, strict prompts, structured outputs, auditable artifacts — is the same pattern that separates toy demos from tools people actually trust.

If you take anything from this project, take that: make the model earn every fact it uses. Don't let it guess. Give it tools, constrain it to those tools, and let the data do the talking.

The brief is just the readable layer on top.

Built with LangChain, LangGraph, EODHD, OpenAI, and Streamlit. Tools are the real product.
