{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Heroku Managed Inference and Agents (Mia)\n",
    "\n",
    "## Introduction\n",
    "\n",
    "Welcome to this hands-on workshop using [Heroku AI](https://www.heroku.com/ai), where you’ll explore Managed Inference, Heroku Tools (Agents), and the Model Context Protocol (MCP). Through this interactive Jupyter Notebook, you’ll learn how to deploy and interact with AI models using Heroku’s platform and open standards.\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "Before getting started, make sure you have the following:\n",
    "\n",
    "- A Heroku account  \n",
    "  → Sign up at [https://signup.heroku.com/events](https://signup.heroku.com/events)  \n",
    "- A working Jupyter Notebook environment  \n",
    "  → We’re using [heroku-reference-apps/heroku-jupyter](https://github.com/heroku-reference-apps/heroku-jupyter) for this workshop  \n",
    "- The Managed Inference and Agents add-on attached to your Heroku Jupyter app  \n",
    "  → You can attach it from the Heroku Dashboard  \n",
    "- Basic familiarity with Python and AI concepts\n",
    "\n",
    "## Workshop Agenda\n",
    "\n",
    "1. **Setup**  \n",
    "   Get your environment ready for AI and Agents on Heroku.\n",
    "\n",
    "2. **Managed Inference**  \n",
    "   Learn how to provision and call AI models using Heroku’s Managed Inference.\n",
    "\n",
    "3. **Heroku Tools (Agents)**  \n",
    "   Enable agents to take action inside your Heroku apps with simple configurations.\n",
    "\n",
    "4. **Model Context Protocol (MCP)**  \n",
    "   Use MCP with Managed Inference and Agents to power your applications with context aware agents, and deploy your own MCP to Heroku.\n",
    "\n",
    "---\n",
    "\n",
    "## 1. Setup: Enabling AI and Agents on Heroku\n",
    "\n",
    "To get started, attach the **Managed Inference and Agents** add-on to your Heroku app (in this case, the Jupyter app). This will automatically inject the following environment variables:\n",
    "\n",
    "- `INFERENCE_URL` – the endpoint for your provisioned model  \n",
    "- `INFERENCE_KEY` – the API key to authenticate requests  \n",
    "- `INFERENCE_MODEL_ID` – the identifier of the selected model  \n",
    "\n",
    "To enable Heroku Tools (Agents), you'll also need to set:\n",
    "\n",
    "- `TARGET_APP_NAME` – the name of the Heroku app the agents will interact with  \n",
    "  → For this workshop, we’ll use the same Jupyter app\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Setup the environment variables and imports:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import os\n",
    "import requests\n",
    "from IPython.display import display, Markdown, clear_output\n",
    "\n",
    "# Verify the environment variables are set\n",
    "required_vars = ['INFERENCE_URL', 'INFERENCE_KEY', 'INFERENCE_MODEL_ID']\n",
    "missing_vars = []\n",
    "\n",
    "for var in required_vars:\n",
    "    if var not in os.environ:\n",
    "        missing_vars.append(var)\n",
    "\n",
    "if missing_vars:\n",
    "    raise ValueError(f\"Missing required environment variables: {', '.join(missing_vars)}\")\n",
    "\n",
    "INFERENCE_URL = os.environ['INFERENCE_URL']\n",
    "INFERENCE_KEY = os.environ['INFERENCE_KEY']\n",
    "INFERENCE_MODEL_ID = os.environ['INFERENCE_MODEL_ID']\n",
    "TARGET_APP_NAME = os.environ.get('TARGET_APP_NAME', 'ai-engineer-workshop')\n",
    "\n",
    "print(\"✅ Environment variables loaded successfully!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## 2. Managed Inference with Heroku AI\n",
    "\n",
    "### What is Managed Inference?\n",
    "\n",
    "Heroku Managed Inference and Agents (MIA) simplifies the deployment and use of powerful AI models. Instead of managing infrastructure, you get access to state-of-the-art models as a fully managed service—directly from your Heroku apps.\n",
    "\n",
    "Key benefits include:\n",
    "\n",
    "- **No server management**  \n",
    "  Heroku handles scaling, updates, and availability.\n",
    "  \n",
    "- **Simple API access**  \n",
    "  Use environment variables to make authenticated requests to your model.\n",
    "  \n",
    "- **Faster innovation**  \n",
    "  Focus on building smart features—no need to worry about GPUs or orchestration.\n",
    "\n",
    "### Calling a Chat Completion Model ([`/v1/chat/completions`](https://devcenter.heroku.com/articles/heroku-inference-api-v1-chat-completions))\n",
    "\n",
    "The `/v1/chat/completions` endpoint generates conversational completions for a provided set of input messages. You can specify the model, adjust generation settings such as `temperature`, and opt to stream the responses in real time. You can also specify `tools` the model can choose to call.\n",
    "\n",
    "Now let’s make a basic API call to a chat model you’ve provisioned with Heroku MIA. We’ll prompt the model to generate a response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Simple Chat Completion\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"Explain the concept of 'Managed Inference' in one sentence.\"}\n",
    "    ],\n",
    "    \"temperature\": 0.5,\n",
    "    \"max_tokens\": 100\n",
    "}\n",
    "\n",
    "response = None\n",
    "try:\n",
    "    response = requests.post(f\"{INFERENCE_URL}/v1/chat/completions\", headers=headers, json=payload)\n",
    "    response.raise_for_status()\n",
    "    result = response.json()\n",
    "\n",
    "    print(\"📋 Model Response:\")\n",
    "    print(json.dumps(result, indent=2))\n",
    "    \n",
    "    ai_content = result['choices'][0]['message']['content']\n",
    "    display(Markdown(f\"#### 🧠 AI Response\\n\\n{ai_content}\"))\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response is not None:\n",
    "        print(f\"Status: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "except (KeyError, IndexError) as e:\n",
    "    print(f\"❌ Error parsing response: {e}\")\n",
    "    if 'result' in locals():\n",
    "        print(f\"Response structure: {json.dumps(result, indent=2)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's make the request to the chat completion model using a streaming response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Streaming Chat Completion (RAW)\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"Why streaming is important in the context of Generative AI?\"}\n",
    "    ],\n",
    "    \"temperature\": 0.5,\n",
    "    \"stream\": True\n",
    "}\n",
    "\n",
    "response = None\n",
    "try:\n",
    "    response = requests.post(f\"{INFERENCE_URL}/v1/chat/completions\", headers=headers, json=payload, stream=True)\n",
    "    response.raise_for_status()\n",
    "\n",
    "    print(\"🌊 Streaming Model Response (RAW):\")\n",
    "    print(\"-\" * 50)\n",
    "    \n",
    "    for chunk in response.iter_content(chunk_size=1024):\n",
    "        if chunk:\n",
    "            print(chunk.decode('utf-8'), end='', flush=True)\n",
    "    \n",
    "    print(\"\\n\" + \"-\" * 50)\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response is not None:\n",
    "        print(f\"Status: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "except UnicodeDecodeError as e:\n",
    "    print(f\"❌ Error decoding response: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's make the same streaming call but using a parsed response."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Streaming Chat Completion (Parsed)\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"Why streaming is important in the context of Generative AI?\"}\n",
    "    ],\n",
    "    \"stream\": True\n",
    "}\n",
    "\n",
    "stream_response = \"\"\n",
    "response = None\n",
    "\n",
    "try: \n",
    "    response = requests.post(f\"{INFERENCE_URL}/v1/chat/completions\", headers=headers, json=payload, stream=True)\n",
    "    response.raise_for_status()\n",
    "    \n",
    "    print(\"🌊 Streaming Model Response:\")\n",
    "    print(\"-\" * 50)\n",
    "    \n",
    "    for line in response.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        decoded_line = line.decode('utf-8').lstrip('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\":\n",
    "            continue\n",
    "\n",
    "        try:\n",
    "            data = json.loads(decoded_line)\n",
    "            delta = data.get(\"choices\", [{}])[0].get(\"delta\", {}).get(\"content\", \"\")\n",
    "            if delta:\n",
    "                print(delta, end='', flush=True)\n",
    "                stream_response += delta\n",
    "                \n",
    "        except json.JSONDecodeError:\n",
    "            print(f\"\\n⚠️ Error decoding: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response is not None:\n",
    "        print(f\"Status: {response.status_code}\")\n",
    "        print(f\"Response: {response.text}\")\n",
    "\n",
    "print(\"\\n\" + \"-\" * 50)\n",
    "clear_output(wait=True)\n",
    "display(Markdown(f\"#### 🧠 AI Response\\n\\n{stream_response}\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## 3. Heroku Tools (Agents)\n",
    "\n",
    "Large Language Models (LLMs) are great at understanding and generating human language—but by themselves, they can't interact with external systems or perform real-world tasks.\n",
    "\n",
    "This is where **[Heroku Tools](https://devcenter.heroku.com/articles/heroku-inference-tools)** come in. These tools enable LLMs to become **actionable agents** that can execute commands, fetch data, and transform content inside your Heroku application environment.\n",
    "\n",
    "### What Heroku Tools Can Do\n",
    "\n",
    "- **🔧 Execute Commands**  \n",
    "  Use `dyno_run_command` to run shell commands directly on your Heroku dynos.\n",
    "\n",
    "- **🗃️ Query Databases**  \n",
    "  Access Heroku Postgres using `postgres_get_schema` and `postgres_run_query` tools to inspect and query your databases.\n",
    "\n",
    "- **📝 Transform Documents**  \n",
    "  Convert HTML or PDF files into Markdown using `html_to_markdown` and `pdf_to_markdown`.\n",
    "\n",
    "- **💻 Run Code Securely**  \n",
    "  Execute LLM-generated code safely in a sandboxed environment using `code_exec_python`, `code_exec_node`, `code_exec_ruby`, and `code_exec_go`.\n",
    "\n",
    "This transforms a static LLM into a **dynamic, task-driven agent** that can reason and act inside your full stack—without leaving the Heroku ecosystem.\n",
    "\n",
    "### Calling the Agents Endpoint ([`/v1/agents/heroku`](https://devcenter.heroku.com/articles/heroku-inference-api-v1-agents-heroku))\n",
    "\n",
    "The `/v1/agents/heroku` endpoint allows you to interact with an agentic system powered by large language models (LLMs) that can autonomously invoke tools based on your messages. Unlike `/v1/chat/completions`, which generates a single model response, the `/v1/agents/heroku` endpoint supports automatic tool execution and multistep workflows.\n",
    "\n",
 💡 All the agent">
    "> 💡 Responses from the agents endpoint are always streamed."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Demo: Running a Command on a Heroku Dyno\n",
    "\n",
    "Let’s demonstrate how an agent can use the `dyno_run_command` tool to get the current date and time from your Heroku application's dyno.\n",
    "\n",
 🛠️ **Pre-requisite**">
    "> 🛠️ **Prerequisite**:  \n",
    "> Ensure your Heroku app (`TARGET_APP_NAME`) has the **Managed Inference and Agents** add-on attached."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Agent using dyno_run_command to get current time\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload_tool = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"What is the current date and time on the server?\"}\n",
    "    ],\n",
    "    \"tools\": [\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": \"dyno_run_command\",\n",
    "            \"runtime_params\": {\n",
    "                \"target_app_name\": TARGET_APP_NAME,\n",
    "                \"tool_params\": {\n",
    "                    \"cmd\": \"date\",\n",
    "                    \"description\": \"Runs the 'date' command on a one-off dyno to get the current date and time.\",\n",
    "                    \"parameters\": {\"type\": \"object\", \"properties\": {}, \"required\": []}\n",
    "                }\n",
    "            }\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response_tool = None\n",
    "try:\n",
    "    response_tool = requests.post(\n",
    "        f\"{INFERENCE_URL}/v1/agents/heroku\", \n",
    "        headers=headers, \n",
    "        json=payload_tool,\n",
    "        stream=True\n",
    "    )\n",
    "    response_tool.raise_for_status()\n",
    "    \n",
    "    print(\"🔧 Agent Tool Execution:\")\n",
    "    print(f\"Calling Heroku agent endpoint for app: {TARGET_APP_NAME}\")\n",
    "    print(\"-\" * 60)\n",
    "    \n",
    "    for line in response_tool.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        decoded_line = line.decode('utf-8').lstrip('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\" or not decoded_line:\n",
    "            continue\n",
    "        \n",
    "        try:\n",
    "            chunk = json.loads(decoded_line)\n",
    "\n",
    "            if chunk.get('type') == 'server_error':\n",
    "                print(f\"❌ Server Error: {chunk.get('message', 'Unknown error')}\")\n",
    "                break\n",
    "            \n",
    "            choices = chunk.get('choices', [])\n",
    "            if not choices:\n",
    "                continue\n",
    "                \n",
    "            delta = choices[0].get('message', {})\n",
    "            \n",
    "            # Handle tool calls\n",
    "            tool_calls = delta.get('tool_calls', [])\n",
    "            for tool_call in tool_calls:\n",
    "                function_info = tool_call.get('function', {})\n",
    "                tool_name = function_info.get('name', 'Unknown')\n",
    "                print(f\"🛠️ Executing tool: {tool_name}\")\n",
    "            \n",
    "            # Handle content\n",
    "            content = delta.get('content')\n",
    "            if content:\n",
    "                display(Markdown(f\"#### 🤖 Agent Response\\n\\n{content}\"))\n",
    "                \n",
    "        except json.JSONDecodeError as e:\n",
    "            print(f\"⚠️ JSON decode error: {e}\")\n",
    "            print(f\"Raw line: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response_tool is not None:\n",
    "        print(f\"Status: {response_tool.status_code}\")\n",
    "        print(f\"Response: {response_tool.text}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Demo: Code Execution\n",
    "\n",
    "Let’s see how an agent can use the `code_exec_node` tool to run a simple script that calculates the 13th Fibonacci number.\n",
    "\n",
    "> 💡 You can also try this with `code_exec_python`, `code_exec_ruby`, or `code_exec_go` to see how the same logic runs across different languages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Agent using code_exec_* to calculate the 13th Fibonacci number\n",
    "\n",
    "language = \"node\"  # Exercise: Change to python, ruby, or go\n",
    "\n",
    "def highlight_language(language):\n",
    "    return \"javascript\" if language == \"node\" else language\n",
    "\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload_tool = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"What is the 13th Fibonacci number?\"} # Exercise: Change the prompt to a different operation\n",
    "    ],\n",
    "    \"tools\": [\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": f\"code_exec_{language}\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response_tool = None\n",
    "try:\n",
    "    response_tool = requests.post(\n",
    "        f\"{INFERENCE_URL}/v1/agents/heroku\", \n",
    "        headers=headers, \n",
    "        json=payload_tool,\n",
    "        stream=True\n",
    "    )\n",
    "    response_tool.raise_for_status()\n",
    "    \n",
    "    print(\"💻 Agent Code Execution:\")\n",
    "    print(f\"Language: {language}\")\n",
    "    print(\"-\" * 50)\n",
    "    \n",
    "    for line in response_tool.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        decoded_line = line.decode('utf-8').lstrip('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\" or not decoded_line:\n",
    "            continue\n",
    "        \n",
    "        try:\n",
    "            chunk = json.loads(decoded_line)\n",
    "\n",
    "            if chunk.get('type') == 'server_error':\n",
    "                print(f\"❌ Server Error: {chunk.get('message', 'Unknown error')}\")\n",
    "                break\n",
    "            \n",
    "            choices = chunk.get('choices', [])\n",
    "            if not choices:\n",
    "                continue\n",
    "                \n",
    "            delta = choices[0].get('message', {})\n",
    "            \n",
    "            # Handle tool calls\n",
    "            tool_calls = delta.get('tool_calls', [])\n",
    "            for tool_call in tool_calls:\n",
    "                function_info = tool_call.get('function', {})\n",
    "                tool_name = function_info.get('name', 'Unknown')\n",
    "                \n",
    "                try:\n",
    "                    arguments = json.loads(function_info.get('arguments', '{}'))\n",
    "                    code = arguments.get('code', '')\n",
    "                    \n",
    "                    print(f\"🛠️ Tool: {tool_name}\")\n",
    "                    print(f\"📝 Arguments: {json.dumps(arguments, indent=2)}\")\n",
    "                    \n",
    "                    if code:\n",
    "                        display(Markdown(f\"```{highlight_language(language)}\\n{code}\\n```\"))\n",
    "                        \n",
    "                except json.JSONDecodeError as e:\n",
    "                    print(f\"⚠️ Error parsing tool arguments: {e}\")\n",
    "            \n",
    "            # Handle content\n",
    "            content = delta.get('content')\n",
    "            if content:\n",
    "                display(Markdown(f\"#### 🤖 Agent Response\\n\\n{content}\"))\n",
    "                \n",
    "        except json.JSONDecodeError as e:\n",
    "            print(f\"⚠️ JSON decode error: {e}\")\n",
    "            print(f\"Raw line: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response_tool is not None:\n",
    "        print(f\"Status: {response_tool.status_code}\")\n",
    "        print(f\"Response: {response_tool.text}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Demo: Tool Chaining\n",
    "\n",
    "Let's see how an agent can use the `html_to_markdown` to fetch a code snippet from Wikipedia and then use the `code_exec_python` tool to execute the code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Tool Chaining with html_to_markdown and code_exec_python\n",
    "\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload_tool = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"Use the Python snippet from the Wikipedia page for Euclidean algorithm to calculate the greatest common divisor (GCD) of 252 and 105.\"}\n",
    "    ],\n",
    "    \"tools\": [\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": \"html_to_markdown\"\n",
    "        },\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": \"code_exec_python\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response_tool = None\n",
    "try:\n",
    "    response_tool = requests.post(\n",
    "        f\"{INFERENCE_URL}/v1/agents/heroku\", \n",
    "        headers=headers, \n",
    "        json=payload_tool,\n",
    "        stream=True\n",
    "    )\n",
    "    response_tool.raise_for_status()\n",
    "    \n",
    "    print(\"🔗 Agent Tool Chaining:\")\n",
    "    print(\"Tools: html_to_markdown → code_exec_python\")\n",
    "    print(\"-\" * 60)\n",
    "    \n",
    "    for line in response_tool.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        decoded_line = line.decode('utf-8').lstrip('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\" or not decoded_line:\n",
    "            continue\n",
    "        \n",
    "        try:\n",
    "            chunk = json.loads(decoded_line)\n",
    "\n",
    "            if chunk.get('type') == 'server_error':\n",
    "                print(f\"❌ Server Error: {chunk.get('message', 'Unknown error')}\")\n",
    "                break\n",
    "            \n",
    "            choices = chunk.get('choices', [])\n",
    "            if not choices:\n",
    "                continue\n",
    "                \n",
    "            delta = choices[0].get('message', {})\n",
    "            \n",
    "            # Handle tool calls\n",
    "            tool_calls = delta.get('tool_calls', [])\n",
    "            for tool_call in tool_calls:\n",
    "                function_info = tool_call.get('function', {})\n",
    "                function_name = function_info.get('name', '')\n",
    "                \n",
    "                try:\n",
    "                    arguments = json.loads(function_info.get('arguments', '{}'))\n",
    "                    \n",
    "                    print(f\"🛠️ Tool: {function_name}\")\n",
    "                    print(f\"📝 Arguments: {json.dumps(arguments, indent=2)}\")\n",
    "                    \n",
    "                    if function_name == \"code_exec_python\":\n",
    "                        code = arguments.get('code', '')\n",
    "                        if code:\n",
    "                            display(Markdown(f\"```python\\n{code}\\n```\"))\n",
    "                            \n",
    "                except json.JSONDecodeError as e:\n",
    "                    print(f\"⚠️ Error parsing tool arguments: {e}\")\n",
    "\n",
    "            # Handle content\n",
    "            content = delta.get('content')\n",
    "            if content:\n",
    "                # Truncate long html_to_markdown responses\n",
    "                if content.startswith(\"Tool 'html_to_markdown' returned result:\"):\n",
    "                    if len(content) > 500:\n",
    "                        truncated_content = content[:500] + \"...\"\n",
    "                        print(f\"📄 {truncated_content}\")\n",
    "                        continue\n",
    "                \n",
    "                display(Markdown(f\"#### 🤖 Agent Response\\n\\n{content}\"))\n",
    "                \n",
    "        except json.JSONDecodeError as e:\n",
    "            print(f\"⚠️ JSON decode error: {e}\")\n",
    "            print(f\"Raw line: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response_tool is not None:\n",
    "        print(f\"Status: {response_tool.status_code}\")\n",
    "        print(f\"Response: {response_tool.text}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Demo: Tool Chaining with postgres_get_schema and postgres_run_query\n",
    "\n",
    "Let's see how an agent can use the `postgres_get_schema` and `postgres_run_query` tools to get the schema of a Heroku Postgres database and run a query."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Tool Chaining with postgres_get_schema and postgres_run_query\n",
    "\n",
    "TARGET_DATABASE_ATTACHMENT = \"HEROKU_POSTGRESQL_AQUA_URL\" # Change to the one attached to your app\n",
    "\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload_tool = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"How much energy has been saved in the last 30 days?\"}\n",
    "    ],\n",
    "    \"tools\": [\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": \"postgres_get_schema\",\n",
    "            \"runtime_params\": {\n",
    "                \"target_app_name\": TARGET_APP_NAME,\n",
    "                \"tool_params\": {\n",
    "                    \"db_attachment\": TARGET_DATABASE_ATTACHMENT\n",
    "                }\n",
    "            }\n",
    "        },\n",
    "        {\n",
    "            \"type\": \"heroku_tool\",\n",
    "            \"name\": \"postgres_run_query\",\n",
    "            \"runtime_params\": {\n",
    "                \"target_app_name\": TARGET_APP_NAME,\n",
    "                \"tool_params\": {\n",
    "                    \"db_attachment\": TARGET_DATABASE_ATTACHMENT\n",
    "                }\n",
    "            }\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response_tool = None\n",
    "try:\n",
    "    response_tool = requests.post(\n",
    "        f\"{INFERENCE_URL}/v1/agents/heroku\", \n",
    "        headers=headers, \n",
    "        json=payload_tool,\n",
    "        stream=True\n",
    "    )\n",
    "    response_tool.raise_for_status()\n",
    "    \n",
    "    print(\"🔗 Agent Tool Chaining:\")\n",
    "    print(\"Tools: postgres_get_schema → postgres_run_query\")\n",
    "    print(\"-\" * 60)\n",
    "    \n",
    "    for line in response_tool.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        decoded_line = line.decode('utf-8').lstrip('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\" or not decoded_line:\n",
    "            continue\n",
    "        \n",
    "        try:\n",
    "            chunk = json.loads(decoded_line)\n",
    "\n",
    "            if chunk.get('type') == 'server_error':\n",
    "                print(f\"❌ Server Error: {chunk.get('message', 'Unknown error')}\")\n",
    "                break\n",
    "            \n",
    "            choices = chunk.get('choices', [])\n",
    "            if not choices:\n",
    "                continue\n",
    "                \n",
    "            delta = choices[0].get('message', {})\n",
    "            \n",
    "            # Handle tool calls\n",
    "            tool_calls = delta.get('tool_calls', [])\n",
    "            for tool_call in tool_calls:\n",
    "                function_info = tool_call.get('function', {})\n",
    "                function_name = function_info.get('name', '')\n",
    "                \n",
    "                try:\n",
    "                    arguments = json.loads(function_info.get('arguments', '{}'))\n",
    "                    \n",
    "                    print(f\"🛠️ Tool: {function_name}\")\n",
    "                    print(f\"📝 Arguments: {json.dumps(arguments, indent=2)}\")\n",
    "                    \n",
    "               \n",
    "                            \n",
    "                except json.JSONDecodeError as e:\n",
    "                    print(f\"⚠️ Error parsing tool arguments: {e}\")\n",
    "\n",
    "            # Handle content\n",
    "            content = delta.get('content')\n",
    "            if content:\n",
    "                display(Markdown(f\"#### 🤖 Agent Response\\n\\n{content}\"))\n",
    "                \n",
    "        except json.JSONDecodeError as e:\n",
    "            print(f\"⚠️ JSON decode error: {e}\")\n",
    "            print(f\"Raw line: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response_tool is not None:\n",
    "        print(f\"Status: {response_tool.status_code}\")\n",
    "        print(f\"Response: {response_tool.text}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Excercise: Create an example using the `pdf_to_markdown` and `code_exec_*` tools\n",
    "\n",
    "Let's create an example using the `pdf_to_markdown` and `code_exec_*` tools. \n",
    "\n",
    "Make sure in the prompt to pass the URL of the PDF file you want to convert."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Implementation here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## 4. Model Context Protocol (MCP)\n",
    "\n",
    "The Model Context Protocol (MCP) is a standard for enabling agentic behaviors and tool interoperability. It allows LLMs to understand and interact with tools in a structured way, making it easier to build complex, multi-step workflows.\n",
    "\n",
    "### Use MCP with Heroku Managed Inference and Agents\n",
    "\n",
    "You'll need an MCP server deployed to Heroku. For this workshop, you can use the `brave-search-mcp` app available in the Heroku team.\n",
    "\n",
    "1. Go to the Managed Inference and Agents add-on setup page and click **Manage MCP Servers**.\n",
    "  ![Managed Inference and Agents Setup](https://workshops-content.ukoreh.com/notebooks/mia-workshop/imgs/managed_inference_setup.png)\n",
    "\n",
    "1. Under **Server Registration**, click **Attach another app** and select the `brave-search-mcp` app.\n",
    "  ![Attach MCP Server](https://workshops-content.ukoreh.com/notebooks/mia-workshop/imgs/setup_mcp.png)\n",
    "\n",
    "1. Refresh the page. You should see the `brave-search-mcp` app in the list of MCP servers.\n",
    "  ![MCP Server Registered](https://workshops-content.ukoreh.com/notebooks/mia-workshop/imgs/mcp_registered.png)\n",
    "\n",
    "Now you can use the MCP server with the Agents endpoint. Let's see an example.\n",
    "\n",
    "### Demo: Agent using an MCP deployed to Heroku"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Agent using an MCP deployed to Heroku\n",
    "headers = {\n",
    "    \"Authorization\": f\"Bearer {INFERENCE_KEY}\",\n",
    "    \"Content-Type\": \"application/json\"\n",
    "}\n",
    "\n",
    "payload_tool = {\n",
    "    \"model\": INFERENCE_MODEL_ID,\n",
    "    \"messages\": [\n",
    "        {\"role\": \"user\", \"content\": \"What is the most recent news about AI Agents?\"}\n",
    "    ],\n",
    "    \"tools\": [\n",
    "        {\n",
    "            \"type\": \"mcp\",\n",
    "            \"name\": \"mcp-brave/brave_web_search\"\n",
    "        }\n",
    "    ]\n",
    "}\n",
    "\n",
    "response_tool = None\n",
    "try:\n",
    "    response_tool = requests.post(\n",
    "        f\"{INFERENCE_URL}/v1/agents/heroku\", \n",
    "        headers=headers, \n",
    "        json=payload_tool,\n",
    "        stream=True\n",
    "    )\n",
    "    response_tool.raise_for_status()\n",
    "    \n",
    "    print(\"🔧 MCP Tool Execution:\")\n",
    "    print(\"-\" * 60)\n",
    "    \n",
    "    for line in response_tool.iter_lines():\n",
    "        if not line:\n",
    "            continue\n",
    "            \n",
    "        # removeprefix drops the exact 'data:' prefix; lstrip('data:') would strip any leading d, a, t, or : characters\n",
    "        decoded_line = line.decode('utf-8').removeprefix('data:').strip()\n",
    "        \n",
    "        if decoded_line.startswith('event') or decoded_line == \"[DONE]\" or not decoded_line:\n",
    "            continue\n",
    "        \n",
    "        try:\n",
    "            chunk = json.loads(decoded_line)\n",
    "\n",
    "            if chunk.get('type') == 'server_error':\n",
    "                print(f\"❌ Server Error: {chunk.get('message', 'Unknown error')}\")\n",
    "                break\n",
    "            \n",
    "            choices = chunk.get('choices', [])\n",
    "            if not choices:\n",
    "                continue\n",
    "                \n",
    "            delta = choices[0].get('message', {})\n",
    "            \n",
    "            # Handle tool calls\n",
    "            tool_calls = delta.get('tool_calls', [])\n",
    "            for tool_call in tool_calls:\n",
    "                function_info = tool_call.get('function', {})\n",
    "                tool_name = function_info.get('name', 'Unknown')\n",
    "                print(f\"🛠️ Executing MCP tool: {tool_name}\")\n",
    "            \n",
    "            # Handle content\n",
    "            content = delta.get('content')\n",
    "            if content:\n",
    "                display(Markdown(f\"#### 🤖 Agent Response\\n\\n{content}\"))\n",
    "                \n",
    "        except json.JSONDecodeError as e:\n",
    "            print(f\"⚠️ JSON decode error: {e}\")\n",
    "            print(f\"Raw line: {decoded_line}\")\n",
    "\n",
    "except requests.exceptions.RequestException as e:\n",
    "    print(f\"❌ Error making API call: {e}\")\n",
    "    if response_tool is not None:\n",
    "        print(f\"Status: {response_tool.status_code}\")\n",
    "        print(f\"Response: {response_tool.text}\")"
   ]
  },
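  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The streaming loop in the demo above decodes and filters each server-sent event line inline. As a minimal sketch, the same steps can be pulled into a standalone helper. The name `parse_sse_line` is hypothetical (it is not part of any Heroku or `requests` API), and the sketch assumes each payload line carries a `data:` prefix followed by JSON, as the loop above expects. `str.removeprefix` requires Python 3.9 or later.\n",
    "\n",
    "```python\n",
    "import json\n",
    "\n",
    "def parse_sse_line(raw_line):\n",
    "    # Hypothetical helper: turn one raw line from iter_lines() into a dict, or None.\n",
    "    text = raw_line.decode('utf-8').strip()\n",
    "    # Blank keep-alive lines and 'event:' lines carry no JSON payload\n",
    "    if not text or text.startswith('event'):\n",
    "        return None\n",
    "    # Drop the exact 'data:' prefix, then surrounding whitespace\n",
    "    text = text.removeprefix('data:').strip()\n",
    "    # '[DONE]' marks the end of the stream\n",
    "    if not text or text == '[DONE]':\n",
    "        return None\n",
    "    try:\n",
    "        return json.loads(text)\n",
    "    except json.JSONDecodeError:\n",
    "        # Unlike the demo loop, this sketch silently skips malformed payloads\n",
    "        return None\n",
    "\n",
    "# Build a sample line the way the stream would deliver it\n",
    "sample = ('data: ' + json.dumps({'choices': []})).encode('utf-8')\n",
    "print(parse_sse_line(sample))\n",
    "```\n",
    "\n",
    "Inside the loop, `chunk = parse_sse_line(line)` followed by `if chunk is None: continue` would stand in for the decode-and-filter steps."
   ]
  },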
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Demo: Using the MCP Toolkits with a simple client\n",
    "\n",
    "All the MCPs you have deployed are available through the MCP Toolkits, an MCP gateway that lets you use your MCPs securely from outside Heroku.\n",
    "\n",
    "Let's see an example of how to use the MCP Toolkits to call an MCP deployed to Heroku.\n",
    "\n",
    "> 💡 You can find the MCP Toolkits in the Managed Inference and Agents add-on setup page, under the **Toolkit Integration** section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install mcp"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from contextlib import asynccontextmanager\n",
    "\n",
    "from mcp import ClientSession\n",
    "from mcp.client.sse import sse_client\n",
    "\n",
    "# Simple MCP client: yields an initialized session and keeps the SSE\n",
    "# connection open until the caller's `async with` block exits.\n",
    "# (Returning the session from inside the context managers would close it first.)\n",
    "@asynccontextmanager\n",
    "async def create_mcp_client(endpoint_url, auth_token):\n",
    "    headers = {'Authorization': f'Bearer {auth_token}'}\n",
    "\n",
    "    # Use the SSE client context manager to get both read and write streams\n",
    "    async with sse_client(url=endpoint_url, headers=headers) as (read_stream, write_stream):\n",
    "        async with ClientSession(read_stream, write_stream) as session:\n",
    "            await session.initialize()\n",
    "            print(\"✅ MCP Client connected!\")\n",
    "            yield session\n",
    "\n",
    "# Usage example\n",
    "async def demo():\n",
    "    endpoint = \"https://us.inference.heroku.com/mcp/sse\"\n",
    "    token = INFERENCE_KEY\n",
    "\n",
    "    async with create_mcp_client(endpoint, token) as session:\n",
    "        # List available tools\n",
    "        tools = await session.list_tools()\n",
    "        print(f\"🛠️ Available tools: {[tool.name for tool in tools.tools]}\")\n",
    "\n",
    "        # Call a tool (example)\n",
    "        result = await session.call_tool(\"mcp-brave/brave_web_search\", {\"query\": \"What is Heroku AI?\"})\n",
    "        print(f\"📋 Result: {result}\")\n",
    "\n",
    "# Run in Jupyter\n",
    "await demo()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Deploy an MCP to Heroku\n",
    "\n",
    "Deploy an MCP to Heroku and register it with your Managed Inference and Agents add-on, then modify the previous example to use your MCP.\n",
    "\n",
    "> 💡 You can deploy from the Heroku CLI or from the Heroku Dashboard. For a stdio MCP, make sure the `Procfile` defines an `mcp` process type. Take a look at [perplexity-ask-mcp](https://github.com/julianduque/perplexity-ask-mcp) for an example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Implementation here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Exercise: Use the deployed MCP with the /v1/agents/heroku endpoint\n",
    "\n",
    "Use the MCP you deployed in the previous exercise as a tool in a request to the `/v1/agents/heroku` endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Implementation here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "## BONUS\n",
    "\n",
    "### Using the OpenAI SDK to call the Heroku Inference API\n",
    "\n",
    "The `/v1/chat/completions` endpoint is 95% compatible with OpenAI's API, and we are working to make it 99% compatible. That is already enough to call the Heroku Inference API with the OpenAI SDK."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example: Using the OpenAI SDK to call the Heroku Inference API\n",
    "from openai import OpenAI\n",
    "\n",
    "client = OpenAI(\n",
    "    api_key=INFERENCE_KEY,\n",
    "    base_url=f\"{INFERENCE_URL}/v1\"\n",
    ")\n",
    "\n",
    "response = client.chat.completions.create(\n",
    "    model=INFERENCE_MODEL_ID,\n",
    "    messages=[{\"role\": \"user\", \"content\": \"What is Managed Inference?\"}],\n",
    ")\n",
    "\n",
    "print(response)\n",
    "display(Markdown(f\"#### 🧠 AI Response\\n\\n{response.choices[0].message.content}\"))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "And that's it! You have learned the fundamentals of Heroku Managed Inference and Agents and its MCP support."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
