# Migrate from Helicone to Langfuse
Helicone has moved into maintenance mode following its acquisition by Mintlify. This guide covers how to migrate your prompt management and observability/tracing setup from Helicone to Langfuse.
## Migrating Prompt Management
Helicone offers two approaches to prompt management: a gateway approach, where you pass a `prompt_id` and the gateway compiles the template server-side in a single API call, and an SDK approach, where you fetch and compile prompts client-side via the `@helicone/helpers` package. Langfuse uses an SDK-based model similar to Helicone's SDK approach: prompts are fetched and compiled in your application code, with built-in caching to keep latency low and guarantee availability.
If you used Helicone's gateway approach, the main change is moving prompt resolution into your application code. If you already used Helicone's SDK approach (`HeliconePromptManager` → `getPromptBody`), the migration is more straightforward: you're replacing one SDK fetch-and-compile pattern with another.
### 1. Export Prompts from Helicone
Use the Helicone Prompt API to export your existing prompts:
- List all prompts via `POST /v1/prompt-2025/query` to get prompt IDs.
- List versions for each prompt via `POST /v1/prompt-2025/query/versions`.
- Fetch the full body for each version via `GET /v1/prompt-2025/{promptVersionId}/prompt-body`.
- Record environment assignments via `POST /v1/prompt-2025/query/environment-version` to know which version is deployed where (production, staging, etc.).
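Assuming the standard Helicone API base URL and bearer-token auth, an export loop over these endpoints might look like the following sketch. The request and response field names (`promptId`, `promptVersionId`, `id`) are assumptions to verify against the Helicone API docs before running it.

```python
import os
import requests

HELICONE_API = "https://api.helicone.ai"  # assumed base URL
HEADERS = {"Authorization": f"Bearer {os.environ.get('HELICONE_API_KEY', '')}"}

def environment_map(assignments):
    """Build {environment: prompt_version_id} from environment-version
    records. The record shape used here is an assumption."""
    return {a["environment"]: a["promptVersionId"] for a in assignments}

def export_prompt_versions(prompt_id):
    """Fetch every version body for one prompt (sketch, not run against
    the live API; payload field names are assumptions)."""
    versions = requests.post(
        f"{HELICONE_API}/v1/prompt-2025/query/versions",
        headers=HEADERS,
        json={"promptId": prompt_id},
    ).json()
    return {
        v["id"]: requests.get(
            f"{HELICONE_API}/v1/prompt-2025/{v['id']}/prompt-body",
            headers=HEADERS,
        ).json()
        for v in versions
    }
```

Persist the exported bodies (e.g. as JSON files) so the conversion steps below can run repeatedly without re-hitting the API.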
### 2. Convert Variable Syntax
Helicone uses typed variables (`{{hc:name:type}}`), while Langfuse uses plain `{{variable}}` placeholders compiled at runtime via `.compile()`.
| Helicone | Langfuse |
|---|---|
| `{{hc:customer_name:string}}` | `{{customer_name}}` |
| `{{hc:is_premium:boolean}}` | `{{is_premium}}` |
If you relied on Helicone's type validation, move that logic into your application code before calling `.compile()`.
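The conversion is mechanical enough to script. A minimal sketch using a regular expression, assuming variable names are identifier-like (matching `\w+`):

```python
import re

# Matches Helicone's typed placeholders, e.g. {{hc:customer_name:string}},
# capturing the variable name and discarding the type suffix.
HC_VAR = re.compile(r"\{\{hc:(\w+):\w+\}\}")

def convert_variables(template: str) -> str:
    """Rewrite Helicone {{hc:name:type}} placeholders to Langfuse {{name}}."""
    return HC_VAR.sub(r"{{\1}}", template)

print(convert_variables("Hi {{hc:customer_name:string}}!"))
# → Hi {{customer_name}}!
```

Run this over every message's content string in each exported prompt body.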
### 3. Map Prompt Bodies to Langfuse
Helicone stores the full LLM request shape (model, messages, temperature, tools, etc.) in a single prompt body. In Langfuse, split this into two parts:
- Prompt content (type `chat`): the `messages` array with converted variable syntax. See prompt data model.
- Prompt config (JSON): model parameters (`model`, `temperature`, `max_tokens`) and tool definitions (`tools`, `tool_choice`, `response_format`). See prompt config.
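As a sketch, splitting one exported body into the two Langfuse parts could look like this, assuming the body follows the OpenAI request shape with the key names listed above:

```python
# Keys that move from the Helicone body into the Langfuse prompt config.
CONFIG_KEYS = ("model", "temperature", "max_tokens",
               "tools", "tool_choice", "response_format")

def split_prompt_body(body: dict) -> tuple[list, dict]:
    """Return (prompt content, prompt config): the messages array becomes
    the chat prompt, everything model-related moves into config."""
    messages = body.get("messages", [])
    config = {k: body[k] for k in CONFIG_KEYS if k in body}
    return messages, config
```

Any extra keys Helicone stored that are not in `CONFIG_KEYS` should be reviewed by hand rather than silently dropped.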
### 4. Recreate Prompts in Langfuse
Create prompts in Langfuse via the SDK or API, setting:
- Prompt name: maps to Helicone's `prompt_id`.
- Prompt type: `chat` (since Helicone stores chat messages).
- Labels: map Helicone environments (production/staging) to Langfuse labels. For example, the Helicone version assigned to "production" gets the `production` label in Langfuse.
- Config JSON: include model parameters and tool definitions.
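Putting the pieces together, a per-version migration step might look like the sketch below. Here `client` is a `Langfuse()` instance and `create_prompt` is the SDK method for creating a prompt version; the helper name and argument shapes are illustrative, so double-check them against the current SDK docs.

```python
def migrate_version(client, helicone_prompt_id, messages, config, environments):
    """Create one Langfuse chat prompt version, carrying over Helicone's
    environment assignments as labels (e.g. "production")."""
    client.create_prompt(
        name=helicone_prompt_id,    # prompt name maps to Helicone's prompt_id
        type="chat",
        prompt=messages,            # messages with converted {{variable}} syntax
        labels=list(environments),  # e.g. ["production"]
        config=config,              # model parameters + tool definitions
    )
```

Versions with no environment assignment can be created with an empty label list; they remain addressable by version number.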
### 5. Migrate Prompt Partials and Composition
Helicone prompt partials (`{{hcp:prompt_id:index:environment}}`) pull messages from other prompts. Langfuse offers two alternatives:
- Shared system instructions: create a Langfuse text prompt for the shared snippet and reference it via prompt composability.
- Multi-message fragments: fetch both prompts in code, compile each, and merge message arrays — or use message placeholders to insert messages at specific positions at runtime.
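For the multi-message case, a minimal sketch of the fetch-compile-merge pattern (the prompt names in the usage comment are illustrative):

```python
def merge_compiled(*message_lists):
    """Concatenate compiled message arrays in order, emulating a Helicone
    partial that inlines another chat prompt's messages."""
    merged = []
    for msgs in message_lists:
        merged.extend(msgs)
    return merged

# Usage (sketch, both prompts are chat prompts):
#   header = langfuse.get_prompt("support_header", type="chat").compile()
#   flow   = langfuse.get_prompt("support_flow", type="chat").compile(q=q)
#   messages = merge_compiled(header, flow)
```

If the fragment must land mid-conversation rather than at the start, message placeholders are the better fit.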
### 6. Update Application Code
Replace Helicone’s prompt integration with Langfuse’s fetch + compile flow.
If you used Helicone's gateway approach (`prompt_id` + `inputs` in the API call):
```python
# Before (Helicone gateway)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    prompt_id="customer_support",
    inputs={"customer_name": "Alice", "issue_type": "billing"}
)
```

```python
# After (Langfuse)
from langfuse import Langfuse

langfuse = Langfuse()
prompt = langfuse.get_prompt("customer_support", label="production", type="chat")
compiled_messages = prompt.compile(customer_name="Alice", issue_type="billing")

response = client.chat.completions.create(
    model=prompt.config.get("model", "gpt-4o-mini"),
    messages=compiled_messages
)
```

If you used Helicone's SDK approach (`@helicone/helpers` / `HeliconePromptManager`):
```python
# Before (Helicone SDK)
# body = prompt_manager.get_prompt_body(prompt_id="customer_support", inputs={...})
# response = client.chat.completions.create(**body)
```

```python
# After (Langfuse): same pattern, different SDK
from langfuse import Langfuse

langfuse = Langfuse()
prompt = langfuse.get_prompt("customer_support", label="production", type="chat")
compiled_messages = prompt.compile(customer_name="Alice", issue_type="billing")

response = client.chat.completions.create(
    model=prompt.config.get("model", "gpt-4o-mini"),
    messages=compiled_messages
)
```

See the prompt management get-started guide for full Python and TypeScript examples.
## Migrating Tracing / Observability
Helicone logs LLM requests at the gateway level. Langfuse provides hierarchical traces with nested spans, giving you visibility into multi-step agent workflows — not just individual LLM calls.
### Option A: Use the Langfuse SDK (Recommended)
Langfuse offers a Python and TypeScript SDK that can flexibly wrap any application code, plus native integrations with 80+ frameworks and model providers including OpenAI, LangChain, LlamaIndex, Vercel AI SDK, Anthropic, and many more. You can choose the integration that matches your stack — see the full integrations overview for all options.
The simplest starting point for OpenAI users is the drop-in OpenAI SDK wrapper. Since Helicone is OpenAI-compatible, you can even keep Helicone as a gateway during the transition:
```python
from langfuse.openai import openai

client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://api.openai.com/v1"  # or keep Helicone's URL during transition
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
```

Beyond simple LLM call logging, the Langfuse SDK provides the `@observe()` decorator (Python) and equivalent patterns in TypeScript to trace any function in your application, creating hierarchical traces with nested spans for multi-step agent workflows, tool calls, retrieval steps, and more. This gives you the full trace context that gateway-only logging cannot provide.
To link traced generations to your migrated prompts, pass the prompt object to the generation. See linking prompts to traces.
### Option B: Use OpenTelemetry
If you already have OpenTelemetry instrumentation, you can point your OTLP exporter at Langfuse’s OTLP endpoint:
```bash
export OTEL_EXPORTER_OTLP_ENDPOINT="https://cloud.langfuse.com/api/public/otel"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Basic <base64(public_key:secret_key)>"
```

Key constraints:

- Langfuse supports OTLP over HTTP via both HTTP/JSON and HTTP/protobuf (gRPC is not supported yet).
- See the OpenTelemetry integration docs for full setup details.
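The Basic auth value is just the base64 encoding of `public_key:secret_key`. A small helper to build the header value (note that some OpenTelemetry SDKs require the space after `Basic` to be percent-encoded as `%20` when passed through this environment variable):

```python
import base64

def otlp_auth_header(public_key: str, secret_key: str) -> str:
    """Build the OTEL_EXPORTER_OTLP_HEADERS value for Langfuse's OTLP
    endpoint: Basic auth over base64(public_key:secret_key)."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Authorization=Basic {token}"

print(otlp_auth_header("pk", "sk"))  # → Authorization=Basic cGs6c2s=
```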
### Option C: Replace the Gateway with LiteLLM Proxy
If you used Helicone primarily as an AI gateway (multi-provider routing, failover), LiteLLM Proxy is a drop-in replacement with native Langfuse integration:
```yaml
# litellm_config.yaml
litellm_settings:
  callbacks: ["langfuse_otel"]
```

This preserves the gateway pattern while routing all traces to Langfuse. See the LiteLLM Proxy integration guide for details.
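A slightly fuller config sketch with routed models, as an illustration of the gateway pattern; the model names are illustrative, and LiteLLM reads the Langfuse credentials for this callback from environment variables (`LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, `LANGFUSE_HOST`), so verify the exact variable names against the LiteLLM docs:

```yaml
# litellm_config.yaml -- sketch; model names are illustrative
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-fallback
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

litellm_settings:
  callbacks: ["langfuse_otel"]
```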
## Further Resources
- Langfuse Prompt Management Overview
- Langfuse Observability Overview
- Langfuse Get Started Guide
- Helicone Integration (use Langfuse alongside Helicone)
- Helicone Prompt Integration Docs
- Helicone Prompt API Docs
- Helicone Announcement