Transcribe a file
Transcribe a local audio file using the SubQ API
Use file transcription when you have audio stored locally: recordings, voicemails, podcast episodes, meeting captures, or any other pre-recorded audio. You send the file contents in a single POST request to /v1/listen and receive the complete transcript in the response.
The SubQ API accepts most common audio formats including MP3, WAV, AAC, FLAC, OGG, Opus, WebM, M4A, and raw PCM. For details on how pre-recorded and streaming transcription differ, see Transcription modes.
Prerequisites
- Python 3.8 or later
- `httpx` installed. If you haven't already, follow the set up and installation guide.
- A local audio file in a supported format (MP3, WAV, FLAC, etc.)
Transcribe a local file
In this example, you read an audio file from disk and send it to the SubQ API for transcription:
import os

import httpx

SUBQ_API_KEY = os.environ["SUBQ_API_KEY"]

# Read the audio file as raw bytes
with open("audio.wav", "rb") as f:
    audio_data = f.read()

# Send the audio to the SubQ API
response = httpx.post(
    "https://stt-api.subq.ai/v1/listen",
    headers={"Authorization": f"Bearer {SUBQ_API_KEY}"},
    content=audio_data,
    timeout=60.0,
)
result = response.json()

# Extract and print the transcript
transcript = (
    result.get("results", {})
    .get("channels", [{}])[0]
    .get("alternatives", [{}])[0]
    .get("transcript", "")
)
print(transcript)

Run it:

python transcribe_file.py

How it works
To transcribe a file, you read the audio, send it to the API, and extract the transcript from the response:
- Read the file: `open("audio.wav", "rb").read()` loads the entire file into memory as raw bytes. The API detects the audio format automatically from the file contents, so you don't need to specify a content type.
- POST to `/v1/listen`: `httpx.post()` sends the audio bytes as the request body to the SubQ API. The `Authorization` header authenticates the request with your API key using the `Bearer` scheme. `timeout=60.0` sets a 60-second timeout, which you can increase for longer files.
- Parse the response: The transcript sits inside a nested JSON structure: `results` → `channels` → `alternatives`. The chained `.get()` calls navigate this safely, returning empty defaults if any key is missing. This avoids `KeyError` exceptions if the response structure is unexpected.
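The chained `.get()` calls guard against missing keys, but not against an empty `channels` or `alternatives` list, which would raise `IndexError` at the `[0]` lookup. As a minimal sketch, a small helper could cover both cases. The name `extract_transcript` is hypothetical and not part of the SubQ API, and the code assumes the response layout described in this guide.

```python
from typing import Any, Dict


def extract_transcript(result: Dict[str, Any]) -> str:
    """Return the top transcript from a parsed response, or "" if absent.

    Hypothetical helper: assumes the results -> channels -> alternatives
    layout described in this guide.
    """
    channels = (result.get("results") or {}).get("channels") or []
    if not channels:
        return ""
    alternatives = channels[0].get("alternatives") or []
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "")
```

In the script above, `transcript = extract_transcript(result)` would replace the chained lookup.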
Prerequisites
- Node.js 18 or later
- A local audio file in a supported format (MP3, WAV, FLAC, etc.)
Transcribe a local file
In this example, you read an audio file from disk and send it to the SubQ API for transcription:
// Uses import and top-level await, so run it as an ES module
// (a .mjs file, or "type": "module" in package.json).
import { readFileSync } from "node:fs";

const SUBQ_API_KEY = process.env.SUBQ_API_KEY;

// Read the audio file as raw bytes
const audioData = readFileSync("audio.wav");

// Send the audio to the SubQ API
const response = await fetch("https://stt-api.subq.ai/v1/listen", {
  method: "POST",
  headers: { "Authorization": `Bearer ${SUBQ_API_KEY}` },
  body: audioData,
});
const result = await response.json();

// Extract and print the transcript
const transcript = result?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
console.log(transcript);

Run it:

node transcribe_file.js

How it works
To transcribe a file, you read the audio, send it to the API, and extract the transcript from the response:
- Read the file: `readFileSync("audio.wav")` loads the entire file into memory as a `Buffer`. The API detects the audio format automatically from the file contents, so you don't need to specify a content type.
- POST to `/v1/listen`: `fetch()` sends the audio bytes as the request body to the SubQ API. The `Authorization` header authenticates the request with your API key using the `Bearer` scheme.
- Parse the response: The transcript sits inside a nested JSON structure: `results` → `channels` → `alternatives`. Optional chaining (`?.`) navigates this safely, returning `undefined` if any key is missing. The `?? ""` fallback ensures you always get a string.
Prerequisites
- Go 1.21 or later
- A local audio file in a supported format (MP3, WAV, FLAC, etc.)
Transcribe a local file
In this example, you read an audio file from disk and send it to the SubQ API for transcription:
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	apiKey := os.Getenv("SUBQ_API_KEY")

	// Read the audio file as raw bytes
	audioData, err := os.ReadFile("audio.wav")
	if err != nil {
		fmt.Println("Error reading file:", err)
		return
	}

	// Send the audio to the SubQ API
	req, err := http.NewRequest("POST", "https://stt-api.subq.ai/v1/listen", bytes.NewReader(audioData))
	if err != nil {
		fmt.Println("Error building request:", err)
		return
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("Error reading response:", err)
		return
	}
	if resp.StatusCode != 200 {
		fmt.Printf("Error %d: %s\n", resp.StatusCode, string(body))
		return
	}

	// Extract and print the transcript
	var result struct {
		Results struct {
			Channels []struct {
				Alternatives []struct {
					Transcript string `json:"transcript"`
				} `json:"alternatives"`
			} `json:"channels"`
		} `json:"results"`
	}
	if err := json.Unmarshal(body, &result); err != nil {
		fmt.Println("Error parsing response:", err)
		return
	}

	// Indexing an empty slice panics, so check lengths first
	channels := result.Results.Channels
	if len(channels) == 0 || len(channels[0].Alternatives) == 0 {
		fmt.Println("No transcript in response")
		return
	}
	fmt.Println(channels[0].Alternatives[0].Transcript)
}

Run it:

go run transcribe_file.go

How it works
To transcribe a file, you read the audio, send it to the API, and extract the transcript from the response:
- Read the file: `os.ReadFile("audio.wav")` loads the entire file into memory as a byte slice. The API detects the audio format automatically from the file contents, so you don't need to set a content type.
- POST to `/v1/listen`: `http.NewRequest` builds a POST request with the audio bytes as the body. The `Authorization` header authenticates the request with your API key using the `Bearer` scheme.
- Parse the response: You read the response body with `io.ReadAll` and unmarshal it into a struct that mirrors the JSON structure. Go's typed struct fields let the compiler check field names and types when you access `Results.Channels[0].Alternatives[0].Transcript`. The slice indexes are only checked at run time, so confirm that each slice is non-empty before indexing it.
Prerequisites
- Rust 1.70 or later with `reqwest`, `tokio`, and `serde_json` in your `Cargo.toml`. If you haven't already, follow the set up and installation guide.
- A local audio file in a supported format (MP3, WAV, FLAC, etc.)
Transcribe a local file
In this example, you read an audio file from disk and send it to the SubQ API for transcription:
use serde_json::Value;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("SUBQ_API_KEY").expect("SUBQ_API_KEY not set");

    // Read the audio file as raw bytes
    let audio_data = tokio::fs::read("audio.wav").await?;

    // Send the audio to the SubQ API
    let client = reqwest::Client::new();
    let response = client
        .post("https://stt-api.subq.ai/v1/listen")
        .header("Authorization", format!("Bearer {}", api_key))
        .body(audio_data)
        .send()
        .await?;

    let status = response.status();
    let text = response.text().await?;
    if !status.is_success() {
        eprintln!("Error {}: {}", status, text);
        return Ok(());
    }

    // Extract and print the transcript
    let json: Value = serde_json::from_str(&text)?;
    let transcript = json["results"]["channels"][0]["alternatives"][0]["transcript"]
        .as_str()
        .unwrap_or("");
    println!("{}", transcript);

    Ok(())
}

Run it:

cargo run --bin transcribe_file

How it works
To transcribe a file, you read the audio, send it to the API, and extract the transcript from the response:
- Read the file: `tokio::fs::read("audio.wav")` loads the entire file into a `Vec<u8>`. The API detects the audio format automatically from the file contents, so you don't need to set a content type.
- POST to `/v1/listen`: `reqwest::Client::new().post(url)` builds a POST request. The `.header("Authorization", ...)` call adds Bearer authentication, and `.body(audio_data)` attaches the raw audio bytes.
- Parse the response: You deserialize the response body into a `serde_json::Value` and index it to extract the transcript. The `as_str().unwrap_or("")` chain handles the case where a field is missing.
Response structure
The API returns a JSON response. The key fields are:
| Field | Type | Description |
|---|---|---|
| `results.channels` | array | One entry per audio channel. Mono audio has one channel. |
| `channels[].alternatives` | array | Ranked transcript alternatives. The first alternative (`[0]`) has the highest confidence. |
| `alternatives[].transcript` | string | The full transcript text for this alternative. |
| `alternatives[].confidence` | float | Confidence score from 0.0 to 1.0. |
| `alternatives[].words` | array | Per-word details including `word`, `start`, `end`, and `confidence`. |
Example response:
{
  "results": {
    "channels": [
      {
        "alternatives": [
          {
            "transcript": "Houston we've had a problem",
            "confidence": 0.98,
            "words": [
              { "word": "houston", "start": 0.08, "end": 0.56, "confidence": 0.99 },
              { "word": "we've", "start": 0.56, "end": 0.72, "confidence": 0.97 }
            ]
          }
        ]
      }
    ]
  }
}

You can customize transcriptions with query parameters for smart formatting, speaker diarization, language detection, and more.
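To make the response fields concrete, the following sketch walks a parsed response and collects each channel's top transcript along with its word timings. It runs on a hard-coded dictionary copied from the example response rather than a live request. The helper name `summarize_channels` is illustrative and not part of the SubQ API.

```python
from typing import Any, Dict, List


def summarize_channels(result: Dict[str, Any]) -> List[str]:
    """Build one summary line per channel, followed by one line per word."""
    lines: List[str] = []
    channels = (result.get("results") or {}).get("channels") or []
    for index, channel in enumerate(channels):
        alternatives = channel.get("alternatives") or []
        if not alternatives:
            continue
        best = alternatives[0]  # the first alternative has the highest confidence
        lines.append(f"channel {index}: {best.get('transcript', '')}")
        for word in best.get("words") or []:
            lines.append(f"  {word['start']:.2f}-{word['end']:.2f}s {word['word']}")
    return lines


# Sample data copied from the example response in this guide
sample = {
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "Houston we've had a problem",
                        "confidence": 0.98,
                        "words": [
                            {"word": "houston", "start": 0.08, "end": 0.56, "confidence": 0.99},
                            {"word": "we've", "start": 0.56, "end": 0.72, "confidence": 0.97},
                        ],
                    }
                ]
            }
        ]
    }
}

for line in summarize_channels(sample):
    print(line)
```

For mono audio this yields a single `channel 0` line followed by one line per word. Multi-channel audio would add one group per channel.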
If the request fails, the API returns an HTTP error status with a description.
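Because a failed request returns an error status instead of a transcript, it helps to check the status code before parsing the body. The sketch below assumes only what this guide states: an HTTP error status plus a description whose exact format is not specified here. `TranscriptionError` and `ensure_success` are hypothetical names, not part of the SubQ API or `httpx`.

```python
class TranscriptionError(RuntimeError):
    """Raised when the API responds with a non-2xx status (hypothetical)."""

    def __init__(self, status_code: int, description: str) -> None:
        super().__init__(f"SubQ API returned HTTP {status_code}: {description}")
        self.status_code = status_code
        self.description = description


def ensure_success(status_code: int, body: str) -> None:
    """Raise TranscriptionError unless the status code is in the 2xx range."""
    if not 200 <= status_code < 300:
        # The description format is not documented here, so keep it as plain text
        raise TranscriptionError(status_code, body.strip())
```

In the Python example, you could call `ensure_success(response.status_code, response.text)` before `response.json()`, so that an error description is reported rather than parsed as a transcript.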
Next steps
- Transcribe from URL - transcribe a remote audio file without downloading it first
- Real-time streaming - transcribe audio as it's being recorded
- PII redaction - automatically remove sensitive data from transcripts