Core Objects
Heify revolves around two main objects: Configurations and Transcriptions. Understanding these objects is essential for working with the API effectively.

Configuration Object
A Configuration is a reusable template that defines how audio/video files should be processed. Think of it as a preset that you can apply to multiple transcription jobs.

Configuration Structure

| Attribute | Type | Required | Description |
|---|---|---|---|
| configuration_id | string | Auto-generated | Unique identifier for the configuration (UUID format) |
| client_id | string | Auto-assigned | Your client identifier (linked to your API key) |
| tag | string | Yes | Descriptive name for the configuration (max 255 characters) |
| vocabulary | array<string> | No | Custom words/phrases to improve recognition accuracy |
| extraction_fields | array<object> | No | Structured data fields to extract (max 20 fields) |
| webhooks | object | No | URLs for success/error notifications |
| summary | boolean | No | Generate a summary of the transcription (default: false) |
| summary_language | string | No | Language for summary generation (default: "df" - auto-detect). See Supported Languages. |
| analytics_language | string | No | Language for the Executive and Qualitative Analysis Report (default: "df" - auto-detect). See Supported Languages. |
| created_at | string | Auto-generated | ISO 8601 timestamp of creation |
Example Configuration
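As a minimal sketch, a configuration built from the attributes above might look like the following Python dictionary. Every value (tag, vocabulary terms, field names, URLs) is hypothetical, and `check_configuration` is an illustrative client-side check of the documented limits, not part of the Heify API. The auto-generated attributes (configuration_id, client_id, created_at) are left out because the client does not set them.

```python
import json

# Hypothetical configuration payload; every value below is illustrative.
configuration = {
    "tag": "support-calls-en",
    "vocabulary": ["Heify", "SLA", "chargeback"],
    "extraction_fields": [
        {
            "name": "ticket_id",
            "type": "string",
            "description": "Support ticket identifier mentioned by the agent.",
        },
    ],
    "webhooks": {
        "success_url": "https://example.com/hooks/heify/success",
        "error_url": "https://example.com/hooks/heify/error",
    },
    "summary": True,
    "summary_language": "df",
    "analytics_language": "df",
}

ALLOWED_FIELD_TYPES = {"string", "number", "boolean", "array"}


def check_configuration(config: dict) -> list:
    """Return a list of problems found against the documented limits."""
    problems = []
    tag = config.get("tag")
    if not isinstance(tag, str) or not tag:
        problems.append("tag is required")
    elif len(tag) > 255:
        problems.append("tag exceeds 255 characters")
    fields = config.get("extraction_fields", [])
    if len(fields) > 20:
        problems.append("more than 20 extraction fields")
    for field in fields:
        missing = [k for k in ("name", "type", "description") if not field.get(k)]
        if missing:
            problems.append(f"extraction field missing: {', '.join(missing)}")
        elif field["type"] not in ALLOWED_FIELD_TYPES:
            problems.append(f"unsupported field type: {field['type']}")
    return problems


print(json.dumps(configuration, indent=2))
print(check_configuration(configuration))
```

An empty list from `check_configuration` means the payload respects the limits stated in the table; it says nothing about whether the API would accept it.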
Extraction Fields
Define structured data to extract from transcriptions using AI.

| Attribute | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Field identifier (e.g., "ticket_id", "customer_name") |
| type | string | Yes | Data type: string, number, boolean, array |
| description | string | Yes | Detailed description to guide the AI extraction |
Best Practices for Extraction Fields
Use bounded responses
Define a clear, limited set of possible values to improve consistency and accuracy. This approach ensures the AI returns predictable, standardized values instead of varied descriptions.
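One way to bound a field is to enumerate the allowed values directly in its description, as in the sketch below. The field name `call_outcome` and its values are hypothetical, and `normalize_outcome` is an illustrative client-side guard rather than an API feature.

```python
# Hypothetical bounded field: the description lists every allowed value.
ALLOWED_OUTCOMES = ("RESOLVED", "ESCALATED", "CALLBACK_REQUIRED", "UNKNOWN")

call_outcome_field = {
    "name": "call_outcome",
    "type": "string",
    "description": (
        "Outcome of the call. Must be exactly one of: "
        + ", ".join(ALLOWED_OUTCOMES)
        + ". Use UNKNOWN if the outcome is unclear."
    ),
}


def normalize_outcome(value):
    """Map an extracted value onto the allowed set, falling back to UNKNOWN."""
    candidate = str(value).strip().upper().replace(" ", "_")
    return candidate if candidate in ALLOWED_OUTCOMES else "UNKNOWN"


print(call_outcome_field["description"])
print(normalize_outcome("escalated"))
```

Normalizing on the client side keeps downstream reports stable even if an extracted value occasionally drifts from the listed set.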
Provide specific context
Give clear descriptions and specific examples to guide the AI for more accurate results. The more context you provide, the better the extraction quality.
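The contrast can be sketched with two definitions of the same field. Both are hypothetical, including the order number format used in the more specific description.

```python
# Hypothetical field definitions contrasting a vague and a specific description.
poor_field = {
    "name": "order_number",
    "type": "string",
    "description": "The order number.",
}

good_field = {
    "name": "order_number",
    "type": "string",
    "description": (
        "Order number the customer is calling about. It is read aloud as "
        "'ORD' followed by six digits, for example ORD-104233. "
        "Return an empty string if no order number is mentioned."
    ),
}

for label, field in (("poor", poor_field), ("good", good_field)):
    print(f"{label}: {len(field['description'])} characters of guidance")
```

The second description states what the value refers to, what it looks like, and what to return when it is absent, which is the kind of context the guidance above asks for.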
Webhooks
Configure automatic notifications when transcription jobs complete or fail.

| Attribute | Type | Description |
|---|---|---|
| success_url | string | URL to receive POST notifications on successful completion |
| error_url | string | URL to receive POST notifications on failure |
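A minimal receiver for these notifications could look like the sketch below, which uses only Python's standard library. The URL paths are hypothetical, and the request body is assumed to be JSON because the payload format is not described in this section.

```python
import json
from http.server import BaseHTTPRequestHandler

# Hypothetical paths; they would match the success_url and error_url
# registered in the configuration.
SUCCESS_PATH = "/hooks/heify/success"
ERROR_PATH = "/hooks/heify/error"


def classify_notification(path: str, body: bytes):
    """Return (kind, payload); kind is 'success', 'error', or None."""
    kind = {SUCCESS_PATH: "success", ERROR_PATH: "error"}.get(path)
    try:
        # Assumption: the notification body is JSON.
        payload = json.loads(body or b"{}")
    except ValueError:
        payload = {}
    return kind, payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        kind, payload = classify_notification(self.path, self.rfile.read(length))
        if kind is None:
            self.send_response(404)
        else:
            print(f"{kind} notification received: {payload}")
            self.send_response(200)
        self.end_headers()


# To serve locally:
#   from http.server import HTTPServer
#   HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()
```

Keeping the routing logic in a plain function makes it easy to test without starting a server.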
Transcription Object
A Transcription represents an individual audio or video processing job. Its structure changes based on the job’s current status.

Transcription Structure

| Attribute | Type | Description |
|---|---|---|
| transcription_id | string | Unique identifier for the transcription (UUID format) |
| status | string | Current status: PENDING, IN_PROGRESS, COMPLETED, FAILED |
| configuration_id | string | ID of the configuration used |
| configuration_tag | string | Tag of the configuration used |
| name | string | Custom name for the transcription (can be null) |
| group | string | Audio group/phase (can be null). See available groups |
| duration | number | Media duration in seconds |
| details | object | Full transcription results (only if COMPLETED) |
| error | object | Error information (only if FAILED) |
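Because details and error are only present for certain statuses, client code typically branches on status before reading either. The sketch below shows one way to do that; the function name and the sample objects are hypothetical.

```python
def result_of(transcription: dict):
    """Return the details of a finished job, or None while it is still running.

    Raises RuntimeError for FAILED jobs and ValueError for unknown statuses.
    """
    status = transcription["status"]
    if status in ("PENDING", "IN_PROGRESS"):
        return None
    if status == "COMPLETED":
        return transcription["details"]
    if status == "FAILED":
        raise RuntimeError(f"transcription failed: {transcription.get('error')}")
    raise ValueError(f"unexpected status: {status}")


# Hypothetical objects, trimmed to the fields the function reads.
pending = {"transcription_id": "example-id", "status": "PENDING"}
completed = {
    "transcription_id": "example-id",
    "status": "COMPLETED",
    "details": {"language": "en", "num_speakers": 2},
}

print(result_of(pending))
print(result_of(completed))
```

Returning None for unfinished jobs lets a caller poll in a loop, while raising on FAILED keeps error handling out of the happy path.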
Status Values
1. PENDING: The transcription job is queued and waiting to start processing.
2. IN_PROGRESS: The audio is currently being transcribed and analyzed.
3. COMPLETED: Transcription finished successfully. The details object contains all results.
4. FAILED: The transcription failed. The error object contains details about why.

Transcription Details (when COMPLETED)
When a transcription completes successfully, the details object includes:

| Attribute | Type | Description |
|---|---|---|
| language | string | Detected language of the media |
| num_speakers | number | Number of unique speakers identified |
| created_at | string | Timestamp when transcription started |
| completed_at | string | Timestamp when transcription finished |
| conversation | object | Full conversation with speaker-separated segments |
| summary | object | Generated summary (if enabled) |
| fields | object | Extracted structured data (if configured) |
Conversation Structure
The conversation object contains speaker-separated segments:

| Field | Type | Description |
|---|---|---|
| text | string | Transcribed text for this segment |
| speaker | string | Speaker identifier (SPEAKER_00, SPEAKER_01, etc.) |
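Segments with these two fields are straightforward to turn into a readable transcript or per-speaker statistics. The sketch below assumes the segments are available as a list of dictionaries; how they are nested inside the conversation object is not specified here, and the sample text is hypothetical.

```python
from collections import Counter


def render_transcript(segments):
    """Format segments as 'SPEAKER_XX: text' lines, one per segment."""
    return "\n".join(f"{seg['speaker']}: {seg['text']}" for seg in segments)


def words_per_speaker(segments):
    """Count whitespace-separated words spoken by each speaker."""
    counts = Counter()
    for seg in segments:
        counts[seg["speaker"]] += len(seg["text"].split())
    return dict(counts)


# Hypothetical segments using the documented text and speaker fields.
segments = [
    {"speaker": "SPEAKER_00", "text": "Thanks for calling, how can I help?"},
    {"speaker": "SPEAKER_01", "text": "I have a question about my invoice."},
    {"speaker": "SPEAKER_00", "text": "Sure, let me check."},
]

print(render_transcript(segments))
print(words_per_speaker(segments))
```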
Complete Transcription Example
Groups
Use the group field to track a transcription's group or phase:

| Group Value | Description |
|---|---|
| PENDING_REVIEW | Transcription needs manual review |
| UNDER_REVIEW | Currently being reviewed |
| ARCHIVED | Completed and archived |
| null | No group assigned |
Groups are managed using the /update-transcription-group endpoint. See Update Transcription Group for details.

Supported Languages
The following languages are supported for transcriptions, summaries (summary_language), and analytics reports (analytics_language).
How the default language (df) works

You can use "df" for summary_language and analytics_language for automatic language detection, but the behavior differs:
- For summaries (summary_language): The summary will be generated in the language detected in that specific audio file.
- For analytics (analytics_language): The report will be generated in the majority language found across all audio files associated with the configuration.
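The difference between the two settings can be illustrated with a small sketch. It models the documented behavior on the client side and is not Heify's implementation; the function names are hypothetical, and how ties between equally common languages are resolved is an assumption (here the first one encountered wins).

```python
from collections import Counter

DEFAULT = "df"


def summary_language(configured: str, detected: str) -> str:
    """Language of one file's summary: the file's detected language when set to df."""
    return detected if configured == DEFAULT else configured


def analytics_language(configured: str, detected_per_file: list) -> str:
    """Language of the report: the most common detected language when set to df."""
    if configured != DEFAULT:
        return configured
    # Assumes at least one file; ties go to the language seen first.
    return Counter(detected_per_file).most_common(1)[0][0]


# Hypothetical detected languages for three files under one configuration.
detected = ["es", "en", "es"]

print([summary_language("df", lang) for lang in detected])
print(analytics_language("df", detected))
print(analytics_language("fr", detected))
```

With "df", each summary follows its own file while the report follows the majority; an explicit code such as "fr" overrides detection in both cases.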
A-G
| Language | ISO Code |
|---|---|
| Afrikaans | af |
| Albanian | sq |
| Arabic | ar |
| Azerbaijani | az |
| Basque | eu |
| Belarusian | be |
| Bengali | bn |
| Bosnian | bs |
| Bulgarian | bg |
| Catalan | ca |
| Chinese | zh |
| Croatian | hr |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| English | en |
| Estonian | et |
| Finnish | fi |
| French | fr |
| Galician | gl |
| German | de |
| Greek | el |
| Gujarati | gu |
Best Practices
Organize with meaningful tags
Reuse configurations
Create configurations for common use cases and reuse them across multiple transcriptions. You can have up to 20 configurations.
Use custom vocabulary
Add industry-specific terms, product names, or acronyms to improve accuracy.
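A vocabulary list is easier to maintain when it is cleaned before being sent. The helper below is a hypothetical client-side sketch that trims whitespace and removes empty or duplicate entries; the sample terms are illustrative.

```python
def build_vocabulary(terms):
    """Strip whitespace, drop empty entries, and remove case-insensitive duplicates."""
    seen = set()
    vocabulary = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            vocabulary.append(cleaned)
    return vocabulary


# Hypothetical terms for a telecom support line.
vocabulary = build_vocabulary(["eSIM", "VoLTE", " porting request ", "esim", ""])
print(vocabulary)
```

The first spelling of each term is kept, so the preferred casing should appear first in the input.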
Leverage extraction fields
Extract structured data automatically instead of parsing transcripts manually:
- Customer IDs
- Order numbers
- Dates and times
- Monetary amounts
- Yes/no answers