Live Audio Analysis and Enrichment (Experimental API)

Overview

Hyperia supports live audio stream analysis and NLP enrichment. You can stream audio data to Hyperia directly over a secure websocket connection, or have our AI Notetaker join a Zoom, Google Meet, or Microsoft Teams meeting and stream live audio from the conversation. In addition to automatic speech recognition, several forms of NLP enrichment are provided, including dialog labeling, sentiment analysis, topic and named entity extraction, and intent and action identification.

Live Streaming Enrichments

Hyperia outputs utterance-level transcriptions and NLP enrichments for analyzed live audio streams. Below is an example of the JSON message format:

RESPONSE
{
	"dialog_label": {
		"confidence": DIALOG_LABEL_CONFIDENCE,
		"label": "DIALOG_TYPE_LABEL"
	},
	"dialog_sentiment": {
		"confidence": SENTIMENT_CONFIDENCE,
		"sentiment": "SENTIMENT_POLARITY"
	},
	"document_type": "h:fm:Utterance",
	"endTime": END_TIME_IN_SECONDS,
	"id": SEQUENCE_ID,
	"length": UTTERANCE_LENGTH_IN_SECONDS,
	"primary_value": "transcript",
	"printableStartTime": "START_TIME_IN_MM:SS_FORMAT",
	"startTime": START_TIME_IN_SECONDS,
	"topics": [
	  {
		"score": TOPIC_SCORE,
		"text": "TOPIC_NAME",
		"type": "TOPIC_TYPE"
	  }
	],
	"transcript": "TRANSCRIPT_TEXT",
	"wordTimings": [
	  {
		"confidence": WORD_CONFIDENCE_SCORE,
		"endTime": WORD_END_TIME_IN_SECONDS,
		"startTime": WORD_START_TIME_IN_SECONDS,
		"word": "WORD_TEXT"
	  }
	],
	"tags": [
	  {
		"tag_id": "TAG_GUID",
		"tag_name": "TAG_NAME"
	  }
	],
	"stream_id": "GUID_OF_AUDIO_STREAM"
}
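
As a rough illustration of consuming this message format, the sketch below parses one utterance message and pulls out the fields most applications need. The sample values and the `summarize_utterance` helper are hypothetical and not part of the Hyperia client; only the field names come from the format above.

```python
import json

# Hypothetical example message; every value below is made up for illustration.
SAMPLE_MESSAGE = """
{
  "dialog_label": {"confidence": 0.91, "label": "question"},
  "dialog_sentiment": {"confidence": 0.12, "sentiment": "positive"},
  "document_type": "h:fm:Utterance",
  "startTime": 12.4,
  "endTime": 15.1,
  "id": 7,
  "length": 2.7,
  "topics": [{"score": 0.8, "text": "pricing", "type": "Topic"}],
  "transcript": "Can you walk me through the pricing?",
  "tags": [{"tag_id": "example-guid", "tag_name": "Pricing"}],
  "stream_id": "example-stream"
}
"""


def summarize_utterance(raw: str) -> dict:
    """Parse one JSON event and return a flat summary (hypothetical helper)."""
    msg = json.loads(raw)
    return {
        "stream_id": msg.get("stream_id"),
        "start": msg.get("startTime"),
        "end": msg.get("endTime"),
        "transcript": msg.get("transcript", ""),
        # Enrichment blocks are read defensively in case one is absent.
        "label": (msg.get("dialog_label") or {}).get("label"),
        "sentiment": (msg.get("dialog_sentiment") or {}).get("sentiment"),
        "topics": [t.get("text") for t in msg.get("topics") or []],
        "tags": [t.get("tag_name") for t in msg.get("tags") or []],
    }


summary = summarize_utterance(SAMPLE_MESSAGE)
print(summary["transcript"], summary["label"], summary["tags"])
```

Reading each enrichment with `.get` is a defensive choice, since this document does not state which fields are guaranteed to be present in every message.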

Dialog Label Enrichments

Dialog labels are generated for each transcribed utterance and indicate the type of expression (fact, opinion, question, etc.). The supported dialog labels are:

agreement
command
fact
opinion
pleasantries
question
task
uncertain
unknown
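
One possible use of these labels is to tally how a conversation breaks down by expression type. The sketch below assumes messages have already been parsed into dictionaries; the `tally_labels` helper and the 0.5 confidence cutoff are illustrative assumptions, not part of the API.

```python
from collections import Counter

# Labels copied from the list above.
SUPPORTED_LABELS = {
    "agreement", "command", "fact", "opinion", "pleasantries",
    "question", "task", "uncertain", "unknown",
}


def tally_labels(messages, min_confidence=0.5):
    """Count dialog labels, skipping low-confidence ones (cutoff is an assumption)."""
    counts = Counter()
    for message in messages:
        dialog_label = message.get("dialog_label") or {}
        label = dialog_label.get("label")
        confidence = dialog_label.get("confidence", 0.0)
        if label in SUPPORTED_LABELS and confidence >= min_confidence:
            counts[label] += 1
    return counts


# Made-up messages for illustration.
messages = [
    {"dialog_label": {"label": "question", "confidence": 0.9}},
    {"dialog_label": {"label": "fact", "confidence": 0.7}},
    {"dialog_label": {"label": "question", "confidence": 0.2}},
]
print(tally_labels(messages))
```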

Sentiment Enrichments

Sentiment extraction is performed on each transcribed utterance and reports its polarity (positive or negative). The confidence value reflects the intensity of the expressed sentiment. To use sentiment enrichments, choose a confidence threshold for each polarity, for example a value between 0.3 and 0.5, depending on how intense an expression must be before you act on it.
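
The per-polarity thresholding described above might look like the following sketch. It assumes the `sentiment` field carries the strings "positive" and "negative"; the `strong_sentiment` helper and the specific threshold values are illustrative choices within the suggested range.

```python
from typing import Optional

# Example thresholds, one per polarity (values are an illustrative assumption).
THRESHOLDS = {"positive": 0.5, "negative": 0.3}


def strong_sentiment(message: dict, thresholds: dict = THRESHOLDS) -> Optional[str]:
    """Return the polarity if its confidence meets the threshold, else None."""
    sentiment = message.get("dialog_sentiment") or {}
    polarity = sentiment.get("sentiment")
    confidence = sentiment.get("confidence", 0.0)
    threshold = thresholds.get(polarity)
    if threshold is None or confidence < threshold:
        return None
    return polarity


example = {"dialog_sentiment": {"confidence": 0.4, "sentiment": "negative"}}
print(strong_sentiment(example))
```

Using a lower threshold for negative sentiment, as in this example, surfaces mild complaints while ignoring mild praise; swap the values if your application cares more about positive signals.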

Topic Enrichments

Topic and named entity extraction is performed on each transcribed utterance indicating specific topics or entities that are being discussed (people, companies, locations, etc). Supported topic types include:

Affiliation
Cardinal
Date
Location
Organization
Percent
Person
Product
Quantity
Time
Topic
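
To work with these types, a client could group the `topics` array of each message by its `type` field. The sketch below is one way to do that; the `topics_by_type` helper, the "Unknown" fallback, and the interpretation of `score` as "higher means more relevant" are assumptions.

```python
from collections import defaultdict


def topics_by_type(message: dict, min_score: float = 0.0) -> dict:
    """Group topic text by topic type, keeping topics at or above min_score."""
    grouped = defaultdict(list)
    for topic in message.get("topics") or []:
        if topic.get("score", 0.0) >= min_score:
            # "Unknown" is a fallback used by this sketch only.
            grouped[topic.get("type", "Unknown")].append(topic.get("text"))
    return dict(grouped)


# Made-up message for illustration.
message = {
    "topics": [
        {"score": 0.9, "text": "Acme Corp", "type": "Organization"},
        {"score": 0.6, "text": "next Tuesday", "type": "Date"},
        {"score": 0.2, "text": "Globex", "type": "Organization"},
    ]
}
print(topics_by_type(message, min_score=0.5))
```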

Intent Tagging Enrichments

Hyperia performs intent analysis and tagging on utterances, identifying intents and actions such as Follow-ups, Next-steps, Requests-for-information, and so on. Dozens of intents and actions are supported, with new ones being added on a regular basis. Currently supported intents and actions include:

Agenda
Apology
Appreciation
Cannot Login
Commitments
Company About
Company Background
Concern
Confusion
Customers
Decision
Difficulty
Disapproval
Expensive
Frustration
Interest
Lateness
Make an Intro
Need Confirmation
Next Steps
Not Interested
Options
Personal Background
Pricing
Problems
Recommendation
Request for Information
Screen Sharing
Skepticism
Slowness
Something Broken
System Problems
Timeline
Uncertainty
Want to Cancel
Want to Return
Want to Try
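
As an example of using intent tags, the sketch below collects the transcripts of utterances tagged with a given intent, such as "Next Steps", to build a simple action list. It assumes the `tag_name` field matches the names listed above; the `utterances_with_tag` helper is hypothetical.

```python
def utterances_with_tag(messages, tag_name):
    """Return transcripts of messages carrying the given tag (case-insensitive)."""
    wanted = tag_name.casefold()
    matches = []
    for message in messages:
        tags = message.get("tags") or []
        if any((tag.get("tag_name") or "").casefold() == wanted for tag in tags):
            matches.append(message.get("transcript", ""))
    return matches


# Made-up messages for illustration.
messages = [
    {"transcript": "I'll send the contract tomorrow.",
     "tags": [{"tag_id": "example-1", "tag_name": "Next Steps"}]},
    {"transcript": "That seems expensive.",
     "tags": [{"tag_id": "example-2", "tag_name": "Expensive"}]},
    {"transcript": "Nice weather today.", "tags": []},
]
print(utterances_with_tag(messages, "Next Steps"))
```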

Create Audio Processing Stream

Creates an audio processing websocket stream for streaming speech recognition and natural language processing. This endpoint provisions two websockets: one for streaming audio data (16000 Hz, signed little-endian 16-bit PCM) and one for receiving streaming JSON events (transcripts, NLP enrichments, live insights, etc.). The websockets are closed automatically if no connection is made to the audio endpoint within 60 seconds of provisioning. A stream may remain active for a maximum of 3 hours.
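
The audio format determines how many bytes correspond to a given duration, which is useful when pacing uploads at real-time speed. The sketch below works through that arithmetic and checks whether a WAV file already matches the expected format. It assumes single-channel (mono) audio, which this document does not state explicitly; the helper names are hypothetical.

```python
import wave

SAMPLE_RATE_HZ = 16000
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # assumption: mono audio


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes of raw PCM covering chunk_ms milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000


def wav_matches_stream_format(path: str) -> bool:
    """Check that a WAV file is 16000 Hz, 16-bit, mono before streaming its frames."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getsampwidth() == BYTES_PER_SAMPLE
            and wav.getnchannels() == CHANNELS
        )


print(chunk_size_bytes(20))    # bytes in a 20 ms chunk
print(chunk_size_bytes(1000))  # bytes in one second of audio
```

Under the mono assumption, a 20 ms chunk is 640 bytes, which matches the chunk size and sleep interval used in the code sample for this endpoint.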

Endpoint:

/v1/stream/create

HTTP Method:

PUT

URI Parameters:

None

Returns:

If successful: HTTP 200

Return Payload:

RESPONSE
{
	"status": "ok",
	"result": {
		"stream_id": "ID_OF_CREATED_REALTIME_STREAM",
		"audio_socket": "HTTPS_URI_OF_CREATED_AUDIO_WEBSOCKET",
		"event_socket": "HTTPS_URI_OF_CREATED_JSON_EVENT_WEBSOCKET"
	}
}

Code Sample:

from hyperia import Hyperia
import threading
import time

import websocket  # provided by the websocket-client package

# 640 bytes is 20 ms of 16000 Hz, 16-bit audio (assuming a single channel).
CHUNK_BYTES = 640
CHUNK_SECONDS = 0.02


def sender_thread(ws, file_path):
    print("Starting sender loop.")
    with open(file_path, "rb") as file:
        chunk = file.read(CHUNK_BYTES)
        while chunk:
            ws.send_binary(chunk)
            chunk = file.read(CHUNK_BYTES)
            # Sleep one chunk duration so audio is sent at roughly real-time speed.
            time.sleep(CHUNK_SECONDS)
    print("Finished sending")


def receiver_thread(ws):
    print("Starting receiver loop.")
    try:
        while True:
            message = ws.recv()
            print(message)
    except websocket.WebSocketConnectionClosedException:
        # Raised once the event socket has been closed.
        pass
    print("Receiver exiting")


def open_websockets(stream_id, audio_socket, event_socket, file_path):
    print(f"Connecting to stream {stream_id}")

    ws_send = websocket.WebSocket()
    print(audio_socket)
    ws_send.connect(audio_socket)

    ws_recv = websocket.WebSocket()
    print(event_socket)
    ws_recv.connect(event_socket)

    print("Connected.")

    send_thread = threading.Thread(target=sender_thread, args=(ws_send, file_path))
    # Daemon thread: the blocking recv() loop must not keep the process alive.
    recv_thread = threading.Thread(target=receiver_thread, args=(ws_recv,), daemon=True)

    print("Starting receiver.")
    recv_thread.start()

    print("Starting sender.")
    send_thread.start()

    send_thread.join()

    # Wait briefly for trailing events before closing (5 seconds is an arbitrary choice).
    recv_thread.join(timeout=5)
    ws_send.close()
    ws_recv.close()


file_path = "SOME_FILE_PATH_OF_16000HZ_L16_PCM_DATA"

# Create the Hyperia Object
hyperia = Hyperia()

response = hyperia.stream_create()

stream_id = response['result']['stream_id']
print(f"Created stream {stream_id}")

audio_socket = response['result']['audio_socket']
event_socket = response['result']['event_socket']

open_websockets(stream_id, audio_socket, event_socket, file_path)

List Active Media Streams

Lists media streams that are currently active.

Endpoint:

/v1/stream/list

HTTP Method:

GET

URI Parameters:

None

Returns:

If successful: HTTP 200

Return Payload:

RESPONSE
{
	"status": "ok",
	"results": [
	  {
		"stream_id": "ID_OF_CREATED_REALTIME_STREAM"
	  }
	]
}

Code Sample:

from hyperia import Hyperia

# Create the Hyperia Object
hyperia = Hyperia()

response = hyperia.stream_list()

for stream in response['results']:
    print(f"Active stream {stream['stream_id']}")

Check For Stream Existence

Checks whether an active realtime stream exists for a given stream ID.

Endpoint:

/v1/stream/id/<stream_id>/exists

HTTP Method:

GET

URI Parameters:

None

Returns:

If successful: HTTP 200

Return Payload:

RESPONSE
{
	"status": "ok",
	"exists": true | false
}

Code Sample:

from hyperia import Hyperia

stream_id = "SOME_STREAM_ID"

# Create the Hyperia Object
hyperia = Hyperia()

response = hyperia.stream_exists(stream_id)

print(response['exists'])
