Live Audio Analysis and Enrichment (Experimental API)
Overview
Hyperia supports live audio stream analysis and NLP enrichment. You may stream audio data to Hyperia directly over a secure websocket connection, or use our AI Notetaker to join a Zoom, Google Meet, or Microsoft Teams meeting and stream live audio from the conversation. In addition to automatic speech recognition, several forms of NLP enrichment are provided, including dialog labeling, sentiment analysis, topic and named entity extraction, and intent and action identification.
Live Streaming Enrichments
Hyperia outputs utterance-level transcriptions and NLP enrichments for analyzed live audio streams. Below is an example of the JSON message format:
{
  "dialog_label": {
    "confidence": DIALOG_LABEL_CONFIDENCE,
    "label": "DIALOG_TYPE_LABEL"
  },
  "dialog_sentiment": {
    "confidence": SENTIMENT_CONFIDENCE,
    "sentiment": "SENTIMENT_POLARITY"
  },
  "document_type": "h:fm:Utterance",
  "endTime": END_TIME_IN_SECONDS,
  "id": SEQUENCE_ID,
  "length": UTTERANCE_LENGTH_IN_SECONDS,
  "primary_value": "transcript",
  "printableStartTime": "START_TIME_IN_MM:SS_FORMAT",
  "startTime": START_TIME_IN_SECONDS,
  "topics": [
    {
      "score": TOPIC_SCORE,
      "text": "TOPIC_NAME",
      "type": "TOPIC_TYPE"
    }
  ],
  "transcript": "TRANSCRIPT_TEXT",
  "wordTimings": [
    {
      "confidence": WORD_CONFIDENCE_SCORE,
      "endTime": WORD_END_TIME_IN_SECONDS,
      "startTime": WORD_START_TIME_IN_SECONDS,
      "word": "WORD_TEXT"
    }
  ],
  "tags": [
    {
      "tag_id": "TAG_GUID",
      "tag_name": "TAG_NAME"
    }
  ],
  "stream_id": "GUID_OF_AUDIO_STREAM"
}
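As a rough illustration of consuming this message format, the sketch below parses one event and builds a one-line summary. The sample values (including the "question" and "positive" strings) are made up for the example, and `summarize_utterance` is a hypothetical helper, not part of the Hyperia client.

```python
import json

# Sample event with made-up values, shaped like the message format above.
raw_event = json.dumps({
    "dialog_label": {"confidence": 0.91, "label": "question"},
    "dialog_sentiment": {"confidence": 0.42, "sentiment": "positive"},
    "document_type": "h:fm:Utterance",
    "id": 7,
    "startTime": 12.4,
    "endTime": 14.9,
    "length": 2.5,
    "printableStartTime": "00:12",
    "transcript": "Can you send the pricing sheet?",
    "topics": [],
    "tags": [],
    "wordTimings": [],
    "stream_id": "example-stream-id",
})


def summarize_utterance(message: str) -> str:
    """Return a one-line summary of an utterance event (hypothetical helper)."""
    event = json.loads(message)
    # Use .get() throughout so a missing enrichment does not raise.
    label = event.get("dialog_label", {}).get("label", "unknown")
    sentiment = event.get("dialog_sentiment", {}).get("sentiment", "n/a")
    return (
        f"[{event.get('printableStartTime', '??:??')}] "
        f"({label}, {sentiment}) {event.get('transcript', '')}"
    )


summary = summarize_utterance(raw_event)
print(summary)
```

Reading fields defensively is a design choice for this sketch: the API is experimental, so an enrichment may be absent from a given event.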
Dialog Label Enrichments
Dialog labels are generated for each transcribed utterance, indicating the type of expression (fact, opinion, question, etc.). The supported dialog labels are listed below:
agreement
command
fact
opinion
pleasantries
question
task
uncertain
unknown
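One way to use these labels is to tally how often each one occurs in a conversation. The sketch below assumes events have already been decoded into dictionaries, that label strings arrive in lowercase as listed, and that confidence is a number between 0 and 1. The `tally_labels` helper and its default cutoff are hypothetical.

```python
from collections import Counter

# The nine labels listed above.
DIALOG_LABELS = {
    "agreement", "command", "fact", "opinion", "pleasantries",
    "question", "task", "uncertain", "unknown",
}


def tally_labels(events, min_confidence=0.5):
    """Count dialog labels, skipping low-confidence or unrecognized ones."""
    counts = Counter()
    for event in events:
        info = event.get("dialog_label") or {}
        label = info.get("label")
        if label in DIALOG_LABELS and info.get("confidence", 0.0) >= min_confidence:
            counts[label] += 1
    return counts
```

For example, `tally_labels(events)["question"]` gives the number of confidently labeled questions in a list of decoded events.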
Sentiment Enrichments
Sentiment extraction is performed on each transcribed utterance, indicating the polarity of the expression (positive or negative). Confidence values indicate the intensity of the sentiment expressed. To use sentiment enrichments, choose a confidence threshold for each polarity, for example between 0.3 and 0.5, depending on how intense an expression must be before you act on it.
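The thresholding described above might look like the following sketch. The specific threshold values are illustrative picks from the suggested range, the polarity strings "positive" and "negative" are assumed to match the values in the `sentiment` field, and `strong_sentiment` is a hypothetical helper.

```python
# Hypothetical per-polarity thresholds, picked from the 0.3 to 0.5 range.
THRESHOLDS = {"positive": 0.4, "negative": 0.3}


def strong_sentiment(event, thresholds=THRESHOLDS):
    """Return the polarity if its confidence meets the threshold, else None."""
    info = event.get("dialog_sentiment") or {}
    polarity = info.get("sentiment")
    confidence = info.get("confidence", 0.0)
    threshold = thresholds.get(polarity)
    if threshold is not None and confidence >= threshold:
        return polarity
    return None
```

Using a lower threshold for one polarity than the other, as in this sketch, surfaces milder expressions of that polarity. Whether that asymmetry is useful depends on the application.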
Topic Enrichments
Topic and named entity extraction is performed on each transcribed utterance, indicating specific topics or entities that are being discussed (people, companies, locations, etc.). Supported topic types include:
Affiliation
Cardinal
Date
Location
Organization
Percent
Person
Product
Quantity
Time
Topic
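To work with extracted topics, it can help to group them by type. The sketch below assumes each entry in `topics` carries the `score`, `text`, and `type` fields shown in the message format, and that `type` holds one of the names listed above. `topics_by_type` and its `min_score` parameter are hypothetical.

```python
from collections import defaultdict


def topics_by_type(event, min_score=0.0):
    """Group topic text by topic type, dropping entries below min_score."""
    grouped = defaultdict(list)
    for topic in event.get("topics", []):
        if topic.get("score", 0.0) >= min_score:
            grouped[topic["type"]].append(topic["text"])
    return dict(grouped)
```

Calling `topics_by_type(event).get("Person", [])` would then return the people mentioned in an utterance, if any.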
Intent Tagging Enrichments
Hyperia performs intent analysis and tagging on utterances, identifying intents and actions such as Follow-ups, Next-steps, Requests-for-information, and so on. Dozens of intents and actions are supported, with new ones being added on a regular basis. Currently supported intents and actions include:
Agenda
Apology
Appreciation
Cannot Login
Commitments
Company About
Company Background
Concern
Confusion
Customers
Decision
Difficulty
Disapproval
Expensive
Frustration
Interest
Lateness
Make an Intro
Need Confirmation
Next Steps
Not Interested
Options
Personal Background
Pricing
Problems
Recommendation
Request for Information
Screen Sharing
Skepticism
Slowness
Something Broken
System Problems
Timeline
Uncertainty
Want to Cancel
Want to Return
Want to Try
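Intent tags can be used to pull follow-up items out of a conversation. The sketch below assumes the `tag_name` field of each tag matches the display names listed above exactly, which is not confirmed here. The chosen tag set and the `action_items` helper are hypothetical.

```python
# Hypothetical selection of tags that suggest follow-up work.
ACTION_TAGS = {"Next Steps", "Commitments", "Request for Information"}


def action_items(events, wanted=ACTION_TAGS):
    """Return (matched_tag_names, transcript) pairs for tagged utterances."""
    items = []
    for event in events:
        names = {tag.get("tag_name") for tag in event.get("tags", [])}
        matched = sorted(names & wanted)
        if matched:
            items.append((matched, event.get("transcript", "")))
    return items
```

Because new intents are added regularly, matching against an explicit set like this ignores tags the code does not know about instead of failing on them.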
Create Audio Processing Stream
Creates an audio processing websocket stream for streaming speech recognition and natural language processing. This endpoint provisions two websockets: one for streaming audio data (16,000 Hz, signed little-endian 16-bit PCM) and one for receiving streaming JSON events (transcripts, NLP enrichments, live insights, etc.). Both websockets are closed automatically if no connection is made to the audio endpoint within 60 seconds of provisioning. A stream may remain active for at most 3 hours.
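The audio format determines how many bytes correspond to a given duration, which matters when pacing a file upload at real-time speed. The arithmetic below is a sketch that assumes single-channel (mono) audio. The channel count is not stated above, but it is consistent with the code sample in this section, which sends 640 bytes every 0.02 seconds.

```python
SAMPLE_RATE_HZ = 16_000   # from the format description above
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHANNELS = 1              # assumption: mono audio


def chunk_bytes(chunk_ms: int) -> int:
    """Number of bytes of audio covering chunk_ms milliseconds."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000


print(chunk_bytes(20))   # 640
print(chunk_bytes(100))  # 3200
```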
Endpoint:
/v1/stream/create
HTTP Method:
PUT
URI Parameters:
None
Returns:
If successful: HTTP 200
Return Payload:
{
  "status": "ok",
  "result": {
    "stream_id": "ID_OF_CREATED_REALTIME_STREAM",
    "audio_socket": "HTTPS_URI_OF_CREATED_AUDIO_WEBSOCKET",
    "event_socket": "HTTPS_URI_OF_CREATED_JSON_EVENT_WEBSOCKET"
  }
}
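A small wrapper can validate this payload before any socket is opened. The sketch below treats any `status` other than "ok" as a failure, which is an assumption because the error payload is not documented here. `StreamInfo` and `parse_stream_create` are hypothetical names.

```python
from typing import NamedTuple


class StreamInfo(NamedTuple):
    stream_id: str
    audio_socket: str
    event_socket: str


def parse_stream_create(payload: dict) -> StreamInfo:
    """Extract stream details from a decoded stream-create response."""
    # Assumption: anything other than "ok" signals a failed request.
    if payload.get("status") != "ok":
        raise ValueError(f"stream creation failed: {payload!r}")
    result = payload["result"]
    return StreamInfo(
        stream_id=result["stream_id"],
        audio_socket=result["audio_socket"],
        event_socket=result["event_socket"],
    )
```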
Code Sample:
from hyperia import Hyperia
import threading
import time
import websocket


def sender_thread(ws, file):
    # Send the file in 640-byte chunks, pausing 0.02 seconds between sends.
    print("Starting sender loop.")
    chunk = file.read(640)
    while chunk:
        ws.send_binary(chunk)
        chunk = file.read(640)
        time.sleep(0.02)
    print("Finished sending")


def receiver_thread(ws):
    # Print each JSON event until the connection is closed.
    print("Starting receiver loop.")
    try:
        while True:
            message = ws.recv()
            print(message)
    except websocket.WebSocketConnectionClosedException:
        pass
    print("Receiver exiting")


def open_websockets(socket_id, audio_socket, transcript_socket, file_path):
    print(f"Connecting to socket {socket_id}")
    # Connect to the audio socket first, then to the event socket.
    ws_send = websocket.WebSocket()
    print(audio_socket)
    ws_send.connect(audio_socket)
    ws_recv = websocket.WebSocket()
    print(transcript_socket)
    ws_recv.connect(transcript_socket)
    print("Connected..")
    with open(file_path, "rb") as file:
        send_thread = threading.Thread(target=sender_thread, args=(ws_send, file))
        recv_thread = threading.Thread(target=receiver_thread, args=(ws_recv,))
        print("Starting receiver.")
        recv_thread.start()
        print("Starting sender.")
        send_thread.start()
        # Keep the file open until all audio has been sent.
        send_thread.join()


file_path = "SOME_FILE_PATH_OF_16000HZ_L16_PCM_DATA"

# Create the Hyperia Object
hyperia = Hyperia()
response = hyperia.stream_create()
stream_id = response['result']['stream_id']
print(f"Created stream {stream_id}")
audio_socket = response['result']['audio_socket']
event_socket = response['result']['event_socket']
open_websockets(stream_id, audio_socket, event_socket, file_path)
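The sample above expects a file of raw PCM data. If the audio is stored as a WAV file instead, the standard-library `wave` module can check the format and strip the header. This is a sketch: `wav_to_pcm` is a hypothetical helper, and the mono check is an assumption, since the channel count is not stated in this document.

```python
import io
import wave


def wav_to_pcm(source) -> bytes:
    """Return raw PCM frames from a WAV file after checking its format.

    `source` may be a file path string or a binary file object.
    """
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != 16_000:
            raise ValueError(f"expected 16000 Hz, got {wav.getframerate()}")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getnchannels() != 1:  # assumption: mono audio
            raise ValueError("expected a single channel")
        return wav.readframes(wav.getnframes())


# Demo: build a short silent WAV in memory and extract its PCM payload.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16_000)
    out.writeframes(b"\x00\x00" * 320)  # 320 samples of silence
buffer.seek(0)
pcm = wav_to_pcm(buffer)
print(len(pcm))  # 640
```

The returned bytes could be wrapped in `io.BytesIO` and passed to a sender loop in place of an opened file.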
List Active Media Streams
Lists media streams that are currently active.
Endpoint:
/v1/stream/list
HTTP Method:
GET
URI Parameters:
None
Returns:
If successful: HTTP 200
Return Payload:
{
  "status": "ok",
  "results": [
    {
      "stream_id": "ID_OF_CREATED_REALTIME_STREAM"
    }
  ]
}
Code Sample:
from hyperia import Hyperia

# Create the Hyperia Object
hyperia = Hyperia()
response = hyperia.stream_list()

# Print the ID of every active stream.
for stream in response['results']:
    print(f"Active stream {stream['stream_id']}")
Check For Stream Existence
Checks whether an active realtime stream exists for a given stream ID.
Endpoint:
/v1/stream/id/<stream_id>/exists
HTTP Method:
GET
URI Parameters:
None
Returns:
If successful: HTTP 200
Return Payload:
{
  "status": "ok",
  "exists": true | false
}
Code Sample:
from hyperia import Hyperia

stream_id = "SOME_STREAM_ID"

# Create the Hyperia Object
hyperia = Hyperia()
response = hyperia.stream_exists(stream_id)

# Prints True if the stream is active, False otherwise.
print(response['exists'])
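The existence check can also be polled to detect when a stream has ended. The sketch below takes the check as a callable so it does not depend on any particular client. `wait_until_closed`, its parameters, and the polling interval are all hypothetical, and no rate limits for this endpoint are documented here.

```python
import time


def wait_until_closed(exists_fn, stream_id, poll_seconds=5.0,
                      timeout_seconds=60.0, sleep=time.sleep,
                      clock=time.monotonic):
    """Poll until the stream no longer exists.

    Returns True once exists_fn reports the stream gone, or False if
    timeout_seconds elapses first. exists_fn must return a dict with
    an "exists" key, like the payload shown above.
    """
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if not exists_fn(stream_id)["exists"]:
            return True
        sleep(poll_seconds)
    return False
```

With the client from the sample above, a call might look like `wait_until_closed(hyperia.stream_exists, stream_id)`.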