Searching Conversations
Overview
Hyperia stores analyzed meetings and uploaded media files in a searchable online index. The search APIs run aggregate search operations against these files and support a variety of filters (keyword, speaker, datetime, topic, intent/action tag, etc.).
Transcript Search
Searches transcripts of conversations in a workspace. Both utterance-level (keyword, speaker) and document-level (document title, date-time, etc.) filters may be used when performing search operations.
Endpoint:
/v1/workspace/id/<workspace_id>/doc/search/transcript
HTTP Method:
PUT
URI Parameters:
Name | Description |
---|---|
workspace_id | ID of workspace to search |
PUT JSON Payload:
Name | Description |
---|---|
utterance_phrase_match | An array of one or more search terms (unigram or multigram phrase queries) |
Document-level filters may also be included in the payload for this operation. The supported filters are listed under Supported Search Filters.
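As a rough illustration of the request described above, the sketch below builds (but does not send) the PUT request using only the Python standard library. The host name in BASE_URL is a placeholder, authentication is omitted because this section does not describe it, and placing document-level filters at the top level of the payload next to utterance_phrase_match is an assumption rather than documented behavior.

```python
import json
import urllib.request

# Assumption: the API host is not given in this section; placeholder only.
BASE_URL = "https://api.example.com"


def build_transcript_search_request(workspace_id, phrases, filters=None):
    """Build (without sending) a PUT request for the transcript search endpoint."""
    payload = {"utterance_phrase_match": list(phrases)}
    # Assumption: document-level filters sit beside the phrase list in the payload.
    payload.update(filters or {})
    url = f"{BASE_URL}/v1/workspace/id/{workspace_id}/doc/search/transcript"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_transcript_search_request("SOME_WORKSPACE_ID", ["quarterly budget"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request would additionally require whatever credentials the Hyperia API expects, which are not covered here.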
Returns:
If successful: HTTP 200
Return Payload:
{
  "status": "ok",
  "results": [
    {
      "about": {
        "description": "DESCRIPTION_OF_CONVERSATION",
        "labels": [
          {
            "label_id": "GUID_OF_LABEL",
            "label_name": "NAME_OF_LABEL"
          }
        ]
      },
      "duration": DURATION_OF_CONVERSATION,
      "speakers": [
        {
          "id": "GUID_OF_SPEAKER",
          "name": "NAME_OF_SPEAKER"
        }
      ],
      "utterances": [
        {
          "startTime": UTTERANCE_START_TIME_IN_SECONDS,
          "endTime": UTTERANCE_END_TIME_IN_SECONDS,
          "speaker": {
            "id": "SPEAKER_GUID",
            "name": "SPEAKER_NAME"
          },
          "transcript": "TRANSCRIPT_TEXT"
        }
      ]
    }
  ]
}
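To make the shape of the return payload concrete, here is a minimal sketch that flattens a response into one row per matched utterance. The helper name flatten_matches and the sample values are made up for illustration; only the field names come from the payload shown above.

```python
def flatten_matches(response):
    """Return one dict per matched utterance, tagged with its conversation description."""
    rows = []
    for result in response.get("results", []):
        description = result.get("about", {}).get("description", "")
        for utt in result.get("utterances", []):
            rows.append({
                "conversation": description,
                "speaker": utt["speaker"]["name"],
                "start": utt["startTime"],
                "end": utt["endTime"],
                "text": utt["transcript"],
            })
    return rows


# Made-up sample data following the documented payload structure.
sample = {
    "status": "ok",
    "results": [
        {
            "about": {"description": "Weekly sync", "labels": []},
            "utterances": [
                {
                    "startTime": 12.5,
                    "endTime": 15.0,
                    "speaker": {"id": "spk-1", "name": "Ana"},
                    "transcript": "Let's review the budget.",
                }
            ],
        }
    ],
}

for row in flatten_matches(sample):
    print(f"[{row['start']}s] {row['speaker']}: {row['text']} ({row['conversation']})")
```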
Code Sample:
from hyperia import Hyperia
import sys

workspace_id = "SOME_WORKSPACE_ID"
search_phrase = "SOME SEARCH PHRASE"

hyperia = Hyperia()

# Optional document-level filter
filters = {}
filters['title_phrase_match'] = "SOME_TITLE_FILTER"

response = hyperia.workspace_search_transcript(workspace_id, search_phrase,
                                               filter_params=filters)
if not response:
    # Exit with a non-zero status if the search failed
    sys.exit(1)

# Print the description and number of matching utterances for each conversation
results = response['results']
for result in results:
    about = result['about']
    print(f"{about['description']}: {len(result['utterances'])} matches")
Supported Search Filters
Hyperia's various search APIs provide the ability to filter on a variety of utterance and document-level fields in your data. Listed below are currently supported search fields:
Name | Description |
---|---|
utterance_phrase_match_list | Array of strings, where each string is a unigram or multigram (phrase) search. |
topic_list | Array of topics to match against. The array is applied as an OR query (a document matches if at least one topic in the array matches). |
date_range | Date constraint object. Contains: "time_zone": "+01:00", "gte": "YYYY-MM-DD", "lte": "YYYY-MM-DD" |
time_range | Time constraint object. Contains: "time_zone": "+01:00", "gte": "HH:MM:SS-0600", "lte": "HH:MM:SS-0600" |
speaker_list | Array of speaker IDs. Results will be restricted to documents containing at least one of the specified speaker IDs. |
tag_id_list | Array of tag IDs. Results will be restricted to documents containing at least one of the specified tags. |
doc_id_list | Array of document IDs. Results will be restricted to documents matching one of the specified IDs. |
These fields may be specified when using the Hyperia Transcript Search and Aggregation APIs.
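As an illustration of how these fields might be combined, the sketch below assembles a filter dictionary from the field names in the table. The helper build_filters is hypothetical, and whether the SDK's filter_params argument accepts this exact dictionary is an assumption, not something this section confirms.

```python
from datetime import date


def build_filters(start, end, time_zone="+00:00", topics=None, speaker_ids=None):
    """Assemble a filter dict using field names from the Supported Search Filters table."""
    filters = {
        "date_range": {
            "time_zone": time_zone,
            "gte": start.isoformat(),  # date.isoformat() yields YYYY-MM-DD
            "lte": end.isoformat(),
        }
    }
    if topics:
        filters["topic_list"] = list(topics)  # applied as an OR query
    if speaker_ids:
        filters["speaker_list"] = list(speaker_ids)
    return filters


filters = build_filters(
    date(2023, 1, 1),
    date(2023, 1, 31),
    time_zone="+01:00",
    topics=["pricing", "renewal"],
)
print(filters)
```

Optional fields are left out of the dictionary when unset, so an unused filter never reaches the API as an empty list.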