S3 Ingress "How-to"


Step 1: AWS Bucket Setup


  1. In your AWS account, navigate to the CloudFormation console.
  2. Click Create Stack > With new resources (Standard).
  3. In the Amazon S3 URL box, enter the CloudFormation template URL.
  4. In the Stack Name field, type a name for your CloudFormation stack, e.g. "voicebase-s3-ingress".
  5. In the CompanyName field, enter a unique string to represent your company (e.g. "mycompanyinc"). This will be used to name your resources.
  6. In the ExternalID field, enter your External ID. This is provided to you by VoiceBase and is in the format xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
  7. Choose the region your VoiceBase account is hosted in (US or EU). Then click Next.
  8. The next screen of stack options is optional. Click Next.
  9. Finally, on the review screen, scroll down and check the box "I acknowledge that AWS CloudFormation might create IAM resources with custom names",
    then click Create Stack.
  10. Once the CloudFormation stack has been created, click the Outputs tab and copy the IAMRoleArn (in the format arn:aws:iam::[account number]:role/vb-[CompanyName]-s3-ingress-role) and the S3Bucket (in the format vb-[CompanyName]-s3-bucket). You will need to enter these values in the VoiceBase Portal.
  11. You will find the S3 bucket created in your list of S3 buckets with the name vb-[CompanyName]-s3-bucket.
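
As a quick sanity check on the values copied from the Outputs tab, the naming formats from step 10 can be reproduced in a few lines of Python. This is a minimal sketch: the helper name expected_stack_outputs and the sample account number are hypothetical, and the patterns come only from the formats quoted above. It does not call AWS.

```python
def expected_stack_outputs(company_name: str, account_number: str) -> dict:
    """Build the names this guide says the stack will output.

    Hypothetical helper: it mirrors the naming formats quoted in step 10
    and makes no AWS calls.
    """
    return {
        # IAM ARNs leave the region field empty, hence the double colon.
        "IAMRoleArn": (
            f"arn:aws:iam::{account_number}:role/"
            f"vb-{company_name}-s3-ingress-role"
        ),
        "S3Bucket": f"vb-{company_name}-s3-bucket",
    }


if __name__ == "__main__":
    # "123456789012" is a placeholder account number, not a real one.
    outputs = expected_stack_outputs("mycompanyinc", "123456789012")
    for key, value in outputs.items():
        print(f"{key}: {value}")
```

If the values shown in the Outputs tab differ from these patterns, trust the Outputs tab; it reflects what the stack actually created.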

Step 2: VoiceBase Portal Setup


  1. In your VoiceBase account, go to Manage Data > Connections > Add Connection to add your S3 connection.
  2. Choose Custom and Audio from the dropdown menus.
  3. Fill in the fields as follows, using the values from the AWS CloudFormation > Stacks > Outputs tab:
    • S3 Role = IAMRoleArn
    • S3 URL = S3Bucket URL
    • SQS URL = SQS Notification Queue
    • SQS Response URL = SQS Response Queue
    • DLQ URL = SQS Dead Letter Queue
    • Folder = folder name from the S3 bucket
  4. Configure specific features for your upload in the Configuration tab if needed. This may include creating a Custom Vocabulary list for a more accurate transcription:
    • Go to Manage Data > Custom Vocab List > Add List
    • Create your list of out-of-dictionary terms (proper nouns, acronyms, etc.) that you would like to add to your account’s vocabulary.
    • You may add "sounds like" terms and a weight from 0 to 5, where 0 is the default. We recommend starting with 2.
  5. Check Amazon SQS > Queues to see responses sent from VoiceBase.
  6. Go to the Workbench in your account and query to see uploaded results from your AWS bucket.
    Example: SELECT * FROM media WHERE dateCreated > "2021-01-01T00:00:00.000+0000"
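
The timestamp in the Workbench example follows a fixed layout, so the query can be generated rather than typed by hand. The sketch below is illustrative only: the helper name media_created_after is hypothetical, and the query shape and timestamp layout are copied from the example in step 6 rather than from the VBQL documentation.

```python
from datetime import datetime, timezone


def media_created_after(moment: datetime) -> str:
    """Return a VBQL query for media created after `moment`.

    Hypothetical helper: the query text and timestamp layout mirror the
    Workbench example in step 6. Pass a timezone-aware datetime; naive
    values are interpreted as local time by astimezone().
    """
    utc = moment.astimezone(timezone.utc)
    # Layout used in the example: 2021-01-01T00:00:00.000+0000
    millis = utc.microsecond // 1000
    stamp = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{millis:03d}+0000"
    return f'SELECT * FROM media WHERE dateCreated > "{stamp}"'


if __name__ == "__main__":
    print(media_created_after(datetime(2021, 1, 1, tzinfo=timezone.utc)))
```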


Step 3: Metadata Upload to S3 Bucket


The VoiceBase ETL schema includes two main fields for metadata: "extended" and "callDetails". Other fields available are "externalId", "title", and "description".
The "callDetails" fields are indexed by default, along with "externalId", "title", and "description".

The callDetails fields include many common fields typically used by customers. The "extended" field is for custom fields that may be unique for a particular use case. For example, it may include information such as location, agent ID, CSAT score, campaign name, call drivers, call dispositions, or call types. Any custom "extended" fields must be indexed by the user before uploading so they are searchable.

Example syntax for AWS upload: {'Metadata': {'title': 'REO.wav', 'callDetails.agent.externalId': 'agentkqrw_7575', 'extended.location': '12345'}}

Example in JSON format: { "title": "REO.wav", "callDetails": { "agent": { "externalId": "agentkqrw_7575" } }, "extended": { "location": "12345" } }


As seen in the above examples, AWS requires nested fields in metadata to be delimited by dots. When uploaded, AWS lowercases all field names. VoiceBase converts the field names back to its required case-sensitive syntax as part of the Ingress process from S3.
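
The conversion from nested metadata to dot-delimited keys can be sketched as a small recursive function. This is an illustration under stated assumptions, not VoiceBase tooling: the helper name flatten_metadata is hypothetical, and the sample values are the ones used in this section, with "extended" treated as a top-level field as described above.

```python
def flatten_metadata(metadata: dict, prefix: str = "") -> dict:
    """Flatten nested metadata into dot-delimited keys.

    Hypothetical helper. Values are converted to strings because S3
    user-defined metadata values are strings.
    """
    flat = {}
    for key, value in metadata.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the dotted path along.
            flat.update(flatten_metadata(value, dotted))
        else:
            flat[dotted] = str(value)
    return flat


if __name__ == "__main__":
    nested = {
        "title": "REO.wav",
        "callDetails": {"agent": {"externalId": "agentkqrw_7575"}},
        "extended": {"location": "12345"},
    }
    print({"Metadata": flatten_metadata(nested)})
```

If you upload with boto3, the flattened dictionary is the kind of value that upload_file accepts as ExtraArgs={"Metadata": ...}; other upload tools have their own way of attaching user-defined metadata.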


Glossary of Terms and Expanded Notes


CloudFormation Template Permissions
Permissions allow the VoiceBase platform to get objects in the bucket (GetObject) and list objects in the bucket (ListBucket).
Permissions also allow actions in SQS (Simple Queue Service), specifically DeleteMessage, DeleteMessageBatch, GetQueueAttributes, GetQueueUrl, ReceiveMessage, SendMessage, and SendMessageBatch.

IAMRoleArn
Amazon Resource Names (ARNs) uniquely identify AWS resources. AWS requires an ARN when you need to specify a resource unambiguously across all of AWS, such as in Identity and Access Management (IAM) policies and the roles attached to those policies. The IAM role created through the CloudFormation template is used by VoiceBase to access the S3 bucket and SQS queues that the stack generates.

SQS Queues
The Amazon Simple Queue Service allows you to send, store, and receive messages between software components. To view the SQS Notification Queue, Response Queue, or Dead Letter Queue, go to Amazon SQS > Queues.

SQS Notification Queue: The URL of the SQS Queue containing source file notifications.
SQS Response Queue: The URL of the SQS Queue where responses will be delivered.
SQS Dead Letter Queue: The URL of the SQS Queue for messages that can't be delivered due to client errors or server errors. These messages are held in the dead-letter queue for further analysis or reprocessing.
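
Besides the SQS console, the Response Queue or Dead Letter Queue can be read from a script. The sketch below assumes a client that behaves like boto3.client("sqs") and uses its receive_message and delete_message calls. The function name read_queue_bodies is hypothetical, and the message bodies are returned as raw strings because their format is not described here.

```python
def read_queue_bodies(sqs_client, queue_url: str, delete: bool = False) -> list:
    """Fetch one batch of message bodies from an SQS queue.

    `sqs_client` is assumed to behave like boto3.client("sqs").
    Messages are left on the queue unless `delete` is True.
    """
    reply = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=10,      # long polling, to avoid empty replies
    )
    bodies = []
    # The "Messages" key is absent when the queue has nothing to deliver.
    for message in reply.get("Messages", []):
        bodies.append(message["Body"])
        if delete:
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
    return bodies
```

A typical call would pass boto3.client("sqs") and the SQS Response Queue URL from the stack outputs. Leaving delete at False is the safer default while inspecting the Dead Letter Queue, since deleted messages cannot be reprocessed.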

VoiceBase Analytics Workbench
The VoiceBase Analytics Workbench is accessed through your VoiceBase account. The Workbench allows users to filter data with the VoiceBase Query Language (VBQL). A syntax guide with examples is available in your account.