Intelligent Voice (IV) provides the following features for ASR models:
IV supports automatic language detection and the transcription of multi-language calls. When transcribing a call, language detection requires the definition of up to 3 languages (plus the language detection model).
IV allows easy customization of ASR models (the lexical model), enabling customers to create new models (the process is called model adaptation; for more information, see https://support.intelligentvoice.com/hc/en-us/articles/360044447274-Model-adaptation). Both partners and customers can adapt models. In a multi-tenant configuration, models adapted for specific tenants are considered private and are not accessible to any other tenant. Models adapted by partners can potentially be shared with multiple customers.
IV license articles limit the number of languages used to transcribe calls for individual users. A basic license only allows the use of a single language per user (different users can have different languages configured); other licenses allow the use of multiple languages per user.
In order to support the new requirements, VFC introduces the concept of ASR model management.
ASR model administration
...
The ASR Models are administered on the Intelligent Voice portal. VFC synchronizes the data between the two systems, and the Reference Environment Administrators can enable the ASR Models to be used by specific tenants.
Tenant Administrators are only allowed to list the ASR Models; they cannot change anything.
Finding and listing ASR models
The synchronized ASR models can be listed under Data / Data Management / ASR Models.
...
The ASR models need to be in sync with the IV system:
...
Manual synchronization would allow quick implementation, but it would be prone to errors and would generate unnecessary support issues
...
Automatic synchronization would be ideal, but requires more work. The Get ASR Model List API returns all the available models in the IV system: https://api-docs.intelligentvoice.com/?version=latest#6b681c45-dc74-4645-94f5-b929669f2c0e
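A minimal sketch of the reconciliation step that automatic synchronization would need, assuming the model list has already been fetched from the Get ASR Model List API and parsed into dictionaries. The function name `plan_sync` and the shape of the local store are hypothetical; the sketch does not call IV and only matches models on the IV `id` field.

```python
from typing import Dict, Iterable, List, Tuple


def plan_sync(
    remote_models: Iterable[dict],
    local_models: Dict[int, dict],
) -> Tuple[List[dict], List[int]]:
    """Compare the IV model list with the locally stored models.

    Returns (models_to_add, local_ids_to_remove).
    Hypothetical helper, not part of IV or VFC.
    """
    # Index the remote list by the IV "id" so lookups are cheap.
    remote_by_id = {m["id"]: m for m in remote_models}
    # Models that exist in IV but are not stored locally yet.
    to_add = [m for mid, m in remote_by_id.items() if mid not in local_models]
    # Models stored locally that no longer exist in IV.
    to_remove = sorted(mid for mid in local_models if mid not in remote_by_id)
    return to_add, to_remove
```

Keeping the comparison separate from the API call and from the database writes makes the step easy to test and lets the same plan be shown to an administrator before it is applied.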
...
Mapping:
"id": 1 → this field is an auto-increment ID in IV; we should use it as the primary means of identifying the model in IV
"createdBy": "IntelligentVoice" → this field can contain any text value and should be used only for information purposes
"languageCode": "en-GB" → this field identifies the language of the model; we need this information to enforce language restrictions per user (the basic license allows one language per user, but there can be multiple ASR models for the same language)
"sampleRate": "8kHz" → not sure what this is, I don't think it is relevant
"lexiconSize": 145000 → informational only, we should store and display it
"description": "general" → free text description of the model
"version": "V3_ASRv5" → TODO: ask IV what this is, and whether it is generated or should follow a pattern
...
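The field mapping above can be sketched as a small record type. This is an illustration only: the class name `AsrModel` and the function `asr_model_from_json` are hypothetical, and the sketch assumes each entry of the model list is a flat JSON object with exactly the keys listed in the mapping. It makes no assumption about the envelope of the API response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrModel:
    """Local copy of one IV ASR model. Hypothetical shape, not an IV type."""

    model_id: int       # IV "id": primary identifier of the model in IV
    created_by: str     # informational only
    language_code: str  # needed for per-user language restrictions
    sample_rate: str    # kept as received
    lexicon_size: int   # informational only
    description: str    # free text
    version: str        # kept as received


def asr_model_from_json(item: dict) -> AsrModel:
    """Build an AsrModel from one entry of the model list."""
    return AsrModel(
        model_id=int(item["id"]),
        created_by=str(item["createdBy"]),
        language_code=str(item["languageCode"]),
        sample_rate=str(item["sampleRate"]),
        lexicon_size=int(item["lexiconSize"]),
        description=str(item["description"]),
        version=str(item["version"]),
    )
```

The record is frozen because IV does not support updating any of these fields, so a synchronized model is replaced as a whole and never edited in place.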
The ASR Models page will list the configured models in the system:
The list will display the following fields:
Created By: IntelligentVoice or VerintPartner1 or CustomerTenant2
Language: en-GB
Lexicon Size: 145000
Description: Trader Voice ASR model
Version: V3_ASRv5
Type: Intelligent Voice (for future use, in case we have models for other integrations)
The list must be tenant aware, and only display the models available in the tenant
Permission required: ASR Model Read
The ASR model edit page allows:
...
Viewing ASR model data (see above), editing is not allowed (IV does not support updating any of the fields)
...
Associating models with VFC tenants
As explained above, the models can be restricted to specific tenants. By default, models should only be available to the reference environment (0000).
Autocomplete field to select one or more VFC tenants, option to configure visibility to all tenants
Permission required: ASR Model Update
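A minimal sketch of the tenant visibility rule described above, assuming each model stores a set of tenant IDs plus a flag for visibility to all tenants. The names `ModelVisibility`, `is_visible` and `models_for_tenant` are hypothetical. Whether the reference environment keeps access once other tenants are selected is not stated in these notes, so the sketch only applies the default.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List

REFERENCE_TENANT = "0000"  # reference environment ID used in these notes


@dataclass(frozen=True)
class ModelVisibility:
    """Tenant association of one ASR model (hypothetical shape)."""

    model_id: int
    # By default a model is only available to the reference environment.
    tenant_ids: FrozenSet[str] = frozenset({REFERENCE_TENANT})
    visible_for_all: bool = False


def is_visible(visibility: ModelVisibility, tenant_id: str) -> bool:
    """Return True if the model may be listed in the given tenant."""
    return visibility.visible_for_all or tenant_id in visibility.tenant_ids


def models_for_tenant(
    items: Iterable[ModelVisibility], tenant_id: str
) -> List[int]:
    """Return the IDs of the models available in the given tenant."""
    return [v.model_id for v in items if is_visible(v, tenant_id)]
```

The same check can back both the tenant-aware model list and the model selection in transcription policies, so the two cannot drift apart.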
...
Deletion
Delete button
Permission required: ASR Model Delete
Confirmation required: "Are you sure you want to permanently delete the "description" ASR Model?"
Check required: if the model is currently configured for any of the policies, "The "description" ASR Model cannot be deleted, because it is currently used by the following transcription policies: ..."
When a model gets deleted, it has to be deleted in IV as well: https://api-docs.intelligentvoice.com/?version=latest#15f500ef-5b34-4ba5-bcf5-d69d73b6fe95
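The deletion check can be sketched as follows, assuming the transcription policies are available as a mapping from policy name to the IDs of the ASR models they use. The function name `deletion_blocker` and that mapping are hypothetical; the message text follows the wording above.

```python
from typing import Collection, Dict, Optional


def deletion_blocker(
    description: str,
    model_id: int,
    policy_models: Dict[str, Collection[int]],
) -> Optional[str]:
    """Return the error message if the model is still in use, else None.

    Hypothetical helper: policy_models maps a transcription policy name
    to the IDs of the ASR models configured for that policy.
    """
    used_by = sorted(
        name for name, ids in policy_models.items() if model_id in ids
    )
    if not used_by:
        return None  # no policy references the model, deletion may proceed
    names = ", ".join(used_by)
    return (
        f'The "{description}" ASR Model cannot be deleted, because it is '
        f"currently used by the following transcription policies: {names}"
    )
```

Only when the function returns `None` would the confirmation dialog be shown and the model be deleted in both systems.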
...
Adaptation
Adapt button
Permission required: ASR Model Create
A new page opens with a form to submit the data for model adaptation. TODO: design the adaptation page
When a model gets adapted, the following IV API must be used: https://api-docs.intelligentvoice.com/?version=latest#40d58fb1-e542-4024-b73f-eaaf726e5ddd
For now, administrators can adapt a model using the Intelligent Voice administration portal (/JumpToWeb/admin/config/jumptoweb/asr-models/adapt). Once the new model becomes available in the IV system, the models can be synchronized to VFC and the desired tenants can be selected. This approach favors system and tenant administrators doing the adaptation, not customers. The adaptation puts extra load on the IV infrastructure, so tenant administrators will want to keep control and not allow customers to start model adaptation on their own.
Deactivation
...
Deactivate button (if the model is active)
...
Permission required: ASR Model Update
...
The list provides the following information about the available ASR models:
Name | Description | Sample Value |
---|---|---|
Full Name | The name of the ASR model defined in the IV system. | IntelligentVoice_en-001_8kHz_94000_general_V5.1_ASRv6 |
Data Processor | The name of the data processor defined in the Verba system which was used to synchronize the ASR model. | - |
Model ID | The ID of the ASR model in the IV system. | 1 |
Created By | The name of the creator of the ASR model as defined in the IV system. | IntelligentVoice |
Language Code | The language code associated with the ASR model as defined in the IV system. | en-001 |
Lexicon Size | The size of the lexicon in the ASR model. | 94000 |
Description | The description for the ASR model as defined in the IV system. | general |
Version | The version of the ASR model as defined in the IV system. | V5.1_ASRv6 |
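The sample row above suggests that the full name is the other fields joined with underscores, with the sample rate (8kHz in the sample) between the language code and the lexicon size. This is an observation from a single sample value and not a documented IV rule, so the sketch below is an assumption; the function name `compose_full_name` is hypothetical.

```python
def compose_full_name(
    created_by: str,
    language_code: str,
    sample_rate: str,
    lexicon_size: int,
    description: str,
    version: str,
) -> str:
    """Join the model fields in the order seen in the sample full name.

    Assumption: the order and the "_" separator are inferred from one
    sample value only.
    """
    parts = [
        created_by,
        language_code,
        sample_rate,
        str(lexicon_size),
        description,
        version,
    ]
    return "_".join(parts)
```

Going the other way (splitting a full name into fields) is not reliable, because the version in the sample itself contains an underscore.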
Associating ASR models with environments
By default, in a multi-tenant configuration, the ASR models are only available in the reference environment. In a single-tenant system, the ASR models are automatically available. To associate an ASR model with one or more tenants, follow the steps below:
Step 1 - Navigate to Data / Data Management / ASR Models and select the ASR model.
Step 2 - Under Choose Environments, select the environments/tenants to be associated with the ASR model and click the >> button. If you want to make the ASR model available for all existing and future environments, select the Visible for All Tenants checkbox.
Step 3 - Press the Save button to save the new configuration.
Activating and deactivating ASR models
If you decide to not use an ASR model temporarily, you can deactivate the ASR model. When deactivating an ASR model, the system checks if the ASR model is currently configured for any of the
...
When a model gets deactivated, it has to be deactivated in IV as well: https://api-docs.intelligentvoice.com/?version=latest#3b534770-773e-4a34-90cb-c83bd353dff8
...
Activation
Activate button (if the model is deactivated)
Permission required: ASR Model Update
When a model gets activated, it has to be activated in IV as well: https://api-docs.intelligentvoice.com/?version=latest#ed7be295-1808-4126-bad3-ce9ee9c6c3e7
ASR model selection in transcription policies
Rename Language to ASR Model (it appears to be more accurate)
If the processor is IV, allow specifying up to 4 models. IV recommends defining up to 3 languages plus the language detection model, but we don't differentiate between the two, so we should enforce a limit of 4 for now
The list should contain all models configured in the system available in the given tenant
TODO: how to align the model selection with the license restrictions
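A minimal sketch of the validation for the model selection in a transcription policy, assuming the selected models and the models available in the tenant are both known by ID. The function name `validate_policy_models` and the message texts are hypothetical. License restrictions are not covered, since that alignment is still open.

```python
from typing import List, Sequence, Set

# 3 languages plus the language detection model; the two kinds are not
# differentiated for now, so a flat limit of 4 is enforced.
MAX_MODELS_PER_POLICY = 4


def validate_policy_models(
    selected: Sequence[int],
    available_in_tenant: Set[int],
) -> List[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors: List[str] = []
    if len(selected) > MAX_MODELS_PER_POLICY:
        errors.append(
            f"At most {MAX_MODELS_PER_POLICY} ASR models can be selected."
        )
    # Only models available in the given tenant may be used.
    unavailable = [m for m in selected if m not in available_in_tenant]
    if unavailable:
        errors.append(
            f"ASR models not available in this tenant: {unavailable}"
        )
    return errors
```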
License enforcement
...
The key difference from other speech integrations is the limit on the number of models a user can have. The basic license only allows a single ASR model to be used for transcribing the calls of a specific user.
...
A system can have a mix of licenses, so certain users will be limited to a single model, while others won't be limited.
...
The current enforcement logic, which uses a permission to assign a license to a user, is suitable for IV as well, but it has to be extended.
...
To enforce a single model, the model has to be selected when the permission is added to the role.
...
Only models available in the tenant can be listed for selection.
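One possible shape for the extended enforcement, as a sketch under stated assumptions: the permission on the role carries a flag for the license type and, for the basic license, the single model selected when the permission was added. The names `TranscriptionPermission` and `allowed_models` are hypothetical and do not describe the existing permission model.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class TranscriptionPermission:
    """Permission on a role that also carries the license type (hypothetical)."""

    multi_model: bool                        # True if the license has no single-model limit
    licensed_model_id: Optional[int] = None  # model chosen when the permission was added


def allowed_models(
    permission: TranscriptionPermission,
    policy_models: Sequence[int],
) -> List[int]:
    """Return the policy models that may be used for this user's calls."""
    if permission.multi_model:
        return list(policy_models)
    # Basic license: only the single model tied to the permission is allowed.
    return [m for m in policy_models if m == permission.licensed_model_id]
```

An empty result means that the policy and the user's license do not overlap; how that case should be handled is not decided in these notes.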
...
transcription policies, and it does not allow deactivation while the model is configured for at least one policy. Deactivated ASR models are also automatically deactivated in the IV system. You can activate deactivated ASR models again. In order to activate or deactivate an ASR model, follow the steps below:
Step 1 - Navigate to Data / Data Management / ASR Models and select the ASR model.
Step 2 - Press the Deactivate or Activate button to deactivate or activate the ASR model.
Deleting ASR models
If you decide to not use an ASR model anymore (e.g. you have a newly adapted model to replace the old one), you can delete the ASR model permanently. When deleting an ASR model, the system checks if the ASR model is currently configured for any of the transcription policies, and it does not allow deletion while the model is configured for at least one policy. Deleted ASR models are also automatically deleted in the IV system. In order to delete an ASR model, follow the steps below:
Step 1 - Navigate to Data / Data Management / ASR Models and select the ASR model.
Step 2 - Press the Delete button to permanently delete the ASR model.