Shape the future of IBM watsonx Orchestrate


Status Not under consideration
Created by Guest
Created on Jan 18, 2022

Watson Speech to Text should return timestamps accurate to milliseconds for transcription

Real-life scenario:

Researchers from Brandeis University, Boston University, Harvard, Boston College, and Northeastern University are investigating cognitive aging and biomarkers of dementia. Until now they have hand-scored cognitive interviews, which is arduous. To automate some of that manual work, they run the interview audio through the Watson Speech to Text service, which returns transcriptions with timestamps for when each word starts and ends. The service returns these timestamps with 2 decimal places, which is not precise enough to compare against the prior work (which uses 3 decimal places, i.e. millisecond precision).


Problem statement:

The researchers cannot reliably compare the transcription timestamps generated by the Speech to Text service with the hand-scoring done for interviews before they started using Watson. This precision mismatch prevents them from using Watson effectively.
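To illustrate the mismatch, here is a minimal Python sketch. It is not taken from the researchers' pipeline: the word timings are made up, the [word, start, end] layout is meant to mirror the timestamp triples in a Speech to Text response, and it assumes the service rounds to the nearest hundredth of a second (it may truncate instead).

```python
# Hypothetical data: hand-scored timings in whole milliseconds, and the same
# words as they might come back from the service with 2 decimal places.
hand_scored_ms = [("hello", 1234, 1587), ("world", 1642, 2019)]
service_output = [["hello", 1.23, 1.59], ["world", 1.64, 2.02]]


def to_ms(seconds):
    """Convert a timestamp in seconds to whole milliseconds."""
    return round(seconds * 1000)


def offsets_ms(hand, service):
    """Return (word, start_diff, end_diff) in ms, service minus hand-scored."""
    diffs = []
    for (word, h_start, h_end), (s_word, s_start, s_end) in zip(hand, service):
        if word != s_word:
            raise ValueError(f"word mismatch: {word!r} vs {s_word!r}")
        diffs.append((word, to_ms(s_start) - h_start, to_ms(s_end) - h_end))
    return diffs


for word, start_diff, end_diff in offsets_ms(hand_scored_ms, service_output):
    print(f"{word}: start off by {start_diff} ms, end off by {end_diff} ms")
```

Under the rounding assumption, each boundary can be off by up to 5 ms purely because of the 2-decimal output, before any real disagreement between the service and the hand-scorer is considered.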


Current workaround:

No workaround is possible, because the mismatch lies in the precision level returned by the Watson Speech to Text service itself.


Proposed solution:

Returning timestamps with 3 decimal places (millisecond precision) would enable the researchers to automate their longitudinal research with the Watson Speech to Text service.


Benefits/Value:

If the Watson Speech to Text service returned timestamps with millisecond precision, customers around the world would get more precise word timings and could use the service more confidently in research. With that precision, researchers would be more likely to use the service and cite it in publications.


Users impacted:

Every user of Watson Speech to Text around the world would get more precise data points for transcription, without their current downstream applications breaking due to this change.

Idea priority Urgent
  • Admin
    Marco Noel
    May 9, 2022

    I'm not sure I understand the business value of adding an extra decimal to a timestamp. Please expand.

    Once we have more details, we will review but at this time, we are dealing with higher priorities.

  • Guest
    Mar 3, 2022
    Could I get an update on this, please? This is a high-priority request that will determine whether the research keeps using IBM Watson or switches to a different speech-to-text service.