Auto Transcribe Video to Text Free of Charge: 7 methods tested

We’re an affiliate: We hope you love our articles and the products we recommend! Just so you know, when you buy through links on this site, we may earn an affiliate commission. Thank you if you use our links, we really appreciate it!

There are plenty of occasions when having a transcript of your video is useful. You could have several contributors and want to edit the content on paper. Perhaps you’re using video interviews in your recruitment process. Or maybe you want to repurpose video content. Before moving to a paid service, which can easily eat into small budgets, there are free alternatives you can try.

If you want to auto transcribe video to text free of charge, there are seven methods available to you. In our tests, accuracy ranged from 97.6% down to 82.4%. The services were YouTube auto transcribe, iPhone Speech Recognition, Speechmatics demo, Google Docs Voice Typing, Microsoft Word Dictate, Google Cloud Speech-to-Text demo, and IBM Watson Speech to Text.

Let’s look at the order, individual accuracies, and our experience. Hopefully, this will help you decide whether you want to use these free auto-transcribe video-to-text services.

And the Auto-Transcribe Recorded Speech to Text winners are…

YouTube auto transcribe service (97.6% accuracy)
iPhone Speech Recognition (93.9% accuracy)
Speechmatics demo (93.9% accuracy)
Google Docs Voice Typing (92.7% accuracy)
Microsoft Word Dictate (92.1% accuracy)
Google Cloud Speech-to-Text demo (89.7% accuracy)
IBM Watson Speech to Text service (82.4% accuracy)

If all you want is the accuracy results of my test, you can see them above. But if you want to know how I tested those seven free audio-to-text converters, what I really thought of each service, and how to optimize your audio so you can use these auto-transcribe services, keep reading.

Methods used to transcribe audio from video to text

Recently I recorded interviews with 36 teachers. Although they were all English speakers, they were Scottish, so their accents represent more of a challenge for speech recognition software.

Since I knew one of the apps would only load files under 60 seconds long, I chose an interview clip that was 55.5 seconds long as my test file. The audio was a 48kHz, 16-bit, stereo .wav file.
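
If you want to check a clip against a service’s length limit before uploading, a short script can read the details from the .wav header. This is only a minimal sketch using Python’s standard wave module. The function names describe_wav and fits_limit are made up for illustration, and it only handles uncompressed PCM .wav files.

```python
import wave


def describe_wav(path):
    """Return the basic properties of an uncompressed PCM .wav file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def fits_limit(path, limit_s=60.0):
    """True if the clip is shorter than the given limit in seconds."""
    return describe_wav(path)["duration_s"] < limit_s
```

For a 55.5-second clip, fits_limit with the default 60-second limit should return True. Passing limit_s=120 would check the same file against a 2-minute cap instead.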

Even though I don’t enjoy hand transcribing videos, I did that to make sure the transcription was correct. It turned out to be 165 words, so a word rate of almost 3 words per second.

Although I tested seven different options, each audio-to-text converter involved one of two ways of getting the data, that’s the recorded speech, into the speech-to-text program.

The first involved playing the audio recording into a microphone connected to a computer, or in one case, my iPhone.

The second involved uploading the audio file to an online platform or cloud service.

Options to transcribe a video to text

YouTube auto transcribe service (97.6% accuracy)
iPhone Speech Recognition (93.9% accuracy)
Speechmatics demo (93.9% accuracy)
Google Docs Voice Typing (92.7% accuracy)
Microsoft Word Dictate (92.1% accuracy)
Google Cloud Speech-to-Text demo (89.7% accuracy)
IBM Watson Speech to Text service (82.4% accuracy)

YouTube auto transcribe service

YouTube provides an excellent free method to transcribe from video to text. It’s their auto-generated closed captions or subtitles.

In my video-to-text test, YouTube produced an automatic transcription with an accuracy of 97.6%. That seems to put YouTube way ahead of the pack if you are looking for a free audio-to-text converter.

The quality of the transcription is good but there are some cons.

Firstly, YouTube auto transcription is not quick. In fact, it was the slowest in the test group.

My test file was only a minute long, but it took several minutes before the closed captions became available.

Some colleagues have told me their long videos can take 12 hours to process, but that hasn’t been my experience.

But what is a pain is that there’s no notification from YouTube to tell you the auto-generated closed captions are ready. You’ll just have to keep checking.

The second con is that when you download the closed captions they come with timestamps. After all, they are closed captions so it makes sense that the timing data should be included. You might think that this means you will have to go through the text deleting the timing information line by line. Don’t worry, you can get rid of the timestamps.

Look at the top right of the window and click on the three vertical dots. A menu opens with a single item, “Toggle Timestamps”. Click on this and all the timestamps will disappear, and you can copy the transcript without all the annoying timestamps.

However, if you want the extra accuracy and don’t mind waiting for your transcript this is how to transcribe videos on YouTube.

Your video clip must be in a video format YouTube supports such as MPEG4, MP4, AVI, WMV, MPEGPS, 3GPP, WebM, DNxHR, ProRes, CineForm or HEVC (h265).

You’ll also need a YouTube account. Log in and click the “Create a video or post” icon that looks like a movie camera with a plus sign in the middle.

Follow the instructions but make sure you make the video Unlisted or Private. You don’t want to notify your subscribers each time you upload a video clip to be transcribed.

Once your video is uploaded, you’ll need to wait a while until the YouTube auto transcribe service has completed the transcription.

You can now download the closed captions under the Advanced tab of the Video Details on YouTube. It will be a .sbv closed caption file that can be opened in a text editor such as Word.

Unless you toggle the timestamps in YouTube’s transcript window, the document will contain the time-stamped transcript, so if you don’t want the time information, you’ll have to manually delete it. Once that has been deleted, you’ll need to punctuate the text since the YouTube auto transcribe service does not attempt to punctuate the text.
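
If you have already downloaded the .sbv file, you don’t have to delete the timing lines by hand. The sketch below assumes the usual .sbv layout, where each caption starts with a line of start and end times such as 0:00:00.599,0:00:04.160, followed by the caption text and a blank line. The function name sbv_to_text and the sample captions are hypothetical.

```python
import re

# A timing line looks like "0:00:00.599,0:00:04.160" (start,end).
SBV_TIMING = re.compile(r"^\d+:\d{2}:\d{2}\.\d{3},\d+:\d{2}:\d{2}\.\d{3}$")


def sbv_to_text(sbv: str) -> str:
    """Drop timing lines and blank lines, and join the captions into one string."""
    kept = []
    for line in sbv.splitlines():
        line = line.strip()
        if not line or SBV_TIMING.match(line):
            continue
        kept.append(line)
    return " ".join(kept)


# Made-up sample captions, for illustration only.
sample = (
    "0:00:00.599,0:00:04.160\n"
    "professional dialogue is\n"
    "critically important\n"
    "\n"
    "0:00:04.160,0:00:06.770\n"
    "within departments or faculties\n"
)
print(sbv_to_text(sample))
# professional dialogue is critically important within departments or faculties
```

The result is still unpunctuated, so you would need to add punctuation afterwards.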

iPhone Speech Recognition

This one really surprised me.

I knew my iPhone usually did a decent job of correctly converting my speech to text when I spoke slowly and as clearly as possible.

However, I had no idea about how it would handle natural speech at 3 words a second.

As it turned out, the answer was, very well.

I had my iPhone’s microphone within a few inches of the loudspeaker and playback was at a reasonable level.

The Notes app was used to capture the resulting text.

Unlike the YouTube auto transcribe service, Apple’s Speech Recognition punctuated the text.

It also recognized that “Professional dialogues critically important…” meant “Professional dialogue’s critically important…”.

Apple’s Speech Recognition changed it to the more formal “Professional dialogue is critically important…”.

In context, I considered this to be a plus mark for Apple’s AI tech in the software.

But the AI wasn’t always correct.

Since I could see the speech recognition doing its stuff on the screen, I noticed the app initially wrote down the correct text, only to “correct it” to something that was slightly wrong.

However, comparing my master hand transcribed text file with the one produced by Apple’s Speech Recognition, I counted only 10 errors.

So, this seemingly low-tech method of playing audio into my iPhone returned an impressive 93.9% accuracy.
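
One way to reproduce these figures, assuming accuracy means the share of the 165 reference words that were not counted as errors and word rate means words divided by clip length, is sketched below. The function names are illustrative only.

```python
def accuracy_percent(total_words: int, errors: int) -> float:
    """Percentage of reference words that were not counted as errors."""
    return round(100 * (total_words - errors) / total_words, 1)


def words_per_second(total_words: int, duration_s: float) -> float:
    """Average speaking rate over the clip."""
    return round(total_words / duration_s, 2)


print(accuracy_percent(165, 10))    # 93.9
print(words_per_second(165, 55.5))  # 2.97
```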

I really wasn’t expecting such a good result.

Speechmatics cloud subscription service

Although Speechmatics came joint second in terms of accuracy, I’ve placed it third in my ranking because of the limitations of the free online service.

You can upload either a video or audio file that’s up to 2 minutes long and a maximum of 50MB in size. Although the upload process may allow you to choose a file that’s over 2 minutes long but still under 50MB, you’ll find the file is truncated to 2 minutes.

Click here to go to the Speechmatics free transcription demo page and click on “Transcribe a media file”. Then follow the on-page instructions to choose a media file to upload. You will also need to enter an email address where the speech to text transcription will be sent as a .txt text file in under five minutes.

I like Speechmatics very much, except for the limitations imposed on the auto transcribe video to text free demo service. Considering the power of Speechmatics it’s understandable that access to the unrestricted platform is only available as a paid-for service.

That’s not quite true. If you register for the Speechmatics cloud subscription service you will get 60 minutes worth of credits free of charge. So, you can technically use the full service free of charge, albeit for 60 minutes of audio.

If you do require more minutes, you can then buy additional credits from just £10. The pricing is £0.06 per minute when you buy £10 worth of credits.

So, you can use the restricted service if your files are less than 2 minutes long. Or register and get 60 free minutes.
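
To get a feel for what a batch of videos might cost, here is a rough sketch based only on the figures quoted above: 60 free minutes on registration and £0.06 per minute after that. It assumes every started minute is billed as a whole minute and ignores the fact that credits are bought in blocks from £10. Both are assumptions rather than Speechmatics’ published billing rules, and the function name is hypothetical.

```python
import math

FREE_MINUTES = 60      # free credit on registration, as quoted above
PENCE_PER_MINUTE = 6   # £0.06 per minute, as quoted above


def estimated_cost_pence(total_minutes: float) -> int:
    """Estimated cost in pence once the free minutes are used up."""
    billable = max(0, math.ceil(total_minutes) - FREE_MINUTES)
    return billable * PENCE_PER_MINUTE


print(estimated_cost_pence(45))   # 0
print(estimated_cost_pence(200))  # 840, i.e. £8.40
```

Working in whole pence avoids the rounding surprises you can get when multiplying by 0.06 directly.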

But what is Speechmatics like to use?

Like most speech recognition software, it’s AI-driven. You can use the API or the web app, which is simple to use, and the transcripts are great.

When transcribing my test file Speechmatics only made ten errors, which represents almost 94% accuracy. This is close to the claimed, “up to 95% accuracy”.

The web app also correctly punctuated the output. The output works well enough for editing dialogue “on paper” but would need cleaning up if it were to be used in front of the public, although not by much.

Google Docs Voice Typing

I still think there’s something almost magical to see words appear on my computer monitor a moment or two after hearing them spoken.

Google Docs Voice Typing was one of the best performers in my tests and quite different from the results from Google Cloud Speech to Text.

I would imagine that Google’s Voice Typing and the Cloud AI services would both use the same speech recognition technology, but the results were miles apart.

Even though Voice Typing required the speech to be entered through a microphone, it was more accurate than uploading the audio file to Google Cloud service.

The only negative point is that Voice Typing is a real-time process. The audio must be played into the microphone in real time. The other issue is that you probably will need two computers, one to play the video file and the second to run Google Docs Voice Typing.

If you aren’t a Google Docs user and you want to try this method, you’ll need a Google account. That, of course, is free, and so is access to Google Docs.

Once you’ve logged in to Google, go to Google Apps in Chrome and click on Docs.

Once you’ve created a blank document go to Tools in the menu bar and select Voice Typing, or use the keyboard shortcut Ctrl+Shift+S. Once voice typing has been activated you can start playing your audio file into your microphone.

As if by magic the speech appears as words on your screen.

Although Google Docs Voice Typing only made 12 mistakes out of 165 words, an accuracy of 92.7%, there is one drawback. The software expects you to add punctuation and formatting through voice commands. So, your video transcript will effectively be one long sentence.

You will have to punctuate and format the transcript once the speech recognition has been completed.

Full instructions on using Google Docs Voice Typing are available on Google’s support pages.

Microsoft Word Dictate

MS Word Dictate is not very different from Google Docs Voice Typing.

Both use the same method of playing the audio into the app by use of a microphone placed in front of a loudspeaker. Like the Google offering, MS Word Dictate is a real-time method. You will have to play your entire audio file into Word.

So, when you want to auto transcribe video to text free of charge you need to decide whether you are willing to put up with the inconvenience of a real-time service. Or whether you just want to upload your file and get on with another task until you get the transcribed text.

Like its Google cousin, MS Word Dictate will not try to add any punctuation to the text. Again, that’s not surprising since the app expects to hear voice commands for punctuation.

In terms of accuracy, MS Word Dictate and Google Docs Voice Typing were close. Dictate came in with an accuracy of 92.1% while Voice Typing was only slightly ahead at 92.7%.

The accuracy scores are so close that I would be happy calling them equal. I wouldn’t be surprised if the results were the other way round when using a different test file.

If I were to be fussy with MS Word Dictate it would be the words that it did get wrong.

“Importantly” became “Important Li”. I’ve read that Li is the second most common surname in China but there was nothing in my test file that would suggest the content was about a Chinese person of that name.

Another example is changing, “Within departments or faculties…” to “Within department store faculties…”. If there’s some AI behind Word Dictate it should have realized the whole file was about education and nothing to do with retail shopping.
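
Counting errors by eye gets tedious on longer clips. A common way to automate it is a word-level edit distance between the hand-typed transcript and the machine transcript, where each substituted, missing, or extra word counts as one error. The sketch below is not necessarily how the error counts in this test were made, and the function name word_errors is illustrative.

```python
import string


def _words(text: str) -> list[str]:
    """Lowercase the text and strip punctuation from the edges of each word."""
    cleaned = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in cleaned if w]


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level Levenshtein distance between two transcripts."""
    ref, hyp = _words(reference), _words(hypothesis)
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


print(word_errors("Within departments or faculties",
                  "Within department store faculties"))  # 2
print(word_errors("Importantly", "Important Li"))        # 2
```

Under this way of counting, each of the two slips above costs two word errors, which shows how a single misheard word can weigh heavily in a 165-word clip.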

But I’m deliberately looking for something negative to say about this free audio to text converter. On the whole MS Word Dictate performed well in converting my test file to text.

Google Cloud Speech-to-Text demo

The Google Cloud Speech to Text service web page includes a free demo that’s based on a sample application/UI that was built using the Cloud Speech-to-Text API.

You can upload audio or video files that are up to 1 minute long. Once the speech recognition AI has done its thing, the text appears in the window and you can copy and paste it into your word processor.

The webpage invites you to sign up for the full service for free. The actual pricing details showed that each month you get the first 60 minutes of audio conversion for free.

But sticking with the free 1-minute demo, once your file has been uploaded and transcribed you can choose between a few different speech recognition models.

Although I uploaded an audio .wav file you can also upload the video file containing the audio. I would advise you to do so.

With an audio file, you can only switch between the Default and the Command/Search speech models. I found the Command/Search model to be slightly more accurate.

If you upload a video file you can try two further speech recognition models, Phone Call and Video.

The Phone Call model shows two levels of speech-to-text conversion, Basic and Enhanced, side by side. With my test file, the Basic version seemed more accurate than the Enhanced version.

With my test file, the different models produced small differences in the resultant text. But I was curious about how the AI would cope with a slightly different British accent.

My test file was part of an interview with a well-spoken Scottish CEO from Edinburgh. But I also had a video clip of a businessman from North West England. These two parts of the UK are only about 250 miles apart, but the accents are quite different.

Boy, what a difference that caused in Google Cloud speech recognition!

Remember, the different speech-to-text models produced minor differences when the uploaded file was a well-spoken Scottish accent. With the Lancashire accent, the different speech-to-text models were all over the place.

The Video model gave the best result. Not perfect but usable with some corrections. Here’s the result.

“So here we are at Expo Northwest at Burnley football club, and we’re here for hashtag events organized by Andrew Charlton as usual great events always gets lots of great people here. Lots of great exhibitors and absolutely fabulous speakers are say that we some degree of modesty because I’m one of them, so if you haven’t been to one of Andrews events before you should do anything around the burn area today and glad to see you.”

The Basic Phone Call model produced the following text.

“Expo Northwest applicable Coke and we’re here for her spring event openings for Andrew Charlton as usual great event scheduled to drive safely. Love you guys exhibitors actually fabulous Vegas aside that we some degree Mother’s day goes on one of them. So if you haven’t been to one of our events before you should do here and we are in today exactly.”

It’s complete nonsense. The Enhanced Phone Call model wasn’t perfect either. It produced this.

“So he will act expand Northwest at Burnley football club and we’re here for hashtag events organized by Angela Charlton as usual great events always gets lots of lay people here. Lots of great exhibit just an absolutely fabulous Vegas. I say that with some degree of modesty cuz I’m one of them. So if you haven’t been to one of Andrew’s against before you should do anything around the Bernie earlier today if you’d like to see them and”

From that point on the speech to text took a nose-dive. The Command/Search model came up with this.

“Northwest activity football club and we hit 4,000 as usual right events always gets. What should I pay play Lottery by Jade Jupiter actually public speakers. I say that with some degree Mother’s day goes on one of them. So if you haven’t been to one of Andrew’s it ends before you should do any fear and the billionaire in today and you got to say”

I have no idea how “play Lottery by Jade Jupiter” appeared, but “Mother’s Day” is obviously meant to be “modesty”.

The Default model produced the following result.

“So he we all act Expo northwest activity football club. And as usual Reggie Banks always gets what should I pay play Lottery by Jade Jupiter actually public speakers. I say that with some degree Mother’s day goes on one of them. So if you have Around the building are arranged. I would like to see.”

Depending on which model you choose, the Google Cloud Speech to Text demo is quite fun. Although the correct model produces acceptable results the others would be great for 1970s psychedelic song lyrics.

This inevitably poses the question, what did I think of the demo?

I like the fact that I can upload audio and video files directly to the cloud service. I also liked the fact that there are a lot of language choices, including many versions of English. What I wasn’t keen on were the results.

Perhaps I’m being too hard on Google Cloud Speech to Text.

Maybe the full AI service with sixty free minutes per month is more capable. But according to my test, if you want a free audio to text converter there are more accurate competitors out there.

In the final analysis, Google Cloud Speech to Text, using the Command/Search model, produced 17 errors out of 165 words. An accuracy of 89.7%. Not bad considering it’s free, but not great either.

The best thing about this service is that it’s quick. You upload your audio or video file and back comes your transcript.

IBM Watson Speech to Text Demo

Watson Speech to Text can convert Arabic, English, Spanish, French, Brazilian Portuguese, Japanese, Korean, German, and Mandarin speech into text.

IBM points out that the webpage and the Speech to Text demo are for demonstration purposes only and are not intended to process Personal Data. It also states that you are authorized to use the Application and associated content for your own personal use, for informational purposes only.

Having said that, let’s see what it can do.

To transcribe your audio, you can either use a microphone to record your audio or upload an audio file from your video. Acceptable audio formats include .mp3, .mpeg, .wav, .flac, or .opus.

The application allows you to select between 8kHz narrowband and 16kHz broadband models. Since your audio should be of the highest quality select a 16kHz model. Do the same if you decide to use a microphone.

If you are going to use a microphone, click on the Record Audio button and start playing your audio file.

If you are using an audio file, click on the Upload Audio File button, then choose your file to upload.

Your converted speech will appear in the text box in real time, or with a small delay.

In my test, the speech recognition was broadly correct. It even correctly picked up a couple of “uh” fillers in the speech that none of the other services picked up.

At the top of the text box, you can switch between the text and word timings and alternatives. If you choose the latter, you can then hover over each word and you’ll see alternative suggestions and Watson’s confidence level in each suggestion.

Like some of the other services, there’s no punctuation in the converted text.

Unfortunately, Watson was the least accurate of the 7 speech to text services, with 29 errors in my 165-word file. That represents an accuracy of 82.4%.

How to optimize your audio transcription accuracy

Like many computer processes, the quality of your result depends in part on the quality of your data. In speech-to-text transcription, your data is the audio you feed into the system.

So, when you transcribe recorded speech to text the better the quality of your audio, the better your results are likely to be. The more you optimize your audio for accuracy the better.

Here’s what you can do to boost your chances of getting a good transcript.

Minimize the background noise on your recording

Ideally, your recording should have as little background noise as possible.

Although some of the paid-for AI speech-to-text transcription services are good at coping with distracting background noise, here we’re dealing with auto transcribe video to text free services that are likely to be less accurate.

As a rule of thumb, keep the voice audio as clear as possible.

Start off with capturing clear speech during the video production stage. Obviously, this isn’t always possible so plan ahead and make sure you record several seconds of ambient noise at the start and finish of each take.

With that ambient noise as a reference you can use software to isolate the dialogue from background noise. Try the noise reduction in Adobe Audition or Audacity. Better still, run your audio through iZotope’s RX 7 Audio Repair Software.

Avoid multiple voices speaking at once

By multiple voices I don’t mean something like an interview situation where you might be recording two or three people. I mean having more than one person speaking simultaneously. The speech recognition software will be confused and produce poor results.

Use a good audio file

If you are going to upload your audio to an online service, make sure you don’t skimp on quality.

Ideally, create a 48kHz, 16-bit, stereo .wav file. If you need to use an mp3 file don’t drop the bitrate down too low.

Remember, the better the audio, the better your results will be.

Get the microphone close to your loudspeaker

If you are using speech recognition dictation software on your PC or phone, also make sure you keep the background noise to a minimum.

To improve the clarity of the speech, get your microphone as close to your playback loudspeaker as possible. This way the sound you want your dictation software to hear will be much louder than any other background sounds in your office, such as cooling fans and other people chatting.

One last point: play back your audio on good quality loudspeakers rather than the small built-in speakers on your desktop or laptop. The latter will produce poor quality sound that might even buzz.

Use a unidirectional microphone

When recording from your loudspeaker try and use a unidirectional microphone.

Although an omnidirectional microphone should be OK, a cardioid or super-cardioid microphone would be a better choice.

These unidirectional mics are designed to reject sound from the sides and rear, so they should mainly pick up the audio coming from the loudspeaker.

Don’t use audio with music beds and sound effects

Music in the background is effectively noise to speech recognition software, so avoid using the final audio mix from your video.

Keep the voice track clean of any music, especially vocals that might confuse the software.

Avoid reverberation

If you recorded your video in a large space with distinct echo or reverb, your software will be working harder.

Try picking a room with a better acoustic or use an audio plugin to remove the reverberation.

Accents and speech clarity

Generally, the clearer your speaker’s speech the better.

A broad accent or regional dialect will cause the speech to text software to struggle.

Although transcription software is simple to use, it needs to be told which language you are using. Americans, Brits, and Australians all speak English, but if there is a setting for the national variety of the language on your recording, make sure you choose it.

Why should you transcribe recorded speech to text?

So why should you transcribe audio to text from your videos?

I can think of at least the following four main reasons.

A good transcript allows you to create good quality Closed Captions or subtitles for YouTube.
It helps you create the description for your video.
The transcript can be an aid to the on-paper editing of a large video project.
You can quickly repurpose your video into blog posts, articles, email newsletters, ebooks, etc.

Create closed captions

Decent quality closed captions can help improve the viewer experience, boost your on-page SEO, and therefore your video’s potential ranking in YouTube and Google search.

Creating captions and subtitles is not just a way of helping hearing-impaired individuals engage with your video. About 80% of Facebook videos are watched muted, so it makes sense to add captions. Overall, you will increase your potential audience and engagement.

Not everyone will be watching your video with the sound up or they might find it difficult to hear what is being said. So closed captions may be incredibly useful in keeping your viewer watching. If the viewer can’t follow along, they will probably skip to the next video to find an answer to what they were searching for.

By keeping the viewer watching you are improving your video’s watch time. Watch time is a metric YouTube really considers to be important. Anything you can do to keep eyeballs glued to the screen, such as providing closed captions, is good news for your video and your YouTube channel. The better the watch time the more likely YouTube will suggest your video to viewers.

Your closed captions will also help with your SEO. It’s one of four ways you can directly provide YouTube with information about your video. The others being the title, tags, and description. Which neatly takes us to the next reason.

Help create your video description

Along with your closed captions, your video’s description is another opportunity for YouTube and Google to understand the content.

Having a transcript of your video will help you quickly create a description that will be of great SEO value.

Instead of creating a description by starting afresh, you can take your transcript and edit it to be like an informative mini-blog.

On-paper video editing

If you are recording video for a longer project, such as a documentary or training video, you can easily end up working with several hours of footage. Keeping track of everything in your head will be almost impossible.

By having a transcript it’s much easier to plan your edit ahead of sitting down in front of your computer. If you are anything like me, you’ll also want to make handwritten notes on the transcript.

Repurpose your video content

When you record and produce a video or podcast don’t just think of it as a stand-alone piece. Instead, think of it as the foundation of several pieces of content.

Although nearly 5 billion videos are watched every day on YouTube, don’t get fooled into thinking it’s the only way people can consume your content. Without much extra effort, you can repurpose your video for several different platforms and types of online marketing.

Depending on your video content, viewers might even pay to gain access to the same content but in a written form. For instance, an Audible book and the matching Kindle Book are frequently sold in one bundle.

You’ve probably done a lot of hard work creating your video: some research, setting up the shoot, editing, and then uploading to YouTube.

So, what your video potentially should not be is a single piece of content that only resides on YouTube. It’s way more valuable to you than just that.

Summary

Free products and services are often inferior to their paid-for equivalents. But they may be good enough for what you need.

Although I wasn’t testing free against paid-for services, the best of the free services performed well.

Since 2015 my experience of using the YouTube auto transcribe service has been largely positive. I knew the auto-closed captions were mostly accurate, but I didn’t expect YouTube to head my results list.

However, even though I could edit the text on YouTube, the downloaded file included all the timestamps, which equates to a lot of editing in Word or Google Docs.

Apple Speech Recognition on my iPhone was another surprise. I didn’t expect it to come in joint second place, especially since the audio had to be played into the microphone.

Here are the results using my test audio file, in terms of accuracy.

YouTube auto transcribe service (97.6% accuracy)
iPhone Speech Recognition (93.9% accuracy)
Speechmatics demo (93.9% accuracy)
Google Docs Voice Typing (92.7% accuracy)
Microsoft Word Dictate (92.1% accuracy)
Google Cloud Speech-to-Text demo (89.7% accuracy)
IBM Watson Speech to Text service (82.4% accuracy)

If you want to auto transcribe video to text free of charge most of the options are close in terms of accuracy.

Even before I did my test, I knew Watson was less accurate than Speechmatics but I didn’t expect an accuracy as low as 82%. Perhaps Watson didn’t like the Scottish accent? It is worth pointing out that if you do sign up for the service you do get a lot of free minutes per month. Hopefully, the full service is more accurate.

So, which free audio-to-text converter would I use?

I signed up for Speechmatics and have 60 minutes of free speech to text. To be honest, I probably will pay the 6 pence per minute to continue using the service once my free minutes are used up. That’s simply because I can upload my video and get back my text. It’s accurate and easy to use.

But if I wanted to keep true to not spending any money on the transcription, I would use my iPhone, Google Docs Voice Typing or Microsoft Word Dictate. I don’t think there’s much to choose between them in terms of accuracy, so the deciding factor may be how easy they are to use.

For the iPhone, the only downside is that you need to get the text off the device and on to your desktop or laptop.

For the other two services, the only negative is that you need an external microphone on a second PC. The deciding factor may simply be whether you are a Microsoft Word user or a Google Docs user.

But if you want a totally free service I guess you would choose Google Docs Voice Typing since Docs is free whereas you must pay for Word.

Reordering my list in terms of being free, accuracy, and ease of use, the standings become:

Google Docs Voice Typing (92.7% accuracy)
iPhone Speech Recognition (93.9% accuracy)
Microsoft Word Dictate (92.1% accuracy)
YouTube auto transcribe service (97.6% accuracy)
Speechmatics demo (93.9% accuracy)
Google Cloud Speech-to-Text demo (89.7% accuracy)
IBM Watson Speech to Text service (82.4% accuracy)

Tosh Lubek runs an audio and video production business in the UK and has been using the Canon EOS R since it was released in the autumn of 2018. He has used the camera to shoot TV commercials for Sky TV, promotional business videos, videos of events and functions, and YouTube creator content. He has also won international awards for his advertising and promotional work. You can meet him by visiting his “video booth” at HashTag business events across the country.
