
How to start using Amazon Transcribe service with Python

Yegor Kryukov
3 min read · Jun 25, 2020


Amazon Web Services (AWS) is a cloud platform that offers over 175 cloud products from Amazon’s data centers worldwide. One of them, Amazon Transcribe, automatically converts speech to text. This article is a step-by-step guide to getting started with Amazon Transcribe. You will need a working Python environment and an AWS account.

Photo by Jason Rosewell on Unsplash

Step 1. Install dependencies

Amazon provides an AWS SDK for Python called Boto3. To install it with Conda, run this command:

conda install -c anaconda boto3

or with pip:

pip install boto3

You will also need the time, urllib, and json libraries, which are part of the Python standard library.
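Before moving on, it can help to confirm that everything is importable. The snippet below is a minimal sketch of such a check; the helper name missing_modules is made up for this example and is not part of Boto3.

```python
import importlib.util

# Modules this tutorial relies on: boto3 is third-party,
# the other three ship with Python.
REQUIRED_MODULES = ["boto3", "time", "urllib", "json"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Please install:", ", ".join(missing))
    else:
        print("All dependencies are available.")
```

If boto3 shows up as missing, re-run one of the install commands above in the same environment that runs your script.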

Step 2. Create AWS credentials

We need valid credentials to use AWS services. To create a new access key:

  • log in to the AWS console, click your username in the top right corner, and click Security Credentials
  • on the page that opens, click Access keys and then Create New Access Key
  • download the key file and save it to a folder on your local machine
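Once the key file is saved, one possible way to use it from Python is sketched below. This is an illustration under assumptions: it assumes the downloaded file is a CSV with a header row containing the columns "Access key ID" and "Secret access key" (the layout of this file has varied over time, so check yours), and the helper names load_access_key and make_client are hypothetical.

```python
import csv


def load_access_key(csv_path):
    """Read the first key pair from a downloaded access key CSV file.

    Assumes a header row with 'Access key ID' and 'Secret access key'
    columns; adjust the column names if your file differs.
    """
    # utf-8-sig drops a byte order mark if the file starts with one
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        row = next(csv.DictReader(handle))
    return {
        "aws_access_key_id": row["Access key ID"].strip(),
        "aws_secret_access_key": row["Secret access key"].strip(),
    }


def make_client(service_name, credentials, region_name="us-east-1"):
    """Create a Boto3 client for a service such as 's3' or 'transcribe'."""
    import boto3  # imported here so the CSV helper works without boto3

    return boto3.client(service_name, region_name=region_name, **credentials)
```

For example, make_client("s3", load_access_key("accessKeys.csv")) would return an S3 client, where accessKeys.csv stands in for wherever you saved the file. Keep this file out of version control, since anyone who has it can act on your account.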

Step 3. Create S3 storage bucket and upload an mp3 file

Amazon Simple Storage Service (S3) is where we are going to store our audio files for transcription. S3 stores files in buckets.

Note. You are charged for storing your files and for moving them in and out of a bucket. Make sure to check the pricing.

To create a new bucket:

  • go to the S3 console
  • click Create bucket. You need to provide a unique name for it. Write it down; we will need it later. The rest of the settings stay unchanged for this tutorial.
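The same bucket can also be created from Python instead of the console. The sketch below is one way to do it, not the only one; the function names are hypothetical, and the name check covers only the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit). The s3_client argument is expected to be a Boto3 S3 client.

```python
import re

# Basic S3 naming rules only; S3 enforces a few more restrictions.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Return True if the name passes the basic S3 naming rules."""
    return _BUCKET_NAME.fullmatch(name) is not None


def create_bucket_kwargs(bucket_name, region_name="us-east-1"):
    """Build the keyword arguments for the S3 client's create_bucket call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default region and must not be passed as a
    # LocationConstraint; every other region has to be named explicitly.
    if region_name != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region_name}
    return kwargs


def create_bucket(s3_client, bucket_name, region_name="us-east-1"):
    """Create the bucket using an S3 client, e.g. boto3.client('s3')."""
    if not is_valid_bucket_name(bucket_name):
        raise ValueError(f"Invalid bucket name: {bucket_name!r}")
    return s3_client.create_bucket(**create_bucket_kwargs(bucket_name, region_name))
```

A locally valid name can still be rejected by AWS if another account already uses it, because bucket names are shared across all AWS users.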

To manually upload a file to the S3 bucket:

Note. I’ve used the “Day of Affirmation Address at Cape Town University” by Robert F. Kennedy. You can download an mp3 copy of the speech from the learnoutloud.com website or use…
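As an alternative to uploading by hand, a file can be sent to the bucket from Python. The following is a rough sketch, assuming s3_client is a Boto3 S3 client; the function name upload_audio and the file and bucket names in the usage note are placeholders, not values from this tutorial.

```python
import os


def upload_audio(s3_client, local_path, bucket_name, key=None):
    """Upload a local audio file to S3 and return its s3:// URI.

    If no key is given, the file name part of local_path is used.
    """
    if key is None:
        key = os.path.basename(local_path)
    # upload_file(Filename, Bucket, Key) is the S3 client's upload call
    s3_client.upload_file(local_path, bucket_name, key)
    return f"s3://{bucket_name}/{key}"
```

Calling upload_audio(s3_client, "speech.mp3", "my-audio-bucket") would return "s3://my-audio-bucket/speech.mp3", which is a convenient form for pointing other AWS services at the file.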
