Convert a PDF file to speech with the text-to-speech (TTS) library pyttsx3

  1. Install the required packages on Ubuntu
    sudo apt install libespeak1
    sudo apt install ffmpeg
    pip3 install pdfplumber
    pip3 install pyttsx3
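
Before moving on, it can help to confirm that the packages from this step are visible to Python. The sketch below is not part of the original script: `missing_dependencies` is a hypothetical helper name, and it only checks that the modules can be located and that an `ffmpeg` executable is on the PATH, not that they work. The libespeak1 shared library is not covered by this check.

```python
import importlib.util
import shutil


def missing_dependencies(modules, commands):
    """Return the names from both lists that cannot be found on this machine."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    for name in commands:
        # shutil.which returns None when the executable is not on the PATH
        if shutil.which(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("missing:", missing_dependencies(["pdfplumber", "pyttsx3"], ["ffmpeg"]))
```

An empty list means all three were found.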
  2. Prepare a one-page PDF file, test.pdf, containing the following text:
WHAT language is thine, O sea?
   The language of eternal question.
What language is thy answer, O sky?
   The language of eternal silence.

3. Write the Python 3 script read_pdf.py

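import pdfplumber
import pyttsx3

# Open the PDF; the with-block closes the file when done
with pdfplumber.open("/home/ubuntu/test.pdf") as pdf:
    print("total pages:", len(pdf.pages))
    print("-----------------------------------------")
    first_page = pdf.pages[0]
    # page_number is already 1-based in pdfplumber, so no +1 is needed
    print("current page:", first_page.page_number)
    print("-----------------------------------------")
    # depending on the pdfplumber version, a page without text may give None or ""
    text = first_page.extract_text() or ""

print(text)

# join the lines with spaces so words at line breaks do not run together
text = text.replace('\n', ' ')

# read the text aloud
engine = pyttsx3.init()
engine.say(text)
engine.runAndWait()

# save speech as mp3
engine.save_to_file(text, 'test.mp3')
engine.runAndWait()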
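
The newline handling in the script is the step most likely to affect how the speech sounds, since extract_text() returns one line of text per line on the page. A slightly more thorough clean-up could look like the sketch below. `clean_for_speech` is a hypothetical helper, not part of pdfplumber or pyttsx3, and it assumes that collapsing all whitespace into single spaces is acceptable for the text being read.

```python
import re


def clean_for_speech(text):
    """Collapse line breaks and repeated whitespace into single spaces."""
    if not text:
        # covers both None and an empty string
        return ""
    # \s+ matches newlines, tabs and runs of spaces
    return re.sub(r"\s+", " ", text).strip()


sample = "WHAT language is thine, O sea?\n   The language of eternal question."
print(clean_for_speech(sample))
```

The cleaned string can then be passed to engine.say() and engine.save_to_file() in place of the result of text.replace().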

4. Run read_pdf.py; it reads the text aloud and writes test.mp3 to the current directory
    python3 read_pdf.py
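
To confirm that the run actually produced audio, one option is to check that the output file exists and is not empty. This is a minimal sketch using only the standard library. `output_looks_ok` is a hypothetical name, and a non-empty file is only a rough sign of success, not proof that the audio is playable.

```python
import os


def output_looks_ok(path):
    """Return True if path is an existing, non-empty regular file."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


if __name__ == "__main__":
    # run from the directory where read_pdf.py was executed
    print("test.mp3 ok:", output_looks_ok("test.mp3"))
```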
