Convert a PDF File to Speech with the Text-to-Speech (TTS) Library pyttsx3

1. Install the libraries on Ubuntu

sudo apt install libespeak1
sudo apt-get install ffmpeg
pip3 install pdfplumber
pip3 install pyttsx3

2. Prepare the PDF file test.pdf (one page only)

WHAT language is thine, O sea?
   The language of eternal question.
What language is thy answer, O sky?
   The language of eternal silence.

3. Write the Python 3 script read_pdf.py

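import pdfplumber
import pyttsx3

# Open the PDF and extract the text of the first page.
with pdfplumber.open("/home/ubuntu/test.pdf") as pdf:
    print("total pages:", len(pdf.pages))
    print("-----------------------------------------")
    first_page = pdf.pages[0]
    # page_number is already 1-based, so no +1 is needed
    print("current page:", first_page.page_number)
    print("-----------------------------------------")
    # extract_text() may return nothing for a page without a text layer
    text = first_page.extract_text() or ""

print(text)

engine = pyttsx3.init()
# Replace line breaks with spaces so adjacent words are not glued together
text = text.replace('\n', ' ')

# Speak the text aloud
engine.say(text)
engine.runAndWait()

# Save the speech as test.mp3
engine.save_to_file(text, 'test.mp3')
engine.runAndWait()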
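
The script above cleans up line breaks before speaking. A slightly more thorough cleanup can be sketched with a small helper, here called normalize_text (a hypothetical name, not part of pdfplumber or pyttsx3). It collapses every run of spaces and line breaks into a single space, which also removes the indentation on alternate lines of the sample poem:

```python
import re


def normalize_text(text):
    """Collapse runs of whitespace (spaces, tabs, line breaks) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


sample = "WHAT language is thine, O sea?\n   The language of eternal question."
print(normalize_text(sample))
```

The result could be passed to engine.say() in place of the text produced by text.replace().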

4. Run read_pdf.py (python3 read_pdf.py) to get test.mp3
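
The steps above read a single page. For a longer PDF, one possible extension is to gather the text of every page before handing it to the engine. The sketch below is only an illustration: collect_text and read_all_pages are hypothetical names, and collect_text assumes only that each page object offers an extract_text() method, as pdfplumber pages do.

```python
def collect_text(pages):
    """Join the text of all pages, skipping pages that yield no text."""
    parts = []
    for page in pages:
        page_text = page.extract_text()
        if page_text:  # skip None or "" from pages without a text layer
            parts.append(page_text)
    return "\n".join(parts)


def read_all_pages(pdf_path):
    """Extract the text of every page of the PDF at pdf_path (assumes pdfplumber is installed)."""
    import pdfplumber  # imported here so collect_text can be used without it

    with pdfplumber.open(pdf_path) as pdf:
        return collect_text(pdf.pages)
```

Calling read_all_pages("/home/ubuntu/test.pdf") would return one string that can be passed to engine.say() or engine.save_to_file() in place of the single-page text.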
