Text to Speech Project | Harsh - Class XI-A

Section 01

Certificate

Official certification for the submitted project

Maples Academy Khatauli

This is to certify that Harsh, a bona fide student of Class XI-A, Roll Number 17 of Maples Academy Khatauli, has successfully completed the Computer Science project entitled "Text-to-Speech Program using Edge TTS" during the academic session 2025–26, under the guidance of Er. Pankaj Sir (PGT Computer Science).

This project has been prepared in accordance with the guidelines prescribed by the Central Board of Secondary Education (CBSE) for the Class XI practical examination.

Harsh

Student

Er. Pankaj Sir

PGT Computer Science

Mrs. Garima Singh

Principal

Section 02

Acknowledgement

Expressing gratitude to those who helped

I would like to express my heartfelt gratitude to my Computer Science teacher, Er. Pankaj Sir, for his invaluable guidance, constant encouragement, and expert mentorship throughout the development of this project. His deep knowledge of Python programming and patience in teaching helped me understand complex concepts with ease.

I am deeply thankful to our esteemed Principal, Mrs. Garima Singh, for providing an excellent academic environment and all necessary facilities at Maples Academy Khatauli that enabled me to complete this project successfully.

I also extend my sincere thanks to my parents for their unwavering support, and to my classmates who motivated me throughout this journey.

— Harsh
Class XI-A | Roll No. 17
Maples Academy Khatauli
Session 2025–26

Section 03

About the Project

Understanding what this project does and its significance

Project Overview

This project is a Text-to-Speech (TTS) application built using the Python programming language. It takes text input from the user and converts it into natural-sounding human speech using Microsoft Edge's Neural AI voices.

The program supports multiple languages and voices including Hindi (Male & Female) and English (Male & Female with US & UK accents). Users can type any text — in Hindi or English — select their preferred voice, and hear the computer speak it out loud with realistic, human-like pronunciation and natural intonation.

Why This Project?

Real-World Application — TTS technology powers Google Assistant, Alexa, Siri, GPS navigation, and accessibility tools for visually impaired people.
AI & Cloud Computing — Demonstrates how Python can access powerful cloud-based AI services with just a few lines of code.
Multi-Language Support — Supports both Hindi and English, making it highly relevant for Indian users.
Interactive & Fun — Unlike static programs, this takes user input and produces audible output, making it engaging.
Learning Value — Covers dictionaries, functions, loops, file handling, async programming, and external library usage.

Key Features

🔊 Converts any typed text into natural AI-powered speech
🇮🇳 Supports Hindi Male & Female voices (Madhur & Swara)
🇺🇸 Supports English US Male & Female voices (Guy & Jenny)
🇬🇧 Supports English UK Male & Female voices (Ryan & Sonia)
🎛️ User-friendly menu-driven interface with numbered options
✅ Robust input validation — gracefully handles wrong inputs
🔁 Option to speak multiple texts in one session without restarting
🧹 Automatic cleanup of temporary audio files on exit

Section 04

Tools & Technologies Used

Software, libraries, and services powering this project

🐍

Python 3

The core programming language used for the entire project. Known for its simplicity and vast ecosystem. Version 3.8+ is required.

LANGUAGE

🗣️

Edge TTS

A Python library that connects to Microsoft Edge's text-to-speech service. Provides free access to 300+ neural AI voices in 40+ languages.

LIBRARY
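
The voice catalogue mentioned in the Edge TTS description can also be read from the library instead of being typed by hand. The following is a minimal sketch, assuming that edge_tts.list_voices() returns a list of dictionaries with "ShortName", "Locale", and "Gender" keys. The helper names filter_voices and fetch_voices and the SAMPLE data are hypothetical and are not part of the project code.

```python
import asyncio


def filter_voices(voices, locale_prefix):
    """Return the ShortName of every voice whose Locale starts with the prefix."""
    return sorted(
        v["ShortName"] for v in voices if v["Locale"].startswith(locale_prefix)
    )


async def fetch_voices():
    """Download the voice catalogue (needs internet and the edge-tts package)."""
    import edge_tts  # imported here so filter_voices works without the library

    return await edge_tts.list_voices()


# Illustrative sample in the assumed shape of the catalogue entries.
SAMPLE = [
    {"ShortName": "hi-IN-SwaraNeural", "Locale": "hi-IN", "Gender": "Female"},
    {"ShortName": "hi-IN-MadhurNeural", "Locale": "hi-IN", "Gender": "Male"},
    {"ShortName": "en-GB-RyanNeural", "Locale": "en-GB", "Gender": "Male"},
]

if __name__ == "__main__":
    print(filter_voices(SAMPLE, "hi-"))
    # With internet: print(filter_voices(asyncio.run(fetch_voices()), "hi-"))
```

Keeping the filter separate from the download means the filtering logic can be tried offline with sample data.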

🎮

Pygame

A multimedia library for Python. Used here specifically to load and play the generated MP3 audio through the computer's speakers.

LIBRARY

⚡

Asyncio

Python's built-in module for asynchronous operations. Used here because the Edge TTS call that contacts Microsoft's servers over the internet is a coroutine, which has to be run with asyncio.run().

MODULE
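
To see what asynchronous code looks like without any internet connection, here is a small sketch that uses only the standard library. The function fake_request is a hypothetical stand-in for a network call. It simply waits, which lets the event loop run other tasks in the meantime.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Stand-in for a network call: waits without blocking other tasks."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_both():
    # gather() runs both coroutines concurrently on one event loop
    return await asyncio.gather(
        fake_request("first", 0.2),
        fake_request("second", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(run_both())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Both waits overlap, so the total time should be close to one delay rather than the sum of the two.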

💻

VS Code / IDLE

The Integrated Development Environment (IDE) used for writing, editing, debugging, and testing the Python source code.

IDE

☁️

Microsoft Azure AI

The cloud-based artificial intelligence service that processes text and generates speech using deep neural network models.

CLOUD

Installation Commands

Before running the project, install the required libraries using pip:

Terminal / Command Prompt

# Install Edge TTS library
pip install edge-tts

# Install Pygame for audio playback
pip install pygame
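
After installation, it can help to confirm that Python can actually find both libraries. The check below is a sketch using the standard importlib module. The helper name missing_modules is my own. Note that the package is installed as edge-tts but imported as edge_tts.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["edge_tts", "pygame"])
    if missing:
        print("Missing:", ", ".join(missing), "(install with pip first)")
    else:
        print("All required libraries are installed.")
```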

Python Concepts Used

Concept	Where Used	Purpose
Dictionary	VOICES variable	Store voice names & IDs as key-value pairs
Functions	generate_speech(), play_audio(), main()	Modular & reusable code blocks
While Loop	Main menu & input validation	Repeat until valid input or user exits
For Loop	Displaying voice options	Iterate over dictionary items
Conditional (if-elif-else)	Input validation, menu choices	Decision making in program flow
String Methods	.strip(), .lower()	Clean and normalize user input
Async / Await	generate_speech() function	Handle network communication efficiently
File Handling	os.path.exists(), os.remove()	Manage temporary MP3 file
Modules & Imports	edge_tts, pygame, asyncio, os, time	Use external functionality in the program

Section 05

How It Works

Step-by-step execution flow of the program

Program Flowchart

▶ START PROGRAM

⬇

Display Welcome Message & Voice Menu

⬇

📥 User Selects Voice (1–6)

⬇

❓ Valid Choice? → If No, Ask Again

⬇ Yes

📥 User Enters Text to Speak

⬇

🌐 Send Text + Voice to Microsoft Edge TTS API

⬇

📥 Receive MP3 Audio & Save to File

⬇

🔊 Play Audio using Pygame Mixer

⬇

❓ Speak Again? → If Yes, Go Back to Menu

⬇ No

🧹 Delete Temporary MP3 File

⬇

⏹ END PROGRAM

1. Program Launches & Displays Menu

The program starts with a welcome banner and shows 6 voice options: Hindi Male/Female, English US Male/Female, and English UK Male/Female. The user selects by entering a number (1–6).
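
The menu validation in this step can be written as a small reusable function. This is a sketch, not the exact code of the project: choose_voice is a hypothetical helper, and its read and write parameters are added only so that it can be tried with scripted replies instead of a keyboard. The dictionary is shortened to two voices.

```python
VOICES = {
    "1": ("Hindi Female", "hi-IN-SwaraNeural"),
    "2": ("Hindi Male", "hi-IN-MadhurNeural"),
}


def choose_voice(voices, read=input, write=print):
    """Keep asking until the reply is one of the menu keys."""
    while True:
        choice = read("Choose voice: ").strip()
        if choice in voices:
            return voices[choice]
        write(f"Invalid! Enter one of: {', '.join(voices)}")


# Scripted replies: a wrong number first, then a valid one with extra spaces.
replies = iter(["9", " 2 "])
name, voice_id = choose_voice(
    VOICES, read=lambda _prompt: next(replies), write=lambda _msg: None
)
print(name, voice_id)
```

Because the check is `choice in voices`, adding a seventh voice to the dictionary would extend the valid range automatically.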

2. User Enters Text

The user types any text they want to hear spoken. It can be in Hindi (using Hindi keyboard/typing) or English. The program validates that the input is not empty.
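
In the source listing, an empty text sends the user back to the voice prompt. One possible refinement is to repeat only the text prompt. The sketch below assumes a hypothetical helper named read_nonempty. As before, read and write are parameters only so the function can be tried with scripted replies.

```python
def read_nonempty(prompt, read=input, write=print):
    """Repeat the prompt until the user types something other than whitespace."""
    while True:
        text = read(prompt).strip()
        if text:
            return text
        write("  Text cannot be empty!")


# Scripted replies: two empty attempts, then real text.
replies = iter(["", "   ", "  Hello  "])
text = read_nonempty(
    "Enter text: ", read=lambda _prompt: next(replies), write=lambda _msg: None
)
print(text)
```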

3. Text Sent to Microsoft's Cloud

The edge_tts library packages the text and selected voice ID, then sends it to Microsoft's Azure cloud servers over a secure internet connection using asynchronous communication.
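
Since this step depends on the internet, the request can fail or hang. A sketch of one way to guard it is shown below, using asyncio.wait_for from the standard library. The wrapper with_timeout and the two stand-in coroutines are hypothetical. In the real program the coroutine passed in would be generate_speech(text, voice_id), and the exact exceptions raised on a network failure are not known to me, so the sketch catches Exception broadly.

```python
import asyncio


async def with_timeout(coro, seconds):
    """Await coro, returning (True, result) or (False, reason) instead of raising."""
    try:
        return True, await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        return False, "timed out"
    except Exception as exc:  # e.g. no internet; exact exception types vary
        return False, f"failed: {exc}"


async def slow_call():
    await asyncio.sleep(1)  # pretends to be a stalled network request
    return "audio"


async def quick_call():
    return "audio"


print(asyncio.run(with_timeout(quick_call(), 0.5)))
print(asyncio.run(with_timeout(slow_call(), 0.05)))
```

Returning a status pair lets the menu loop print a friendly message and continue instead of ending with a traceback.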

4. AI Neural Network Generates Speech

Microsoft's deep learning model processes the text, working out pronunciation, intonation, emphasis, and natural pauses. It generates high-quality MP3 audio and streams it back to the program over the same connection.
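
The audio does not have to be handled as one finished file. As far as I understand the library, edge_tts.Communicate also offers a stream() method that yields dictionaries, where chunks of type "audio" carry bytes under a "data" key. The sketch below assumes that shape and uses a fake stream so it runs offline. The names collect_audio and fake_stream are hypothetical.

```python
import asyncio


async def collect_audio(chunks):
    """Join the 'audio' chunks of a stream into one bytes object."""
    audio = bytearray()
    async for chunk in chunks:
        if chunk["type"] == "audio":  # other chunk types carry metadata
            audio.extend(chunk["data"])
    return bytes(audio)


async def fake_stream():
    """Stand-in for edge_tts.Communicate(text, voice).stream()."""
    yield {"type": "audio", "data": b"ID3"}
    yield {"type": "WordBoundary", "text": "Hello"}
    yield {"type": "audio", "data": b"\x00\x01"}


data = asyncio.run(collect_audio(fake_stream()))
print(len(data), "bytes received")
```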

5. Audio is Played Through Speakers

The received MP3 file is saved locally as "speech.mp3" and immediately played through the speakers using Pygame's mixer module. The program waits until playback completes.
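
The waiting part of this step is a polling loop: the program checks again and again whether the player is still busy. A generic version with an optional time limit could look like the sketch below. The helper wait_while is hypothetical, and a simulated player is used so the example runs without speakers. In the project, the function passed in would be pygame.mixer.music.get_busy.

```python
import time


def wait_while(is_busy, poll_interval=0.1, timeout=None):
    """Sleep in short steps while is_busy() is true; return False on timeout."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while is_busy():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


# Simulated player that reports "busy" three times, then stops.
states = iter([True, True, True, False])
finished = wait_while(lambda: next(states), poll_interval=0.01)
print(finished)
```

The short sleep between checks keeps the loop from using the processor continuously while the audio plays.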

6. Repeat or Exit

The user is asked if they want to speak more text. If yes, the loop continues. If no, the temporary MP3 file is automatically deleted and the program exits cleanly.
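
In the listing, the cleanup runs only when the loop ends normally, so an error or Ctrl+C could leave speech.mp3 behind. One way to make the cleanup unconditional is a context manager with try/finally. This is a sketch using only the standard library. The name temporary_mp3 is hypothetical, and it uses a system temporary file rather than the fixed name "speech.mp3".

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_mp3():
    """Yield a path for a scratch .mp3 file and delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # only the path is needed; other code reopens the file
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


with temporary_mp3() as path:
    with open(path, "wb") as f:
        f.write(b"placeholder audio bytes")
    existed_inside = os.path.exists(path)

print(existed_inside, os.path.exists(path))
```

The file must not be in use when it is removed, which is why the listing calls pygame.mixer.music.unload() after playback.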

Section 06

Source Code

Complete Python source code of the project

text_to_speech.py — Harsh | Class XI-A | Roll No. 17

# ==========================================
#   TEXT-TO-SPEECH PROGRAM (Edge TTS)
#   CBSE Class 11 - Computer Science Project
#   Made by: Harsh | Class XI-A | Roll No: 17
#   School: Maples Academy Khatauli
#   Session: 2025-26
# ==========================================

import edge_tts
import asyncio
import pygame
import time
import os

# ---------- AVAILABLE VOICES ----------
VOICES = {
    "1": ("Hindi Female",      "hi-IN-SwaraNeural"),
    "2": ("Hindi Male",        "hi-IN-MadhurNeural"),
    "3": ("English Female US", "en-US-JennyNeural"),
    "4": ("English Male US",   "en-US-GuyNeural"),
    "5": ("English Female UK", "en-GB-SoniaNeural"),
    "6": ("English Male UK",   "en-GB-RyanNeural"),
}

OUTPUT_FILE = "speech.mp3"


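# ---------- GENERATE SPEECH ----------
async def generate_speech(text, voice):
    """Ask Edge TTS to speak `text` in `voice` and save the audio to OUTPUT_FILE."""
    communicate = edge_tts.Communicate(text, voice)
    await communicate.save(OUTPUT_FILE)


# ---------- PLAY AUDIO ----------
def play_audio():
    """Play OUTPUT_FILE and block until playback has finished."""
    pygame.mixer.init()
    pygame.mixer.music.load(OUTPUT_FILE)
    pygame.mixer.music.play()
    # Poll the mixer so the program waits here while the audio is playing
    while pygame.mixer.music.get_busy():
        time.sleep(0.1)
    # Release the file so it can be deleted later
    pygame.mixer.music.unload()
    pygame.mixer.quit()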


# ---------- MAIN PROGRAM ----------
def main():
    print()
    print("  " + "=" * 40)
    print("      TEXT  TO  SPEECH  PROGRAM")
    print("      Powered by Microsoft Edge AI")
    print("  " + "=" * 40)

    # Show voice options
    print("\n  Available Voices:")
    print("  " + "-" * 35)
    for key, (name, _) in VOICES.items():
        print(f"    {key}. {name}")
    print("  " + "-" * 35)

    while True:

        # --- Choose Voice ---
        while True:
            choice = input("\n  Choose voice (1-6): ").strip()
            if choice in VOICES:
                break
            print("  Invalid! Enter 1 to 6.")

        voice_name, voice_id = VOICES[choice]

        # --- Enter Text ---
        text = input("  Enter text to speak: ").strip()
        if not text:
            print("  Text cannot be empty!")
            continue

        # --- Generate & Play ---
        print(f"\n  Generating speech in [{voice_name}]...")
        asyncio.run(generate_speech(text, voice_id))

        print("  Playing audio... 🔊")
        play_audio()
        print("  Done! ✅")

        # --- Play Again? ---
        again = input("\n  Speak again? (y/n): ").strip().lower()
        if again != "y":
            break

    # Cleanup temp file
    if os.path.exists(OUTPUT_FILE):
        os.remove(OUTPUT_FILE)

    print("\n  Thanks for using! Goodbye! 👋\n")


# --- RUN ---
if __name__ == "__main__":
    main()

Section 07

Output

Sample output of the program during execution

OUTPUT — Run 1 (Hindi Female Voice)

  ========================================
      TEXT  TO  SPEECH  PROGRAM
      Powered by Microsoft Edge AI
  ========================================

  Available Voices:
  -----------------------------------
    1. Hindi Female
    2. Hindi Male
    3. English Female US
    4. English Male US
    5. English Female UK
    6. English Male UK
  -----------------------------------

  Choose voice (1-6): 1
  Enter text to speak: नमस्ते, मैं हर्ष हूँ, कक्षा ग्यारहवीं का छात्र

  Generating speech in [Hindi Female]...
  Playing audio... 🔊
  Done! ✅

  Speak again? (y/n): y

OUTPUT — Run 2 (English Male US Voice)

  Choose voice (1-6): 4
  Enter text to speak: Hello! This is my Computer Science project for Class 11.

  Generating speech in [English Male US]...
  Playing audio... 🔊
  Done! ✅

  Speak again? (y/n): y

  Choose voice (1-6): 6
  Enter text to speak: Welcome to Maples Academy Khatauli!

  Generating speech in [English Male UK]...
  Playing audio... 🔊
  Done! ✅

  Speak again? (y/n): n

  Thanks for using! Goodbye! 👋

What Happens During Output

The program displays a clean menu with 6 AI voice options.
After text input and voice selection, it connects to Microsoft's servers over the internet.
Typically within a few seconds, depending on internet speed and text length, a natural AI voice speaks the text through the speakers.
The voice sounds remarkably human-like, with natural pronunciation, intonation, pauses, and expression.
Hindi text is spoken in a natural Hindi accent, and English text in clear US or UK accents.

Section 08

Advantages & Limitations

Advantages

Completely free — no API key or payment needed
Very natural sounding neural AI voices
Supports Hindi + English with US & UK accents
Simple menu-driven interface
Useful for visually impaired users
Great for language learning
Code is short & readable

Limitations

Requires internet connection
Depends on Microsoft's servers
Cannot work offline
Long texts may take more time
Quality depends on internet speed
Limited to Microsoft Edge voices
Terminal-based (no GUI)
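
The delay for long texts could be reduced by sending the text in smaller pieces, so that the first piece starts playing sooner. The sketch below shows one possible way to split text at sentence boundaries. The helper split_into_chunks and the limit of 200 characters are my own assumptions, not a requirement of the service.

```python
import re


def split_into_chunks(text, max_chars=200):
    """Group whole sentences into chunks of at most max_chars where possible."""
    # Split after ., !, ? or the Hindi danda, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?।])\s+", text.strip()) if s]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("One. Two! Three? Four.", max_chars=10))
```

A single sentence longer than the limit is kept whole, so no sentence is cut in the middle.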

Section 09

Future Enhancements

Possible Improvements

🖥️ Add a GUI (Graphical User Interface) using Tkinter or PyQt for a visual experience.
💾 Add option to save audio as MP3/WAV file for later use or sharing.
📄 Add file reading support — read text from a .txt file and speak it aloud.
🎚️ Allow user to control speed, pitch, and volume of the generated voice.
🌐 Add support for more Indian languages like Tamil, Bengali, Marathi, Telugu.
📱 Convert into a web application using Flask or Django framework.
🎤 Add Speech-to-Text feature for reverse functionality (voice input).
📋 Add clipboard reading — automatically speak copied text.
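
As a starting point for the speed, pitch, and volume idea in the list above: to my knowledge, edge_tts.Communicate accepts optional rate, volume, and pitch arguments given as signed strings such as "+20%" or "-5Hz". The sketch below only builds those strings, so it runs without internet. The helper names signed and speech_options are hypothetical, and the parameter names should be checked against the installed library version.

```python
def signed(value, unit):
    """Format an integer offset with an explicit sign, e.g. '+20%'."""
    return f"{value:+d}{unit}"


def speech_options(rate_percent=0, volume_percent=0, pitch_hz=0):
    """Build keyword arguments for edge_tts.Communicate (assumed parameter names)."""
    return {
        "rate": signed(rate_percent, "%"),
        "volume": signed(volume_percent, "%"),
        "pitch": signed(pitch_hz, "Hz"),
    }


options = speech_options(rate_percent=20, pitch_hz=-5)
print(options)
# Intended use (needs internet and the edge-tts package):
#   communicate = edge_tts.Communicate(text, voice, **options)
```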

Section 10

Conclusion

📝 Project Conclusion

This Text-to-Speech project has been an enriching and insightful learning experience. Through this project, I have gained practical knowledge of Python programming, including core concepts like functions, dictionaries, loops, conditional statements, file handling, and working with external libraries.

I also learned about cloud-based AI services and how modern applications leverage APIs to access powerful capabilities like neural text-to-speech. The project demonstrates that even a relatively simple Python program can harness cutting-edge artificial intelligence technology.

The program is practical and impactful — it can assist visually impaired individuals, aid in language learning, and serve as a solid foundation for more advanced projects like voice assistants and accessibility tools.

I am sincerely grateful to Er. Pankaj Sir for his expert guidance throughout this project, and to Mrs. Garima Singh (Principal) for fostering an environment of innovation at Maples Academy Khatauli.

Section 11

Bibliography

References and resources consulted during the project

1. NCERT Computer Science Textbook, Class XI. Chapter: Getting Started with Python, Python Fundamentals
2. Python Official Documentation (docs.python.org): Modules, Functions, Asyncio reference
3. Edge TTS Library on PyPI (pypi.org/project/edge-tts): Library documentation and voice list
4. Pygame Documentation (pygame.org/docs): Audio mixer and playback methods
5. Microsoft Azure AI Speech Services (azure.microsoft.com): Neural TTS technology overview
6. GeeksforGeeks & W3Schools: Python programming tutorials, examples, and concept explanations
