Source Code of Speech To Text Converter

This project is a web-based Speech To Text converter. Users speak into their device’s microphone, and the spoken words are transcribed into text. The interface includes a language selector, a button that starts and stops conversion, a text area that displays the result, and buttons to copy, download, and clear the text. The code is written in HTML, CSS, and JavaScript. Below, we explain its structure and functionality.

Folder Structure

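The project consists of the HTML page plus the files it references from the same folder: style.css, logo.png, languages-option.js, and script.js.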

Resources

Prerequisite Sites

https://fonts.google.com/

Images

Codes

HTML

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Speech-To-Text | Cosas Learning</title>
    <!-- Importing the CSS file -->
    <link rel="stylesheet" href="style.css">     
</head>
<body>
  <div class="container">
    <header>
      <div class="logo"><img src="logo.png" alt="Logo"></div>
      <div class="heding"><h1>Speech-To-Text Converter</h1></div>
    </header>
    <div class="hero_section">
      <!-- Speech-To-Text -->
      <div class="Speech_to_text">
        <!-- Languages Option -->
        <div class="row_box">
          <label>Select Language</label>
          <div class="select_box">
            <select name="input-language" id="language"></select>
          </div>
        </div>
        <!-- Convert Button -->
        <button id="convert_speech">Start Converting</button>
        <!-- Textarea -->
        <div class="row_box">
          <label>Result</label>
          <textarea spellcheck="false" placeholder="Text will be shown here" disabled></textarea>
        </div>
        <!-- Buttons -->
        <div class="btnBox">
          <button id="copyBtn" class="hide">Copy Text</button>
          <button id="downloadBtn" class="hide">Download</button>
        </div>
        <button id="clearBtn" class="hide">Clear Text</button>
      </div>
    </div>
  </div>
  <!-- Importing Languages Option file-->
  <script src="languages-option.js"></script>
  <!-- Importing the JavaScript file -->
  <script src="script.js"></script>
</body>
</html>

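The HTML file sets up the structure of the page:

Header with a logo and the main heading.
Speech To Text section with the language selection, conversion button, result textarea, and action buttons. The select element starts empty and is filled by script.js, and the action buttons start hidden through the hide class.
External script references for the language options (languages-option.js) and the functionality (script.js). languages-option.js is loaded first because script.js reads the languages array it defines.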

CSS

/* CSS styles optimized for the project */

/* Importing Google Font */
@import url('https://fonts.googleapis.com/css2?family=Poppins:wght@400;500;600&display=swap');

/* Base styling */
* {
    margin: 0;
    padding: 0;
    box-sizing: border-box;
    font-family: 'Poppins', sans-serif;
}

/* Variable definitions */
:root {
    /* Colors */ 
    --white_color: rgb(255, 255, 255);
    --orange_color: rgb(255, 165, 89);
    --orange_dark_color: rgb(255, 96, 0);
    --background_color: linear-gradient(to top left, #b156d9, #213780);
    --box_shadow : rgba(0,0,0,0.1);
}

/* Container styling */
.container {
    width: 100%;
    min-height: 100vh;
    background-image: var(--background_color);
    padding-bottom: 1rem;
}

/* Header styling */
header {
    padding: 2rem 0;
    margin: 0 3rem;
    display: flex;
    align-items: center;
    justify-content: space-between;
}

/* Logo styling */
.logo {
    width: 20rem;
    height: 6rem;
}

.logo img {
    width: 100%;
    height: 100%;
}

/* Heading styling */
h1 {
    color: var(--white_color);
    text-align: center;
    padding: 1rem;
    border-bottom: 0.2rem var(--orange_color) solid;
    text-transform: uppercase;
}
/* Hero Section*/ 
.hero_section {
    display: flex;
    justify-content: center;
    align-items: center;
    flex-direction: column;
}
/* Speech-To-Text Styling Start */
.Speech_to_text {
   width: 23.125rem;
   margin-top: 2rem;
}
.row_box {
    display: flex;
    margin-bottom: 1.25rem;
    flex-direction: column;
}
.row_box label {
    font-size: 1.2rem;
    font-weight: bold;
    color: var(--white_color);
    margin: 1rem 0 0.5rem 0;
}
.Speech_to_text :where(textarea, select, button) {
    outline: none;
    width: 100%;
    height: 100%;
    border: none;
    border-radius: 0.313rem;
}
.row_box textarea {
    resize: none;
    height: 6.875rem;
    font-size: 1.2rem;
    background-color: var(--white_color);
    padding: 0.5rem 0.625rem;
    border: 0.125rem solid var(--orange_color);
    box-shadow: 0rem 0rem 0.625rem 1rem var(--box_shadow);
}
.row_box textarea::-webkit-scrollbar {
    width: 0rem;
}
.row_box .select_box {
    height: 2.938rem;
    display: flex;
    padding: 0 0.625rem;
    align-items: center;
    border-radius: 0.313rem;
    justify-content: center;
    border: 0.125rem solid var(--orange_color);
    box-shadow: 0rem 0rem 0.625rem 1rem var(--box_shadow);
}
.row_box select {
    font-size: 1rem;
    background: none;
    cursor: pointer;
}
.row_box select::-webkit-scrollbar {
    width: 0.5rem;
}
.row_box select::-webkit-scrollbar-track {
    background: var(--white_color);
}
.row_box select::-webkit-scrollbar-thumb {
    background: var(--orange_color);
    border-radius: 0.5rem;
    border-right: 0.125rem solid var(--white_color);
}
.btnBox {
    display: flex;
    gap: 1rem;
}
button {
    height: 3.25rem;
    color: var(--white_color);
    font-size: 1rem;
    font-weight: 500;
    cursor: pointer;
    margin-top: 0.5rem;
    background: var(--orange_color);
    transition: 0.3s ease;
    padding: 1rem;
    box-shadow: 0rem 0rem 0.625rem 1rem var(--box_shadow);
}
button:hover {
    background: var(--orange_dark_color);
}
.hide {
    display: none;
}
.show {
    display: block;
}

The stylesheet covers the whole page:

Imports the Poppins font from Google Fonts and applies it to every element.
Defines custom properties for the colors, the background gradient, and the box shadow.
Styles the main container, header, heading, and the Speech To Text section.
Provides the .hide and .show utility classes that script.js uses to toggle the action buttons.
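
The .hide and .show classes at the end of the stylesheet are what script.js switches between to reveal or hide the action buttons, using a paired add/remove call for each button. As a minimal sketch, those pairs could be folded into one helper. The names setVisible and setAllVisible are hypothetical and not part of the original code.

```javascript
// Hypothetical helper: swaps the stylesheet's .hide/.show classes on one element.
// Works with any object exposing a DOM-style classList (add/remove).
function setVisible(element, visible) {
  element.classList.add(visible ? "show" : "hide");
  element.classList.remove(visible ? "hide" : "show");
}

// Hypothetical helper: applies the same visibility to several elements at once.
function setAllVisible(elements, visible) {
  elements.forEach((element) => setVisible(element, visible));
}
```

In the browser this could be called as setAllVisible([clearBtn, downloadBtn, copyBtn], true) once a transcript arrives, and with false when the text is cleared.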

JavaScript

script.js

// Speech To Text Variables
const recordBtn = document.querySelector("#convert_speech"),
  result = document.querySelector("textarea"),
  downloadBtn = document.querySelector("#downloadBtn"),
  copyBtn = document.querySelector("#copyBtn"),
  inputLanguage = document.querySelector("#language"),
  clearBtn = document.querySelector("#clearBtn");

let SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition,
    recognition,
    recording = false;

// Getting Languages Option Function   
function languagesOption() {
  languages.forEach((lang) => {
    const option = document.createElement("option");
    option.value = lang.code;
    option.innerHTML = lang.name;
    inputLanguage.appendChild(option);
  });
}

languagesOption();

// Speech To Text Function
function speechToText() {
  try {
    recognition = new SpeechRecognition();
    recognition.lang = inputLanguage.value;
    recognition.interimResults = true;
    recordBtn.innerHTML = "Listening to Your Speech...";
    recognition.start();
    recognition.onresult = (event) => {
      const speechResult = event.results[0][0].transcript;
      if (event.results[0].isFinal) {
        // Append as plain text so the transcript is never parsed as HTML
        result.textContent += speechResult + " ";
      } 
     clearBtn.classList.add("show");
     clearBtn.classList.remove("hide");
     downloadBtn.classList.add("show");
     downloadBtn.classList.remove("hide");
     copyBtn.classList.add("show");
     copyBtn.classList.remove("hide");
    };
    recognition.onspeechend = () => {
      speechToText();
    };
    recognition.onerror = (event) => {
      stopRecording();
      if (event.error === "no-speech") {
        alert("No speech was detected. Stopping...");
      } else if (event.error === "audio-capture") {
        alert(
          "No microphone was found. Ensure that a microphone is installed."
        );
      } else if (event.error === "not-allowed") {
        alert("Permission to use microphone is blocked.");
      } else if (event.error === "aborted") {
        alert("Listening Stopped.");
      } else {
        alert("Error occurred in recognition: " + event.error);
      }
    };
  } catch (error) {
    recording = false;

    console.log(error);
  }
}

// Converting Button Function
recordBtn.addEventListener("click", () => {
  if (!recording) {
    // Set the flag first so a failure inside speechToText() can reset it
    recording = true;
    speechToText();
  } else {
    stopRecording();
  }
});

// Stop Converting Button Function
function stopRecording() {
  recognition.stop();
  recordBtn.innerHTML = "Start Converting";
  recording = false;
}

// Hiding Buttons Function
function hideBtns() {
    result.textContent = "";
    clearBtn.classList.add("hide");
    clearBtn.classList.remove("show");
    downloadBtn.classList.add("hide");
    downloadBtn.classList.remove("show");
    copyBtn.classList.add("hide");
    copyBtn.classList.remove("show");
}

// Download Button Function
function download() {
  // Read textContent so characters such as "&" are not saved as HTML entities
  const text = result.textContent;
  const filename = "speech-cosas-learning.txt";

  const element = document.createElement("a");
  element.setAttribute(
    "href",
    "data:text/plain;charset=utf-8," + encodeURIComponent(text)
  );
  element.setAttribute("download", filename);
  element.style.display = "none";
  document.body.appendChild(element);
  element.click();
  document.body.removeChild(element);
  hideBtns();
}

downloadBtn.addEventListener("click", download);

// Clear Button Function
clearBtn.addEventListener("click", () => {
    hideBtns();
});

// Copy Button Function 
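copyBtn.addEventListener("click", () => {
    // Read textContent so characters such as "&" are not copied as HTML entities
    navigator.clipboard.writeText(result.textContent);
    copyBtn.innerHTML = "Copied!";
    // Restore the button's original label after one second
    setTimeout(() => copyBtn.innerText = "Copy Text", 1000);
});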

script.js implements the functionality of the Speech To Text converter using the Web Speech API:

languagesOption(): Populates the language selection dropdown from the languages array.
speechToText(): Starts speech recognition and appends each final transcript to the textarea.
stopRecording(): Stops speech recognition and resets the button label.
hideBtns(): Clears the textarea and hides the Copy, Download, and Clear buttons.
download(): Saves the text as speech-cosas-learning.txt, then clears the result.
Event listeners connect the buttons to start/stop conversion, clear text, copy text, and download text.
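
speechToText() reads only event.results[0][0] and appends it once isFinal is set. If recognition.continuous were enabled, a single event could carry several results, so a helper that walks the whole list may be useful. The sketch below illustrates that idea under stated assumptions: collectFinalTranscript is a hypothetical name, and plain arrays stand in for the browser's result list so the code can run outside a browser.

```javascript
// Joins the top alternative of every final result into one string.
// `results` is expected to be array-like and shaped like SpeechRecognitionEvent.results:
// each entry has an `isFinal` flag and indexed alternatives with a `transcript`.
function collectFinalTranscript(results) {
  const parts = [];
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    // Skip interim results; they may still change before becoming final.
    if (result.isFinal && result[0]) {
      parts.push(result[0].transcript.trim());
    }
  }
  return parts.join(" ");
}

// Stand-in data for illustration only; a real event comes from the browser.
const sampleResults = [
  Object.assign([{ transcript: "hello world" }], { isFinal: true }),
  Object.assign([{ transcript: "this is still chang" }], { isFinal: false }),
];
console.log(collectFinalTranscript(sampleResults)); // prints "hello world"
```

Inside the onresult handler, the same function could be called with event.results directly, since it only relies on length, index access, and isFinal.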

languages-option.js

const languages = [
  {
    no: "16",
    name: "English",
    native: "English",
    code: "en",
  },
  {
    no: "1",
    name: "Afrikaans",
    native: "Afrikaans",
    code: "af",
  },
  {
    no: "2",
    name: "Albanian",
    native: "Shqip",
    code: "sq",
  },
  {
    no: "3",
    name: "Arabic",
    native: "عربي",
    code: "ar",
  },
  {
    no: "4",
    name: "Armenian",
    native: "Հայերէն",
    code: "hy",
  },
  {
    no: "5",
    name: "Azerbaijani",
    native: "آذربایجان دیلی",
    code: "az",
  },
  {
    no: "6",
    name: "Basque",
    native: "Euskara",
    code: "eu",
  },
  {
    no: "7",
    name: "Belarusian",
    native: "Беларуская",
    code: "be",
  },
  {
    no: "8",
    name: "Bulgarian",
    native: "Български",
    code: "bg",
  },
  {
    no: "9",
    name: "Catalan",
    native: "Català",
    code: "ca",
  },
  {
    no: "10",
    name: "Chinese (Simplified)",
    native: "中文简体",
    code: "zh-CN",
  },
  {
    no: "11",
    name: "Chinese (Traditional)",
    native: "中文繁體",
    code: "zh-TW",
  },
  {
    no: "12",
    name: "Croatian",
    native: "Hrvatski",
    code: "hr",
  },
  {
    no: "13",
    name: "Czech",
    native: "Čeština",
    code: "cs",
  },
  {
    no: "14",
    name: "Danish",
    native: "Dansk",
    code: "da",
  },
  {
    no: "15",
    name: "Dutch",
    native: "Nederlands",
    code: "nl",
  },
  {
    no: "17",
    name: "Estonian",
    native: "Eesti keel",
    code: "et",
  },
  {
    no: "18",
    name: "Filipino",
    native: "Filipino",
    code: "tl",
  },
  {
    no: "19",
    name: "Finnish",
    native: "Suomi",
    code: "fi",
  },
  {
    no: "20",
    name: "French",
    native: "Français",
    code: "fr",
  },
  {
    no: "21",
    name: "Galician",
    native: "Galego",
    code: "gl",
  },
  {
    no: "22",
    name: "Georgian",
    native: "ქართული",
    code: "ka",
  },
  {
    no: "23",
    name: "German",
    native: "Deutsch",
    code: "de",
  },
  {
    no: "24",
    name: "Greek",
    native: "Ελληνικά",
    code: "el",
  },
  {
    no: "25",
    name: "Haitian Creole",
    native: "Kreyòl ayisyen",
    code: "ht",
  },
  {
    no: "26",
    name: "Hebrew",
    native: "עברית",
    code: "iw",
  },
  {
    no: "27",
    name: "Hindi",
    native: "हिन्दी",
    code: "hi",
  },
  {
    no: "28",
    name: "Hungarian",
    native: "Magyar",
    code: "hu",
  },
  {
    no: "29",
    name: "Icelandic",
    native: "Íslenska",
    code: "is",
  },
  {
    no: "30",
    name: "Indonesian",
    native: "Bahasa Indonesia",
    code: "id",
  },
  {
    no: "31",
    name: "Irish",
    native: "Gaeilge",
    code: "ga",
  },
  {
    no: "32",
    name: "Italian",
    native: "Italiano",
    code: "it",
  },
  {
    no: "33",
    name: "Japanese",
    native: "日本語",
    code: "ja",
  },
  {
    no: "34",
    name: "Korean",
    native: "한국어",
    code: "ko",
  },
  {
    no: "35",
    name: "Latvian",
    native: "Latviešu",
    code: "lv",
  },
  {
    no: "36",
    name: "Lithuanian",
    native: "Lietuvių kalba",
    code: "lt",
  },
  {
    no: "37",
    name: "Macedonian",
    native: "Македонски",
    code: "mk",
  },
  {
    no: "38",
    name: "Malay",
    native: "Malay",
    code: "ms",
  },
  {
    no: "39",
    name: "Maltese",
    native: "Malti",
    code: "mt",
  },
  {
    no: "40",
    name: "Norwegian",
    native: "Norsk",
    code: "no",
  },
  {
    no: "41",
    name: "Persian",
    native: "فارسی",
    code: "fa",
  },
  {
    no: "42",
    name: "Polish",
    native: "Polski",
    code: "pl",
  },
  {
    no: "43",
    name: "Portuguese",
    native: "Português",
    code: "pt",
  },
  {
    no: "44",
    name: "Romanian",
    native: "Română",
    code: "ro",
  },
  {
    no: "45",
    name: "Russian",
    native: "Русский",
    code: "ru",
  },
  {
    no: "46",
    name: "Serbian",
    native: "Српски",
    code: "sr",
  },
  {
    no: "47",
    name: "Slovak",
    native: "Slovenčina",
    code: "sk",
  },
  {
    no: "48",
    name: "Slovenian",
    native: "Slovensko",
    code: "sl",
  },
  {
    no: "49",
    name: "Spanish",
    native: "Español",
    code: "es",
  },
  {
    no: "50",
    name: "Swahili",
    native: "Kiswahili",
    code: "sw",
  },
  {
    no: "51",
    name: "Swedish",
    native: "Svenska",
    code: "sv",
  },
  {
    no: "52",
    name: "Thai",
    native: "ไทย",
    code: "th",
  },
  {
    no: "53",
    name: "Turkish",
    native: "Türkçe",
    code: "tr",
  },
  {
    no: "54",
    name: "Ukrainian",
    native: "Українська",
    code: "uk",
  },
  {
    no: "55",
    name: "Urdu",
    native: "اردو",
    code: "ur",
  },
  {
    no: "56",
    name: "Vietnamese",
    native: "Tiếng Việt",
    code: "vi",
  },
  {
    no: "57",
    name: "Welsh",
    native: "Cymraeg",
    code: "cy",
  },
  {
    no: "58",
    name: "Yiddish",
    native: "ייִדיש",
    code: "yi",
  },
];

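languages-option.js defines languages, an array of objects with one entry per language. Each object holds a sequence number (no), the English name (name), the native name (native), and the language code (code) that script.js assigns to recognition.lang. English is listed first, so it is the option selected by default.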
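
script.js uses only the name and code fields of each entry, so the native field is currently unused. As a rough sketch, assuming one wanted to show native names in the dropdown and look entries up by code, two small helpers might look like this. toOptionLabel and indexByCode are hypothetical names, and sampleLanguages is a short excerpt copied from the array so the snippet runs on its own.

```javascript
// Small excerpt of the `languages` array, copied here so the sketch is self-contained.
const sampleLanguages = [
  { no: "16", name: "English", native: "English", code: "en" },
  { no: "20", name: "French", native: "Français", code: "fr" },
  { no: "33", name: "Japanese", native: "日本語", code: "ja" },
];

// Hypothetical helper: builds a "Name (Native)" label, skipping the native
// name when it is identical to the English one.
function toOptionLabel(lang) {
  return lang.name === lang.native ? lang.name : `${lang.name} (${lang.native})`;
}

// Hypothetical helper: indexes the entries by language code for quick lookup.
function indexByCode(list) {
  return new Map(list.map((lang) => [lang.code, lang]));
}

const byCode = indexByCode(sampleLanguages);
console.log(toOptionLabel(byCode.get("fr"))); // prints "French (Français)"
console.log(toOptionLabel(byCode.get("en"))); // prints "English"
```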

The web page uses HTML for structure, CSS for styling, and JavaScript for functionality. It incorporates the Web Speech API to convert spoken language to text. Users can choose a language, start/stop speech recognition, and perform actions like copying, downloading, and clearing the resulting text. The language options are stored in a separate JavaScript file for modularity.
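
Browser support for the recognition part of the Web Speech API varies, and script.js falls back to the webkit-prefixed constructor without checking that either name exists. A minimal sketch of an explicit check is shown below. getSpeechRecognition is a hypothetical helper, and it takes the global object as a parameter so it can be exercised outside a browser.

```javascript
// Returns the SpeechRecognition constructor exposed by `globalObject`,
// or null when neither the standard nor the webkit-prefixed name exists.
function getSpeechRecognition(globalObject) {
  return globalObject.SpeechRecognition || globalObject.webkitSpeechRecognition || null;
}

// Hypothetical usage: pass `window` in a browser; globalThis keeps this runnable elsewhere.
const Recognition = getSpeechRecognition(globalThis);
if (Recognition === null) {
  console.log("Speech recognition is not available in this environment.");
}
```

With a check like this, the page could disable the conversion button or show a message up front instead of relying on the try/catch inside speechToText().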

In summary, the project combines HTML, CSS, and JavaScript into a small, browser-based Speech To Text converter with a simple interface.