Speech Recognition and Voice Input: The HTML5 Web Speech Recognition API

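To use the HTML5 Web Speech Recognition API, you first create a SpeechRecognition object. SpeechRecognition is still at the draft stage,
and at the time this article was published it could in practice only be created in Google Chrome. The cross-browser way of creating it is still shown below.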

var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var speechRecognition = new SpeechRecognition();

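This lets the same code create a SpeechRecognition object whether the browser exposes the standard name or the webkit-prefixed one.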
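The fallback above can also be wrapped in a small helper that reports when neither name exists. This is only a sketch: the function name `getSpeechRecognition` and its `scope` parameter are assumptions made for illustration, not part of the API.

```javascript
// Return the SpeechRecognition constructor exposed by `scope`, or null if none exists.
// `scope` is normally `window`; it is a parameter (an assumption of this sketch)
// so the function can also be tried outside a browser.
function getSpeechRecognition(scope){
    return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

var Recognition = getSpeechRecognition(typeof window !== "undefined" ? window : {});
if (Recognition){
    var recognizer = new Recognition();
    console.log("Speech recognition is available.");
} else {
    console.log("Speech recognition is not available in this environment.");
}
```

Checking for null before calling `new` avoids a TypeError in browsers that expose neither constructor.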

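Several SpeechRecognition properties affect how accurate speech recognition is.

SpeechRecognition.continuous controls whether results are returned continuously. The default is false.
When set to true, multiple results are returned; accuracy is higher, but recognition takes longer.
When set to false, only one result is returned; accuracy is lower, but recognition takes less time.
SpeechRecognition.grammars sets the grammars to use, written in SRGS (Speech Recognition Grammar Specification).
Unless there is a specific need, leave it at the default.
SpeechRecognition.interimResults controls whether interim results are returned. The default is false.
When set to true, the recognized content is available in real time while recognition is still in progress.

SpeechRecognition.lang sets the language to recognize. It defaults to the lang of the HTML document, or failing that the user's system language.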

Language	Language code
Afrikaans	af-ZA
Amharic	am-ET
Azerbaijani	az-AZ
Bengali (Bangladesh)	bn-BD
Bengali (India)	bn-IN
Indonesian	id-ID
Malay	ms-MY
Catalan	ca-ES
Czech	cs-CZ
Danish	da-DK
German	de-DE
English (Australia)	en-AU
English (Canada)	en-CA
English (India)	en-IN
English (Kenya)	en-KE
English (Tanzania)	en-TZ
English (Ghana)	en-GH
English (New Zealand)	en-NZ
English (Nigeria)	en-NG
English (South Africa)	en-ZA
English (Philippines)	en-PH
English (United Kingdom)	en-GB
English (United States)	en-US
Spanish (Argentina)	es-AR
Spanish (Bolivia)	es-BO
Spanish (Chile)	es-CL
Spanish (Colombia)	es-CO
Spanish (Costa Rica)	es-CR
Spanish (Ecuador)	es-EC
Spanish (El Salvador)	es-SV
Spanish (Spain)	es-ES
Spanish (United States)	es-US
Spanish (Guatemala)	es-GT
Spanish (Honduras)	es-HN
Spanish (Mexico)	es-MX
Spanish (Nicaragua)	es-NI
Spanish (Panama)	es-PA
Spanish (Paraguay)	es-PY
Spanish (Peru)	es-PE
Spanish (Puerto Rico)	es-PR
Spanish (Dominican Republic)	es-DO
Spanish (Uruguay)	es-UY
Spanish (Venezuela)	es-VE
Basque	eu-ES
Filipino	fil-PH
French	fr-FR
Javanese	jv-ID
Galician	gl-ES
Gujarati	gu-IN
Croatian	hr-HR
Zulu	zu-ZA
Icelandic	is-IS
Italian (Italy)	it-IT
Italian (Switzerland)	it-CH
Kannada	kn-IN
Khmer	km-KH
Latvian	lv-LV
Lithuanian	lt-LT
Malayalam	ml-IN
Marathi	mr-IN
Hungarian	hu-HU
Lao	lo-LA
Dutch	nl-NL
Nepali	ne-NP
Norwegian	nb-NO
Polish	pl-PL
Portuguese (Brazil)	pt-BR
Portuguese (Portugal)	pt-PT
Romanian	ro-RO
Sinhala	si-LK
Slovenian	sl-SI
Sundanese	su-ID
Slovak	sk-SK
Finnish	fi-FI
Swedish	sv-SE
Swahili (Tanzania)	sw-TZ
Swahili (Kenya)	sw-KE
Georgian	ka-GE
Armenian	hy-AM
Tamil (India)	ta-IN
Tamil (Singapore)	ta-SG
Tamil (Sri Lanka)	ta-LK
Tamil (Malaysia)	ta-MY
Telugu	te-IN
Vietnamese	vi-VN
Turkish	tr-TR
Urdu (Pakistan)	ur-PK
Urdu (India)	ur-IN
Greek	el-GR
Bulgarian	bg-BG
Russian	ru-RU
Serbian	sr-RS
Ukrainian	uk-UA
Korean	ko-KR
Mandarin (China)	cmn-Hans-CN
Mandarin (Hong Kong)	cmn-Hans-HK
Mandarin (Taiwan)	cmn-Hant-TW
Cantonese (Hong Kong)	yue-Hant-HK
Japanese	ja-JP
Hindi	hi-IN
Thai	th-TH
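
When the language comes from user input, such as a drop-down or a saved preference, it may help to check it against the codes the page actually offers before assigning it to lang. The sketch below assumes a hypothetical `LANGUAGE_CODES` array holding a small excerpt of the table and a hypothetical helper `chooseLanguage`; neither is part of the API.

```javascript
// A small excerpt of the table above; extend it with whichever codes the page offers.
var LANGUAGE_CODES = ["en-US", "en-GB", "ja-JP", "cmn-Hans-CN", "cmn-Hant-TW", "yue-Hant-HK"];

// Return the listed code matching `requested` (ignoring letter case),
// or `fallback` when the requested code is not in the list.
function chooseLanguage(requested, fallback){
    var wanted = String(requested || "").toLowerCase();
    for (var i = 0; i < LANGUAGE_CODES.length; i++){
        if (LANGUAGE_CODES[i].toLowerCase() === wanted){
            return LANGUAGE_CODES[i];
        }
    }
    return fallback;
}

console.log(chooseLanguage("YUE-hant-hk", "en-US")); // prints: yue-Hant-HK
console.log(chooseLanguage("xx-XX", "en-US"));       // prints: en-US
```

The result could then be assigned with `speechRecognition.lang = chooseLanguage(value, "en-US");`.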

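SpeechRecognition.maxAlternatives: because speech recognition can produce several candidate transcriptions of the same speech, this sets the maximum number of alternatives returned per result. The default is 1.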
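
When maxAlternatives is greater than 1, each result carries several alternatives, and each alternative exposes a transcript and a numeric confidence. A minimal sketch of picking the most confident one follows; the helper name `bestAlternative` and the sample data are made up for illustration, and a plain array stands in for a real result object.

```javascript
// Return the alternative with the highest confidence from an array-like result,
// or null when the result is empty.
function bestAlternative(result){
    var best = null;
    for (var j = 0; j < result.length; j++){
        if (best === null || result[j].confidence > best.confidence){
            best = result[j];
        }
    }
    return best;
}

// Stand-in data shaped like a SpeechRecognitionResult (illustrative values only).
var sampleResult = [
    { transcript: "wreck a nice beach", confidence: 0.41 },
    { transcript: "recognize speech", confidence: 0.92 }
];
console.log(bestAlternative(sampleResult).transcript); // prints: recognize speech
```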

With these basics covered, SpeechRecognition can be configured:

speechRecognition.continuous = true;
speechRecognition.interimResults = true;
speechRecognition.lang = "yue-Hant-HK";
speechRecognition.maxAlternatives = 1;
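
The same settings could also be applied through one helper, so that the defaults described earlier are written down in a single place. This is a sketch under assumptions: `DEFAULT_OPTIONS` and `configureRecognizer` are hypothetical names, and a plain object stands in for a real SpeechRecognition instance so the snippet can run anywhere.

```javascript
// Defaults as described above; lang is left out so the browser
// falls back to the page or system language.
var DEFAULT_OPTIONS = { continuous: false, interimResults: false, maxAlternatives: 1 };

// Copy the merged settings onto `recognizer` and return it.
function configureRecognizer(recognizer, options){
    var settings = Object.assign({}, DEFAULT_OPTIONS, options);
    recognizer.continuous = settings.continuous;
    recognizer.interimResults = settings.interimResults;
    recognizer.maxAlternatives = settings.maxAlternatives;
    if (settings.lang){
        recognizer.lang = settings.lang;
    }
    return recognizer;
}

// A plain object stands in for a SpeechRecognition instance here.
var configured = configureRecognizer({}, { continuous: true, lang: "yue-Hant-HK" });
console.log(configured);
```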

Once SpeechRecognition is configured, you can call

SpeechRecognition.start()
to start speech recognition
SpeechRecognition.stop()
to stop speech recognition
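
Calling start() on a recognizer that has already been started throws an InvalidStateError, so a thin wrapper that remembers whether recognition is active can be useful. The sketch below is one possible shape; `createController` is a hypothetical name, and it relies only on the recognizer's start(), stop(), and addEventListener() methods.

```javascript
// Wrap a recognizer so repeated start() or stop() calls are ignored.
function createController(recognizer){
    var active = false;
    // The end event fires whenever recognition stops, whether requested or automatic.
    recognizer.addEventListener("end", function(){
        active = false;
    });
    return {
        isActive: function(){
            return active;
        },
        // Returns true if recognition was started, false if it was already running.
        start: function(){
            if (active){
                return false;
            }
            recognizer.start();
            active = true;
            return true;
        },
        // Returns true if a stop was requested, false if nothing was running.
        stop: function(){
            if (!active){
                return false;
            }
            recognizer.stop();
            return true;
        }
    };
}
```

In a page this might be used as `var controller = createController(speechRecognition); controller.start();`.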

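After recognition starts, event handlers are still needed to receive the recognized content.

start event
Fired when speech recognition starts
result event
Fired while recognition is in progress; the handler receives a SpeechRecognitionEvent
When SpeechRecognition receives no valid speech for 8 consecutive seconds, it stops automatically
end event
Fired when speech recognition stops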

Setting an event handler
When it is set as a property, the event name needs the prefix on, that is:

speechRecognition.onstart = function(){
// do something
};

When it is set with an event listener, the prefix is not needed:

speechRecognition.addEventListener("start", function(){
// do something
});

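The key one is the result event handler.
Once SpeechRecognition has started, SpeechRecognitionEvent objects keep arriving through result.
SpeechRecognitionEvent.results is a SpeechRecognitionResultList that holds the recognized data.

SpeechRecognitionResultList is an array-like list of SpeechRecognitionResult objects.
SpeechRecognitionResult.isFinal tells whether the current content is a final result.
Each SpeechRecognitionResult holds SpeechRecognitionAlternative objects, the alternative transcriptions produced during recognition.
SpeechRecognitionAlternative.transcript is the recognized text.
They can be read like this: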

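speechRecognition.addEventListener("result", function(event){
    // results and each result are array-like, so index them;
    // for...in would also visit "length" and "item".
    for (var i = 0; i < event.results.length; i++){
        for (var j = 0; j < event.results[i].length; j++){
            if (event.results[i].isFinal){
                // do something with event.results[i][j].transcript when it is final
            } else {
                // do something with event.results[i][j].transcript when it is not final
            }
        }
    }
});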
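
As a concrete version of the "do something" placeholders, the sketch below gathers final and interim text into two strings, using only the first alternative of each result. The helper name `collectTranscripts` and the sample data are assumptions for illustration; arrays with an added isFinal property stand in for real result objects.

```javascript
// Split the transcripts in an array-like results list into final and interim text.
function collectTranscripts(results){
    var finalText = "";
    var interimText = "";
    for (var i = 0; i < results.length; i++){
        var transcript = results[i][0].transcript; // first alternative only
        if (results[i].isFinal){
            finalText += transcript;
        } else {
            interimText += transcript;
        }
    }
    return { finalText: finalText, interimText: interimText };
}

// Stand-in data shaped like event.results (illustrative values only).
var sampleResults = [
    Object.assign([{ transcript: "hello " }], { isFinal: true }),
    Object.assign([{ transcript: "wor" }], { isFinal: false })
];
console.log(collectTranscripts(sampleResults));
```

Inside a result handler, `collectTranscripts(event.results)` would give the text to commit and the text to show as a preview.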

Reference code

JavaScript

function convertToPunctuation(string){
    // Spoken punctuation names (keys) and the marks that replace them (values)
    var punctuations = {
        "斷行號": "\n",
        "逗號": "，",
        "句號": "。",
        "頓號": "、",
        "冒號": "：",
        "分號": "；",
        "問號": "？",
        "感嘆號": "！",
        "破折號": "——",
        "省略號": "……",
        "開括號": "（",
        "關括號": "）",
        "開引號": "「",
        "關引號": "」",
        "開雙引號": "『",
        "關雙引號": "』",
        "開書名號": "《",
        "關書名號": "》"
        // Add further automatic word-to-punctuation mappings here
    };
    for (var i in punctuations){
        // Replace every occurrence of the spoken name with its mark
        string = string.split(i).join(punctuations[i]);
    }
    return string;
}
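
The replacement step above could also be done in a single pass with one regular expression, which avoids re-scanning text that an earlier replacement has already produced. The sketch below is an alternative, not the author's method; `buildReplacer` and `escapeRegExp` are hypothetical helpers, and the table passed in is a three-entry excerpt.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text){
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a function that replaces every key of `table` with its value in one pass.
function buildReplacer(table){
    // Longer keys first, so a key that contains a shorter key is matched whole.
    var keys = Object.keys(table).sort(function(a, b){
        return b.length - a.length;
    });
    if (keys.length === 0){
        return function(text){ return text; };
    }
    var pattern = new RegExp(keys.map(escapeRegExp).join("|"), "g");
    return function(text){
        return text.replace(pattern, function(match){
            return table[match];
        });
    };
}

var toPunctuation = buildReplacer({ "逗號": "，", "句號": "。", "問號": "？" });
console.log(toPunctuation("你好逗號世界句號")); // prints: 你好，世界。
```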

var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
var speechRecognition; // created on load when the API is available

window.addEventListener("load", function(){
    if (SpeechRecognition){
        speechRecognition = new SpeechRecognition();
        speechRecognition.continuous = true;
        speechRecognition.interimResults = true;

        speechRecognition.addEventListener("start", function(){
            console.log(new Date());
            document.getElementById("toggle").value = "Stop Speech Recognition";
        });

        speechRecognition.addEventListener("result", function(event){
            var bufferContainer = document.getElementById("bufferContainer");
            var resultContainer = document.getElementById("resultContainer");
            var resultList = event.results;
            for (var i = 0; i < resultList.length; i++){
                var result = resultList.item(i);
                try{
                    var alternative = result.item(0);
                    var text = convertToPunctuation(alternative.transcript);
                    // Show the committed text plus the current, possibly interim, transcript
                    bufferContainer.value = resultContainer.value + text;
                } catch (ex){
                    console.log(ex);
                }
                if (result.isFinal){
                    // Stop on a final result; the end handler commits the buffer
                    this.stop();
                    break;
                }
            }
        });

        speechRecognition.addEventListener("end", function(){
            var bufferContainer = document.getElementById("bufferContainer");
            var resultContainer = document.getElementById("resultContainer");
            resultContainer.value = bufferContainer.value;
            var toggle = document.getElementById("toggle");
            var autoResume = document.getElementById("autoResume");
            // Restart automatically unless the user pressed the stop button
            if (toggle.value == "Stop Speech Recognition" && autoResume.checked){
                this.start();
            }
        });
    }
});

function toggleSpeechRecognition(){
    if (SpeechRecognition){
        var toggle = document.getElementById("toggle");
        if (toggle.value == "Stop Speech Recognition"){
            // Reset the label first so the end handler does not auto-resume
            toggle.value = "Start Speech Recognition";
            speechRecognition.stop();
        } else {
            speechRecognition.lang = document.getElementById("language").value;
            speechRecognition.start();
        }
    } else {
        window.alert("This browser does not support Web Speech Recognition API.");
    }
}

function selectAllText(element){
    element.select();
}

function clearAllText(element){
    element.value = "";
}

function clearContainer(message){
    // Ask for confirmation before clearing both text areas
    if (window.confirm(message)){
        clearAllText(document.getElementById('bufferContainer'));
        clearAllText(document.getElementById('resultContainer'));
    }
}

HTML

</select>
<label>Continuously<input id="autoResume" type="checkbox" checked="checked"/></label>
</colgroup>
<thead>
<tr>
</tr>
</thead>
<tbody>
</tr>
</tbody>
