Searching in audio files

Posted on 18-Jul-2015


Audio search

Andrzej Dudziec

Outline

● Introduction

● Speech recognition

● Phonetic algorithms

● Evaluation

● Results

● Conclusions

Introduction

[Diagram: audio → text]

Speech recognition

● Words consist of letters, e.g. ‘ONE’ - ‘O’, ‘N’, ‘E’
● Speech consists of phonemes, e.g. /wʌn/ - ‘W’, ‘AH’, ‘N’

Speech recognition

[Diagram: audio → AM (acoustic model) → phonemes]

Speech recognition

● one W AH N
● two T UW
● three TH R IY
● four F AO R
● five F AY V
● six S IH K S
● seven S EH V AH N
● eight EY T
● nine N AY N
● ten T EH N

[Diagram: audio → AM (acoustic model) → phonemes → dict (pronunciation dictionary) → words → LM (language model) → sentences]
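The dictionary stage can be pictured as a plain lookup table. The sketch below is only an illustration, assuming the ten entries from the word list above; the names `PRONUNCIATIONS` and `to_phonemes` are made up for this example and do not come from any particular recognizer.

```python
# Pronunciation dictionary taken from the word list on the slide.
PRONUNCIATIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
    "three": ["TH", "R", "IY"],
    "four": ["F", "AO", "R"],
    "five": ["F", "AY", "V"],
    "six": ["S", "IH", "K", "S"],
    "seven": ["S", "EH", "V", "AH", "N"],
    "eight": ["EY", "T"],
    "nine": ["N", "AY", "N"],
    "ten": ["T", "EH", "N"],
}


def to_phonemes(sentence):
    """Return the flat phoneme sequence for a sentence of known words.

    Hypothetical helper: raises KeyError for words missing from the dictionary.
    """
    phonemes = []
    for word in sentence.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes


if __name__ == "__main__":
    print(to_phonemes("one two three"))
```

A real recognizer works in the opposite direction (from phonemes to candidate words), which is where the ambiguities listed under "Issues" come in.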


Issues

● Acoustic level
  ○ background noise
  ○ multiple speakers
  ○ accent, dialect, sex, mood
  ○ coarticulation

● Dictionary level
  ○ homonyms (be & bee, I scream & ice cream)

Phonetic algorithms

Thompson -> thompson
thompson -> th3mps3n
th3mps3n -> th3mpS3n
th3mpS3n -> Th3mpS3n
Th3mpS3n -> Th3mPS3n
Th3mPS3n -> Th3MPS3n
Th3MPS3n -> Th3MPS3N
Th3MPS3N -> T23MPS3N
T23MPS3N -> TMPSN
TMPSN111111 -> TMPSN1

Word      Soundex   Metaphone   Caverphone
sixteen   S235      SKST        SKTN11
sixty     S230      SKST        SKTA11

● Soundex
● Metaphone
● Caverphone

Phonetic algorithms

Name        Soundex
Ackermann   A265
Azuron      A265
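The collision above can be reproduced with a few lines of code. This is a minimal sketch of classic American Soundex (four characters, H and W transparent between consonants), not necessarily the variant used for the talk; the function name `soundex` is chosen for this example.

```python
# Letter groups of classic Soundex; vowels, H, W and Y carry no digit.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of a word ('' for empty input)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate equally coded consonants
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Ackermann", "Azuron", "sixteen", "sixty"):
        print(name, soundex(name))
```

Both surnames end up with the same code because Soundex keeps only the first letter and three consonant classes, which is why such codes are better used as a filter than as a final match.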

Metaphone code computation algorithm

Remove all repeated adjacent letters, except the letter C.

Transform the beginning of the word using the following rules:

KN → N

GN → N

PN → N

AE → E

WR → R

Remove the letter B at the end of the word if it follows M.

Replace C using the rules below:

With X: CIA → XIA, SCH → SKH, CH → XH

With S: CI → SI, CE → SE, CY → SY

With K: C → K

Replace D using the following rules:

With J: DGE → JGE, DGY → JGY, DGI → JGI

With T: D → T

Replace GH → H, except when it is at the end or before a vowel.

Replace GN → N and GNED → NED, if they are at the end.

Replace G using the following rules:

With J: GI → JI, GE → JE, GY → JY

With K: G → K

Remove all H after a vowel but not before a vowel.

Perform the following transformations:

CK → K

PH → F

Q → K

V → F

Z → S

Replace S with X:

SH → XH

SIO → XIO

SIA → XIA

Replace T using the following rules:

With X: TIA → XIA, TIO → XIO

With 0: TH → 0

Remove T: TCH → CH

Transform WH → W at the beginning. Remove W if there is no vowel after it.

If X is at the beginning, then replace X → S, else replace X → KS.

Remove all Y which are not before a vowel.

Remove all vowels except vowel at the start of the word.
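To give a feel for how these rules translate into code, here is a deliberately partial sketch that applies only the first three rules from the list (collapsing doubled letters, the word-initial rewrites, and the trailing MB rule). It is not a complete Metaphone implementation, and the function name `metaphone_first_steps` is hypothetical.

```python
import re

# Word-initial rewrites from the rule list: KN, GN, PN → N; AE → E; WR → R.
INITIAL_RULES = (("KN", "N"), ("GN", "N"), ("PN", "N"), ("AE", "E"), ("WR", "R"))


def metaphone_first_steps(word):
    """Apply only the first three Metaphone rules; the remaining rules are omitted."""
    w = word.upper()
    # Rule 1: collapse runs of the same character, except runs of C.
    w = re.sub(r"([^C])\1+", r"\1", w)
    # Rule 2: rewrite the beginning of the word (at most one rule applies).
    for src, dst in INITIAL_RULES:
        if w.startswith(src):
            w = dst + w[len(src):]
            break
    # Rule 3: drop a final B that follows M.
    if w.endswith("MB"):
        w = w[:-1]
    return w


if __name__ == "__main__":
    for example in ("Knight", "Thumb", "Hello", "Accept"):
        print(example, "->", metaphone_first_steps(example))
```

The order of the steps matters: the consonant rules that follow in the list operate on the output of these normalisation steps.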

Daitch-Mokotoff Soundex

Letter combination | At the start | Before a vowel | Other
SCHTSCH, SCHTSH, SCHTCH, SHTCH, SHCH, SHTSH, STCH, STSCH, STRZ, STRS, STSH, SZCZ, SZCS | 2 | 4 | 4
SHT, SCHT, SCHD, ST, SZT, SHD, SZD, SD | 2 | 43 | 43
CSZ, CZS, CS, CZ, DRZ, DRS, DSH, DS, DZH, DZS, DZ, TRZ, TRS, TRCH, TSH, TTSZ, TTZ, TZS, TSZ, SZ, TTCH, TCH, TTSCH, ZSCH, ZHSH, SCH, SH, TTS, TC, TS, TZ, ZH, ZS | 4 | 4 | 4
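The table fragment can be read as a longest-match lookup. The sketch below covers only the three rows shown here, not the full Daitch-Mokotoff chart. For these rows the second and third code columns are identical, so the sketch only distinguishes the start of the word from any later position. The names `ROWS`, `TABLE` and `lookup` are illustrative.

```python
# (letter combinations, code at the start, code elsewhere) for the rows shown above.
ROWS = [
    ("SCHTSCH SCHTSH SCHTCH SHTCH SHCH SHTSH STCH STSCH STRZ STRS STSH SZCZ SZCS",
     "2", "4"),
    ("SHT SCHT SCHD ST SZT SHD SZD SD", "2", "43"),
    ("CSZ CZS CS CZ DRZ DRS DSH DS DZH DZS DZ TRZ TRS TRCH TSH TTSZ TTZ TZS TSZ SZ "
     "TTCH TCH TTSCH ZSCH ZHSH SCH SH TTS TC TS TZ ZH ZS", "4", "4"),
]

TABLE = {}
for combos, at_start, elsewhere in ROWS:
    for combo in combos.split():
        TABLE[combo] = (at_start, elsewhere)

MAX_LEN = max(len(combo) for combo in TABLE)


def lookup(word, pos):
    """Return (combination, code) for the longest table entry starting at pos, or None."""
    w = word.upper()
    for length in range(min(MAX_LEN, len(w) - pos), 0, -1):
        chunk = w[pos:pos + length]
        if chunk in TABLE:
            at_start, elsewhere = TABLE[chunk]
            return chunk, at_start if pos == 0 else elsewhere
    return None


if __name__ == "__main__":
    print(lookup("Szczecin", 0))
    print(lookup("Ostrow", 1))
```

Trying the longest combination first is what keeps SZCZ from being split into SZ followed by CZ.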


Evaluation

Results

help ≠ helped

hell ≠ heaven


[Diagram labels: audio snippets, preprocessing, XML, text]


Conclusions

● a good recognition model and audio preprocessing are crucial; consider the trade-off between speed and accuracy

● phonetic filtering increases recall but decreases precision

● phonetic filters as improvement, not standalone

● consider fuzzy search
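The recall/precision trade-off mentioned above can be made concrete with a small calculation. The document IDs below are made-up toy values chosen only to show the direction of the effect; they are not results from the talk, and `precision_recall` is a helper written for this example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items against the relevant ones."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data: IDs of audio snippets that truly contain the query.
    relevant = {1, 2, 3, 4}
    # Exact matching on the transcript finds few snippets, all of them correct.
    exact_hits = {1, 2}
    # Phonetic matching finds one more correct snippet but also false positives.
    phonetic_hits = {1, 2, 3, 7, 8, 9}

    print("exact   ", precision_recall(exact_hits, relevant))
    print("phonetic", precision_recall(phonetic_hits, relevant))
```

In this toy case recall rises from 0.5 to 0.75 while precision drops from 1.0 to 0.5, which matches the pattern described in the conclusions.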

Use cases

● audio archive

● looking up broadcast
  ○ opinion mining
  ○ collecting information

● voice control

● dictation
  ○ short notes
  ○ voice mail -> text messages

Discussion?