OSSSpeechKit was developed to provide easier accessibility options to apps.
Apple does not make it easy to get the right voice, nor do they provide a simple way of selecting a language or using speech to text. OSSSpeechKit makes the hassle of trying to find the right language go away.
- Swift 5.0 or higher
- iOS 13.0 or higher
- CocoaPods
The table below shows the original 37 languages first supported. Since v0.3.3, an additional 10 languages have been added.
English - Australian 🇦🇺 | Hebrew 🇮🇱 | Japanese 🇯🇵 | Romanian 🇷🇴 | Swedish 🇸🇪 | Norsk 🇳🇴 |
Portuguese - Brazilian 🇧🇷 | Hindi - Indian 🇮🇳 | Korean 🇰🇷 | Russian 🇷🇺 | Chinese - Taiwanese 🇹🇼 | Dutch - Belgium 🇧🇪 |
French - Canadian 🇨🇦 | Hungarian 🇭🇺 | Spanish - Mexican 🇲🇽 | Arabic - Saudi Arabian 🇸🇦 | Thai 🇹🇭 | French 🇫🇷 |
Chinese 🇨🇳 | Indonesian 🇮🇩 | Norwegian 🇳🇴 | Slovakian 🇸🇰 | Turkish 🇹🇷 | Finnish 🇫🇮 |
Chinese - Hong Kong 🇭🇰 | English - Irish 🇮🇪 | Polish 🇵🇱 | English - South African 🇿🇦 | English - United States 🇺🇸 | Danish 🇩🇰 |
Czech 🇨🇿 | Italian 🇮🇹 | Portuguese 🇵🇹 | Spanish 🇪🇸 | English 🇬🇧 | Dutch 🇳🇱 |
Greek 🇬🇷 |
OSSSpeechKit offers simple text to speech and speech to text in 47 different languages.
OSSSpeechKit is built on top of both the AVFoundation and Speech frameworks.
You can achieve text to speech or speech to text in as little as two lines of code.
The speech will play over the top of other sounds such as music.
OSSSpeechKit is available through CocoaPods. To install it, simply add the following line to your Podfile:
pod 'OSSSpeechKit'
These methods enable you to pass in a string and hear the text played back.
import OSSSpeechKit
.....
// Declare an instance of OSSSpeechKit
let speechKit = OSSSpeech.shared
// Set the voice you wish to use; the language cases are currently capitalised and named after the language or country
speechKit.voice = OSSVoice(quality: .enhanced, language: .Australian)
// Set the text in the language you have set
speechKit.speakText(text: "Hello, my name is OSSSpeechKit.")
import OSSSpeechKit
.....
// Declare an instance of OSSSpeechKit
let speechKit = OSSSpeech.shared
// Create a voice instance
let newVoice = OSSVoice()
// Set the language
newVoice.language = OSSVoiceEnum.Australian.rawValue
// Set the voice quality
newVoice.quality = .enhanced
// Set the voice of the speech kit
speechKit.voice = newVoice
// Initialise an utterance
let utterance = OSSUtterance(string: "Testing")
// Set the recognition task type
speechKit.recognitionTaskType = .dictation
// Set volume
utterance.volume = 0.5
// Set rate of speech
utterance.rate = 0.5
// Set the pitch
utterance.pitchMultiplier = 1.2
// Set speech utterance
speechKit.utterance = utterance
// Ask to speak
speechKit.speakText(text: utterance.speechString)
Currently speech to text is offered in a very simple format. Starting and stopping of recording is handled by the app.
SpeechKit implements delegates to handle the recording authorization, output of text and failure to record.
speechKit.delegate = self
// Call to start recording
speechKit.recordVoice()
// Call to end recording
speechKit.endVoiceRecording()
It is important that your Info.plist includes the following:
Privacy - Speech Recognition Usage Description
Privacy - Microphone Usage Description
Without these, you will not be able to access the microphone nor speech recognition.
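The two entries above are the display names Xcode shows; the raw keys stored in Info.plist are `NSSpeechRecognitionUsageDescription` and `NSMicrophoneUsageDescription`. As a minimal sketch, the check below reports which of them are missing or empty. The helper name `missingUsageKeys(in:)` is hypothetical and not part of OSSSpeechKit.

```swift
import Foundation

/// Raw Info.plist keys behind the two Xcode display names.
let requiredUsageKeys = [
    "NSSpeechRecognitionUsageDescription", // Privacy - Speech Recognition Usage Description
    "NSMicrophoneUsageDescription"         // Privacy - Microphone Usage Description
]

/// Hypothetical helper: returns the required keys that are absent,
/// not strings, or blank in the given dictionary.
func missingUsageKeys(in info: [String: Any]) -> [String] {
    requiredUsageKeys.filter { key in
        guard let value = info[key] as? String else { return true }
        return value.trimmingCharacters(in: .whitespaces).isEmpty
    }
}

// Check the running app's own Info.plist, for example in a debug build.
let missing = missingUsageKeys(in: Bundle.main.infoDictionary ?? [:])
if !missing.isEmpty {
    print("Info.plist is missing: \(missing.joined(separator: ", "))")
}
```

Running a check like this early in a debug build can surface a missing description before the system refuses access to the microphone or speech recognition.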
Handle returning authentication status to user - primary use is for non-authorized state.
func authorizationToMicrophone(withAuthentication type: OSSSpeechKitAuthorizationStatus)
When the microphone has finished accepting audio, this delegate will be called with the final best text output.
func didFinishListening(withText text: String)
If the speech recogniser and request fail to set up, this method will be called.
func didFailToCommenceSpeechRecording()
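One way to consume these callbacks is to collect them into simple state that a view can display. The sketch below is only an illustration: `StandInSpeechDelegate` and `StandInAuthorizationStatus` are local stand-ins (with hypothetical cases) so the snippet compiles without the library, and only the three method names and argument labels are taken from the list above. In an app you would conform to the library's own delegate protocol and use `OSSSpeechKitAuthorizationStatus` instead.

```swift
// Local stand-ins so this sketch compiles on its own.
// They are NOT the library's types; the enum cases are hypothetical.
enum StandInAuthorizationStatus { case authorized, denied }

protocol StandInSpeechDelegate: AnyObject {
    func authorizationToMicrophone(withAuthentication type: StandInAuthorizationStatus)
    func didFinishListening(withText text: String)
    func didFailToCommenceSpeechRecording()
}

/// Collects delegate callbacks into state a view could display.
final class TranscriptionHandler: StandInSpeechDelegate {
    private(set) var transcript = ""
    private(set) var statusMessage = ""

    func authorizationToMicrophone(withAuthentication type: StandInAuthorizationStatus) {
        // The primary use is reporting the non-authorized state to the user.
        if type != .authorized {
            statusMessage = "Microphone or speech recognition access was not granted."
        }
    }

    func didFinishListening(withText text: String) {
        // Called once with the final best transcription.
        transcript = text
        statusMessage = ""
    }

    func didFailToCommenceSpeechRecording() {
        statusMessage = "Recording could not be started."
    }
}

// Simulate the callbacks the speech kit would make.
let handler = TranscriptionHandler()
handler.authorizationToMicrophone(withAuthentication: .denied)
print(handler.statusMessage)
handler.didFinishListening(withText: "Hello")
print(handler.transcript)
```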
For further information you can check out the Apple documentation directly.
let allLanguages = OSSVoiceEnum.allCases
// All supported languages
let allVoices = OSSVoiceEnum.allCases
// Language details
let languageInformation = allVoices[0].getDetails()
// Flag of country
let flag = allVoices[0].flag
The getDetails() method returns a struct containing:
OSSVoiceInfo {
/// The name of the voice; all AVSpeechSynthesisVoice instances have a person's name.
var name: String?
/// The name of the language being used.
var language: String?
/// The language code is what is internationally used in Locale settings.
var languageCode: String?
/// Identifier is a unique bundle URL provided by Apple for each AVSpeechSynthesisVoice.
var identifier: Any?
}
The OSSVoiceEnum contains other methods, such as a hello message, title variable and subtitle variable so you can use it in a list.
You can also set the speech:
- volume
- pitchMultiplier
- rate
As well as using an NSAttributedString.
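For reference, AVFoundation's `AVSpeechUtterance` documents `volume` as 0.0 to 1.0, `pitchMultiplier` as 0.5 to 2.0, and `rate` as `AVSpeechUtteranceMinimumSpeechRate` to `AVSpeechUtteranceMaximumSpeechRate` (0.0 to 1.0). Assuming `OSSUtterance` follows the same ranges, which this document does not state, a small clamp like the hypothetical helper below can keep user-supplied values (from sliders, for instance) within bounds before they are assigned.

```swift
// Ranges documented for AVSpeechUtterance; assumed (not confirmed here)
// to apply to OSSUtterance as well.
let volumeRange: ClosedRange<Float> = 0.0...1.0
let pitchRange: ClosedRange<Float> = 0.5...2.0
let rateRange: ClosedRange<Float> = 0.0...1.0

/// Hypothetical helper: limits a value to the given closed range.
func clamp(_ value: Float, to range: ClosedRange<Float>) -> Float {
    min(max(value, range.lowerBound), range.upperBound)
}

// Values as they might arrive from UI controls.
let volume = clamp(1.4, to: volumeRange) // limited to the upper bound
let pitch = clamp(0.2, to: pitchRange)   // limited to the lower bound
let rate = clamp(0.5, to: rateRange)     // already within range
print(volume, pitch, rate)
```

The clamped values can then be assigned to `utterance.volume`, `utterance.pitchMultiplier` and `utterance.rate` as in the earlier example.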
There are plans to implement flags for each country as well as some more features, such as being able to play the voice if the device is on silent.
If the language or voice you require is not available, this is either due to:
- Apple have not made it available through their AVFoundation;
- or the SDK has not been updated to include the newly added voice.
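To see which of the two cases applies, you can ask AVFoundation directly which voices are installed on the device. This is a minimal sketch that uses only Apple's `AVSpeechSynthesisVoice.speechVoices()`; the function name `installedVoices(matching:)` is hypothetical and not part of OSSSpeechKit.

```swift
import AVFoundation

/// Hypothetical helper: returns the installed system voices whose
/// language code starts with the given prefix, e.g. "en" or "zh-CN".
func installedVoices(matching languagePrefix: String) -> [AVSpeechSynthesisVoice] {
    AVSpeechSynthesisVoice.speechVoices().filter {
        $0.language.hasPrefix(languagePrefix)
    }
}

// List every installed English voice with its code, name and identifier.
for voice in installedVoices(matching: "en") {
    print(voice.language, voice.name, voice.identifier)
}
```

If the language you need appears in this list but is not offered by the kit, the SDK likely has not been updated yet; if it does not appear at all, Apple has not made it available on that device.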
Apple do not make the voice of Siri available for use.
This kit makes Apple's AVFoundation voices available and easy to use, so you do not need to know all the voice codes, among many other things.
To say things correctly in each language, you need to set the voice to the correct language and supply text in that language; this SDK is not a translator.
If you wish for your app to use a Chinese voice, you will need to ensure the text being passed in is Chinese.
Disclaimer: I do not know how to speak Chinese; I have used Google Translate for the Chinese characters.
// Correct: Chinese voice with Chinese text
speechKit.voice = OSSVoice(quality: .enhanced, language: .Chinese)
speechKit.speakText(text: "你好我的名字是 ...")
// Incorrect: Australian voice with Chinese text
speechKit.voice = OSSVoice(quality: .enhanced, language: .Australian)
speechKit.speakText(text: "你好我的名字是 ...")
// Also incorrect: Chinese voice with English text
speechKit.voice = OSSVoice(quality: .enhanced, language: .Chinese)
speechKit.speakText(text: "Hello, my name is ...")
This same principle applies to all other languages, such as German, Arabic and French. If the voice's language does not match the language of the text, the speech will not sound correct.
If you have a question, please create a ticket or email me directly.
If you wish to contribute, please create a pull request.
To run the example project, clone the repo, and run pod install from the Example directory first.
For further examples, please look at the Unit Test class.
App Dev Guy
OSSSpeechKit is available under the MIT license. See the LICENSE file for more info.