Recognize spoken words in recorded or live audio using Speech.

Speech Documentation

Posts under Speech tag

67 Posts
Post not yet marked as solved
0 Replies
662 Views
For my project, I would really benefit from continuous on-device speech recognition without the automatic timeout, or at least with a much longer one. In the WebKit web speech implementation, it looks like there are some extra setters for SFSpeechRecognizer exposing exactly this functionality: https://github.com/WebKit/WebKit/blob/8b1a13b39bbaaf306c9d819c13b0811011be55f2/Source/WebCore/Modules/speech/cocoa/WebSpeechRecognizerTask.mm#L105 Is there a chance Apple could enable programmable duration/time-out? If it’s available in WebSpeech, then why not in native applications?
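For context, a hedged sketch of the closest public workaround I'm aware of (the class and method names below are illustrative, not from the post): buffer-based recognition with requiresOnDeviceRecognition = true generally avoids the server-side session limits, though the timeout itself still isn't configurable through public API and the task may need to be restarted when it finalizes.

import AVFoundation
import Speech

final class ContinuousTranscriber {
    private let audioEngine = AVAudioEngine()
    private let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
    private var request: SFSpeechAudioBufferRecognitionRequest?
    private var task: SFSpeechRecognitionTask?

    func start() throws {
        guard let recognizer, recognizer.isAvailable else { return }

        let request = SFSpeechAudioBufferRecognitionRequest()
        request.shouldReportPartialResults = true
        // On-device requests are not tied to the server-side session limits.
        if recognizer.supportsOnDeviceRecognition {
            request.requiresOnDeviceRecognition = true
        }
        self.request = request

        // Feed microphone buffers to the request for as long as the engine runs.
        let inputNode = audioEngine.inputNode
        let format = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
            request.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()

        task = recognizer.recognitionTask(with: request) { result, _ in
            if let result { print(result.bestTranscription.formattedString) }
        }
    }
}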
Posted
by
Post not yet marked as solved
1 Replies
850 Views
Xcode Version 15.0 beta 4 (15A5195m) or Version 14.3.1 (14E300c): same issue when running the iOS Simulator as iPhone 14 Pro (iOS 17 or iOS 16.4) or iPhone 12 (iOS 17.0 build 21A5277j). I just started to play around with SFSpeechRecognizer and ran into an issue with SFSpeechURLRecognitionRequest. The simple project is just a ContentView with 2 buttons (one for selecting an audio file, one for starting transcription) and a SpeechRecognizer (from the Apple sample code "Transcribing speech to text", with minor additions). After selecting an audio file and tapping the transcribe button, the following error logs appear in the debugger console after the execution of recognitionTask(with:resultHandler:).

2023-07-18 13:58:16.562706-0400 TranscriberMobile[6818:475161] [] <<<< AVAsset >>>> +[AVURLAsset _getFigAssetCreationOptionsFromURLAssetInitializationOptions:assetLoggingIdentifier:figAssetCreationFlags:error:]: AVURLAssetHTTPHeaderFieldsKey must be a dictionary
2023-07-18 13:58:16.792219-0400 TranscriberMobile[6818:475166] [plugin] AddInstanceForFactory: No factory registered for id <CFUUID 0x60000023dd00> F8BB1C28-BAE8-11D6-9C31-00039315CD46
2023-07-18 13:58:16.824333-0400 TranscriberMobile[6818:475166] HALC_ProxyObjectMap.cpp:153 HALC_ProxyObjectMap::_CopyObjectByObjectID: failed to create the local object
2023-07-18 13:58:16.824524-0400 TranscriberMobile[6818:475166] HALC_ShellDevice.cpp:2609 HALC_ShellDevice::RebuildControlList: couldn't find the control object
2023-07-18 13:58:16.872935-0400 TranscriberMobile[6818:475165] [] <<<< FAQ Offline Mixer >>>> FigAudioQueueOfflineMixerCreate: [0x10744b8d0] failed to query kAudioConverterPrimeInfo err=561211770, assuming zero priming
2023-07-18 13:58:16.890002-0400 TranscriberMobile[6818:474951] [assertion] Error acquiring assertion: <Error Domain=RBSServiceErrorDomain Code=1 "originator doesn't have entitlement com.apple.runningboard.mediaexperience" UserInfo={NSLocalizedFailureReason=originator doesn't have entitlement com.apple.runningboard.mediaexperience}>
2023-07-18 13:58:16.890319-0400 TranscriberMobile[6818:474951] [AMCP] 259 HALRBSAssertionGlue.mm:98 Failed to acquire the AudioRecording RBSAssertion for pid: 6818 with code: 1 - RBSServiceErrorDomain
2023-07-18 13:58:16.893137-0400 TranscriberMobile[6818:474951] [assertion] Error acquiring assertion: <Error Domain=RBSServiceErrorDomain Code=1 "originator doesn't have entitlement com.apple.runningboard.mediaexperience" UserInfo={NSLocalizedFailureReason=originator doesn't have entitlement com.apple.runningboard.mediaexperience}>
2023-07-18 13:58:16.893652-0400 TranscriberMobile[6818:474951] [AMCP] 259 HALRBSAssertionGlue.mm:98 Failed to acquire the MediaPlayback RBSAssertion for pid: 6818 with code: 1 - RBSServiceErrorDomain

Since the AVURLAsset is not created manually, how do I get around the initial error "AVURLAssetHTTPHeaderFieldsKey must be a dictionary"?

SpeechRecognizer.swift

import Foundation
import AVFoundation
import Speech
import SwiftUI

/// A helper for transcribing speech to text using SFSpeechRecognizer and AVAudioEngine.
actor SpeechRecognizer: ObservableObject {
    // ...

    @MainActor func startTranscribingAudioFile(_ audioURL: URL?) {
        Task {
            await transcribeAudioFile(audioURL)
        }
    }

    // ...

    private func transcribeAudioFile(_ audioURL: URL?) {
        guard let recognizer, recognizer.isAvailable else {
            self.transcribe(RecognizerError.recognizerIsUnavailable)
            return
        }
        guard let audioURL else {
            self.transcribe(RecognizerError.nilAudioFileURL)
            return
        }

        let request = SFSpeechURLRecognitionRequest(url: audioURL)
        request.shouldReportPartialResults = true
        self.audioURLRequest = request

        self.task = recognizer.recognitionTask(with: request, resultHandler: { [weak self] result, error in
            self?.audioFileRecognitionHandler(result: result, error: error)
        })
    }

    // ...

    nonisolated private func audioFileRecognitionHandler(result: SFSpeechRecognitionResult?, error: Error?) {
        if let result {
            transcribe(result.bestTranscription.formattedString)
        }
        if let error {
            Task { @MainActor in
                await reset()
                transcribe(error)
            }
        }
    }
}

ContentView.swift

import SwiftUI

struct ContentView: View {
    @State var showFileBrowser = false
    @State var audioFileURL: URL? = nil
    @StateObject var speechRecognizer = SpeechRecognizer()

    var body: some View {
        VStack(spacing: 24) {
            Button {
                self.showFileBrowser.toggle()
            } label: {
                Text("Select an audio file to transcribe")
            }
            Text(audioFileURL != nil ? audioFileURL!.absoluteString : "No audio file selected")
                .multilineTextAlignment(.center)
            Button {
                speechRecognizer.startTranscribingAudioFile(audioFileURL)
            } label: {
                Text("Transcribe")
            }
            Text(speechRecognizer.transcript == "" ? "No transcript yet" : speechRecognizer.transcript)
                .multilineTextAlignment(.leading)
        }
        .padding()
        .fileImporter(isPresented: $showFileBrowser, allowedContentTypes: [.audio]) { result in
            switch result {
            case .success(let fileURL):
                fileURL.startAccessingSecurityScopedResource()
                audioFileURL = fileURL
                print(fileURL)
            case .failure(let error):
                NSLog("%s", error.localizedDescription)
            }
        }
    }
}
Posted
by
Post not yet marked as solved
2 Replies
756 Views
We are using the Speech framework to enable users to interact with our app via voice commands. When a user says "start test" we send:

DispatchQueue.main.async {
    self.startButton.sendActions(for: .touchUpInside)
}

This works beautifully, except that the screen goes into auto lockout in the middle of a test. Apparently, using sendActions does not actually send a touch event to the OS. My question is: how can I tell the OS that a touch event happened programmatically? Thank you
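A minimal sketch of one workaround, assuming the underlying goal is to keep the screen from auto-locking during a test (as far as I know, public API does not allow synthesizing a real OS-level touch event):

import UIKit

// Keep the screen awake while a voice-driven test is running,
// then re-enable auto-lock when the test finishes.
UIApplication.shared.isIdleTimerDisabled = true
// ... test runs ...
UIApplication.shared.isIdleTimerDisabled = false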
Posted
by
Post not yet marked as solved
2 Replies
1.2k Views
Download this Apple Speech project: https://developer.apple.com/documentation/accessibility/wwdc21_challenge_speech_synthesizer_simulator. The project uses an iOS 15 deployment target; when building and running I receive the errors below. Setting the deployment target to iOS 17 results in the same errors. I would appreciate it if anyone else has determined how to re-engage this basic functionality. TTS appears to no longer work.

__ Folder ), NSFilePath=/Library/Developer/CoreSimulator/Volumes/iOS_21A5277g/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS 17.0.simruntime/Contents/Resources/RuntimeRoot/System/Library/TTSPlugins, NSUnderlyingError=0x600000c75d40 {Error Domain=NSPOSIXErrorDomain Code=2 "No such file or directory"}}
Failed to get sandbox extensions
Query for com.apple.MobileAsset.VoiceServicesVocalizerVoice failed: 2
#FactoryInstall Unable to query results, error: 5
Unable to list voice folder
Query for com.apple.MobileAsset.VoiceServices.GryphonVoice failed: 2
Unable to list voice folder
Query for com.apple.MobileAsset.VoiceServices.CustomVoice failed: 2
Unable to list voice folder
Query for com.apple.MobileAsset.VoiceServices.GryphonVoice failed: 2
Unable to list voice folder
Posted
by
Post not yet marked as solved
1 Replies
455 Views
CFBundleSpokenName = "Apple 123"
CFBundleName = "Apple"

The Accessibility Bundle Name doesn't work without opening the app. When I touch the application on the device home screen, VoiceOver reads the app name as "Apple". After the app has launched, it reads it as "Apple 123". I want it to read "Apple 123" on the home screen, too. Can you help me?
Posted
by
Post not yet marked as solved
1 Replies
800 Views
AVSpeechSynthesizer is not working; it was working perfectly before. Below is my Objective-C code.

- (void)playVoiceMemoforMessageEVO:(NSString *)msg {
    [[AVAudioSession sharedInstance] overrideOutputAudioPort:AVAudioSessionPortOverrideSpeaker error:nil];

    AVSpeechSynthesizer *synthesizer = [[AVSpeechSynthesizer alloc] init]; // local variable, released when the method returns
    AVSpeechUtterance *speechutt = [AVSpeechUtterance speechUtteranceWithString:msg];
    speechutt.volume = 90.0f; // note: volume is documented as 0.0-1.0 and is clamped
    speechutt.rate = 0.50f;
    speechutt.pitchMultiplier = 0.80f;
    [speechutt setRate:0.3f];
    speechutt.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-us"];

    [synthesizer speakUtterance:speechutt];
}

Please help me to solve this issue.
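Not a confirmed diagnosis, but a frequent cause of this symptom is that the synthesizer is a local variable and is released under ARC as soon as the method returns, which silently stops speech. A hedged Swift sketch of keeping a long-lived reference (Speaker is an illustrative name):

import AVFoundation

// Keep the synthesizer alive for the object's lifetime instead of creating it
// inside the method, so speech is not cut off when the method returns.
final class Speaker {
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ text: String) {
        let utterance = AVSpeechUtterance(string: text)
        utterance.rate = 0.5
        utterance.pitchMultiplier = 0.8
        utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
        synthesizer.speak(utterance)
    }
}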
Posted
by
Post not yet marked as solved
1 Replies
1.3k Views
Hello, Using iOS 17.0, I can see a list of available voices. However, some will just not work, meaning that when selected there will be no sound produced and no errors. This is true when using my app and AVSpeechUtterance, but it is also true in the settings where the preview button does nothing.
Posted
by
Post not yet marked as solved
3 Replies
1k Views
Hello, I am deaf and blind, so my Apple studies are in text via Braille. One question: how do I add my own voice as a synthesis voice? Do I have to record it somewhere first? What is the complete process, starting with recording my voice? Do I have to record my voice reading something and then add it as a synthesis voice? What is the whole process for that? There is no text explaining this. I found one about authorizing Personal Voice, but not the whole process, starting with the recording and so on. Thanks!
Posted
by
Post not yet marked as solved
1 Replies
673 Views
AVSpeechSynthesisVoice.speechVoices() returns voices that are no longer available after upgrading from iOS 16 to iOS 17 (although this has been an issue for a long time, I think). To reproduce:
1. On iOS 16, download 1 or more enhanced voices under "Accessibility > Spoken Content > Voices".
2. Upgrade to iOS 17.
3. Call AVSpeechSynthesisVoice.speechVoices() and note that the voices installed in step 1 are still present, yet they are no longer downloaded, therefore they don't work. And there is no property on AVSpeechSynthesisVoice to indicate if the voice is still available or not.

This is a problem for apps that allow users to choose among the available system voices. I receive many support emails surrounding iOS upgrades about this issue. I have to tell them to re-download the voices, which is not obvious to them. I've created a feedback item for this as well (FB12994908).
Posted
by
Post not yet marked as solved
0 Replies
476 Views
How can I show a link in my app that gives direct access to a deep system setting? If a user taps the link, the app should open that settings page directly. For example: "Enable Dictation" (Settings > General > Keyboards). App type: Multiplatform (Swift). Minimum deployments: iOS 16.4, macOS 13.3. Any help is really appreciated.
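A hedged sketch: as far as I know there is no public, documented URL for deep system pages such as Settings > General > Keyboards; the supported option is opening the app's own settings pane via UIApplication.openSettingsURLString.

import UIKit

// Opens the app-specific page in Settings; arbitrary deep system pages
// are not reachable through documented URL schemes.
if let url = URL(string: UIApplication.openSettingsURLString),
   UIApplication.shared.canOpenURL(url) {
    UIApplication.shared.open(url)
}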
Posted
by
Post not yet marked as solved
0 Replies
364 Views
The WWDC video "Extend Speech Synthesis with personal and custom voices" (https://developer.apple.com/wwdc23/10033) shows what appears to be an icon for "Personal Voice" at time 10:46. I suggest this be made available to developers for the final release.
Posted
by
Post not yet marked as solved
1 Replies
530 Views
Hi, when attempting to use my Personal Voice with AVSpeechSynthesizer while the application is in the background, I receive the message below:

> Cannot use AVSpeechSynthesizerBufferCallback with Personal Voices, defaulting to output channel.

Other voices can be used without issue. Is this a published limitation of Personal Voice within applications, i.e. no background playback?
Posted
by
Post not yet marked as solved
1 Replies
552 Views
I've been deaf and blind for 15 years. I'm not good at pronunciation in English, since I don't hear what I say, much less hear it from others. When I went to read the phrases to record my Personal Voice in Accessibility > Personal Voice, the 150 phrases to read were in English. How do I record phrases in Brazilian Portuguese? I speak Portuguese well; my English pronunciation is very bad, and deafness contributed to that. Help me.
Posted
by
Post not yet marked as solved
1 Replies
542 Views
Hello, I have struggled to resolve the issue in my question above. I can speak an utterance while my iPhone is on, but when my iPhone goes into background mode (screen turned off), it doesn't speak any more. I think it should be possible to play audio or speak an utterance in the background, because I can play music in the background with YouTube. Any help, please?
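A minimal sketch of the usual prerequisites, assuming the target has the "Audio, AirPlay, and Picture in Picture" background mode (the UIBackgroundModes audio key) enabled; without an active playback-category audio session, speech output is typically silenced once the app is backgrounded. The helper name is illustrative.

import AVFoundation

// Configure the audio session so speech can continue in the background
// (requires the "audio" background mode in the target's capabilities).
func configureBackgroundSpeech() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, mode: .spokenAudio, options: [])
    try session.setActive(true)
}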
Posted
by
Post not yet marked as solved
0 Replies
300 Views
As per Apple's documentation, the didCancel delegate method should be called after a stopSpeaking(at:) call; instead, the didFinish method is being called. In reality, as far as I've checked, it works perfectly on iOS 13.2.2 but not after iOS 15. Is there anything I'm missing to configure? It worked perfectly without any extra setup on previous versions.
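For reference, a hedged sketch of the delegate wiring the post describes (SpeechObserver is an illustrative name), which can be used to log which callback actually fires after stopSpeaking(at:) on a given iOS version:

import AVFoundation

final class SpeechObserver: NSObject, AVSpeechSynthesizerDelegate {
    let synthesizer = AVSpeechSynthesizer()

    override init() {
        super.init()
        synthesizer.delegate = self
    }

    func stop() {
        synthesizer.stopSpeaking(at: .immediate)
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {
        print("didCancel") // documented to fire after stopSpeaking(at:)
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("didFinish") // reportedly fires instead on newer iOS versions
    }
}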
Posted
by
Post not yet marked as solved
0 Replies
446 Views
I added the word "sactional" to speechRequest.contextualStrings, but when speaking it is always autocorrected to "sectional". I even tried training a custom speech language model by adding SFCustomLanguageModelData.PhraseCount(phrase: "sactional", count: 10) and generating a model, but that didn't work either. Is there a better way to make it work?
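A minimal sketch of the contextual-strings setup being described (the recognizer/request names here are illustrative); as far as I know the hint only biases recognition toward the custom vocabulary and does not guarantee it beats the language model's preference for "sectional":

import Speech

// Bias recognition toward domain-specific vocabulary via contextualStrings.
let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
let request = SFSpeechAudioBufferRecognitionRequest()
request.contextualStrings = ["sactional"]   // custom word hint
request.shouldReportPartialResults = true
let task = recognizer.recognitionTask(with: request) { result, error in
    if let result {
        print(result.bestTranscription.formattedString)
    }
}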
Posted
by
Post not yet marked as solved
1 Replies
688 Views
As of iOS 17, SFSpeechRecognizer.isAvailable returns true even when recognition tasks cannot be fulfilled and immediately fail with the error "Siri and Dictation are disabled". The same speech recognition code works as expected on iOS 16. On iOS 16, neither Siri nor Dictation needed to be enabled for speech recognition to be available, and it works as expected. In the past, once permissions were given, only an active network connection was required for functional speech recognition.

There seem to be 2 issues in play:
1. In iOS 17, SFSpeechRecognizer.isAvailable incorrectly returns true when it can't fulfil requests.
2. In iOS 17, Dictation or Siri being enabled is required to handle speech recognition tasks, while in iOS 16 this isn't the case.

If issue 2 is expected behaviour (I surely hope not), there is no way to actually query whether Siri or Dictation is enabled, to properly handle those cases in code and inform the user why speech recognition doesn't work.

Expected behaviour:
Speech recognition is available when Siri and Dictation are disabled.
SFSpeechRecognizer.isAvailable correctly returns false when no speech recognition requests can be handled.

iOS Version 17.0 (21A329), Xcode Version 15.0 (15A240d). Anyone else experiencing the same issues or have a solution? Reported this to Apple as well -> FB13235751
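A hedged sketch of one defensive option while this is unresolved: treat the recognition task's error, rather than isAvailable, as the source of truth (startRecognition is an illustrative helper name):

import Speech

// Start a task and surface the failure reason to the caller/user instead of
// relying on SFSpeechRecognizer.isAvailable up front.
func startRecognition(with recognizer: SFSpeechRecognizer,
                      request: SFSpeechRecognitionRequest) -> SFSpeechRecognitionTask {
    return recognizer.recognitionTask(with: request) { result, error in
        if let error {
            // e.g. show "Siri and Dictation are disabled" style messaging here
            print("Recognition failed: \(error.localizedDescription)")
            return
        }
        if let result {
            print(result.bestTranscription.formattedString)
        }
    }
}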
Posted
by
Post marked as solved
2 Replies
685 Views
libiconv conversion from UTF-8 to GBK fails when the text contains an ellipsis (……), but only on iOS 17, and in Xcode 15 I updated libiconv because the bundled one is old. I have tested libiconv.tbd and libiconv.2.tbd with the same result. Conditions: only iOS 17 (iOS 16 and earlier are OK); the text contains an ellipsis (……), such as ……测试字符串; converting to gbk or gb18030 fails and returns -1, but gb2312 is OK.

#include <iconv.h>
#include <iostream>
#include <string>
#include <cstring>
#include <cstdlib>

int code_convert(const char *from_charset, const char *to_charset,
                 char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
    iconv_t cd;
    char **pin = &inbuf;
    char **pout = &outbuf;

    cd = iconv_open(to_charset, from_charset);
    if (cd == (iconv_t)-1)  // iconv_open reports failure as (iconv_t)-1
        return -1;
    memset(outbuf, 0, outlen);
    if ((int)iconv(cd, pin, &inlen, pout, &outlen) == -1) {
        iconv_close(cd);
        std::cout << "转换失败" << std::endl;  // "conversion failed"
        return -1;
    }
    iconv_close(cd);
    return 0;
}

int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen) {
    // gb18030, gb2312
    return code_convert("utf-8", "gb2312", inbuf, inlen, outbuf, outlen);
}

std::string UTFtoGBK(const char *utf8) {
    int length = strlen(utf8);
    char *temp = (char *)malloc(sizeof(char) * length);
    if (u2g((char *)utf8, length, temp, length) >= 0) {
        std::string str_result;
        str_result.append(temp);
        free(temp);
        return str_result;
    } else {
        free(temp);
        return "";
    }
}
Posted
by
Post not yet marked as solved
1 Replies
403 Views
I'm developing a project where I want to transcribe live speech from the user on iOS devices. I wanted to test the Speech framework by downloading the sample code from https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio. I'm using Xcode 15 and running it on an iPad with iOS 17 installed. I run the app and manage to approve the permissions to use the microphone and live speech transcription, but as soon as I press "start recording" I get the following error in Xcode, and nothing happens on the iPad screen.

+[SFUtilities issueReadSandboxExtensionForFilePath:error:] issueReadSandboxExtensionForFilePath:error:: Inaccessible file (/var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab) : error=Error Domain=kAFAssistantErrorDomain Code=203 "Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:" UserInfo={NSLocalizedDescription=Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:}

Can someone guide me in the right direction to fix this?
Posted
by