Recognize spoken words in recorded or live audio using Speech.

Speech Documentation

Posts under Speech tag

70 Posts
Post not yet marked as solved
1 Reply
292 Views
Hi, I'm a French developer and I downloaded the Recognizing Speech in Live Audio sample code from the Apple Developer website. I tried to execute the data generator command after changing the locale identifier from 'en_US' to 'fr' in the data generator's main file, but when I ran the command in Xcode, I got this error message: "Identifier 'fr' does not parse into two elements." I checked the XML files associated with the bin archive file and the identifiers are not correct (they keep the 'en-US' value). Thanks for your help!
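The error text suggests the generator wants a locale identifier that splits into two elements, language and region, so 'fr_FR' rather than 'fr' may parse; this is an inference from the message, not from the sample's generator code. A minimal illustration:

import Foundation

// Assumption based on the error message: the identifier must split into
// a language and a region, e.g. "fr_FR", not just "fr".
let identifier = "fr_FR"
let elements = identifier.split(separator: "_")
precondition(elements.count == 2, "Identifier must parse into two elements")

let locale = Locale(identifier: identifier)
print(locale.identifier)   // "fr_FR"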
Posted Last updated
.
Post not yet marked as solved
20 Replies
5.3k Views
I see a lot of crashes on the iOS 17 beta related to some problem in "Text To Speech". Does anybody have a clue why TTS crashes? Is anybody else seeing the same problem?
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x000000037f729380
Exception Codes: 0x0000000000000001, 0x000000037f729380
VM Region Info: 0x37f729380 is not in any region. Bytes after previous region: 3748828033 Bytes before following region: 52622617728
REGION TYPE START - END [ VSIZE] PRT/MAX SHRMOD REGION DETAIL
MALLOC_NANO 280000000-2a0000000 [512.0M] rw-/rwx SM=PRV --->
GAP OF 0xd20000000 BYTES
commpage (reserved) fc0000000-1000000000 [ 1.0G] ---/--- SM=NUL ...(unallocated)
Termination Reason: SIGNAL 11 Segmentation fault: 11
Terminating Process: exc handler [36389]
Triggered by Thread: 9
.....
Thread 9 name:
Thread 9 Crashed:
0 libobjc.A.dylib 0x000000019eeff248 objc_retain_x8 + 16
1 AudioToolboxCore 0x00000001b2da9d80 auoop::RenderPipeUser::~RenderPipeUser() + 112 (AUOOPRenderPipePool.mm:400)
2 AudioToolboxCore 0x00000001b2e110b4 -[AUAudioUnit_XPC internalDeallocateRenderResources] + 92 (AUAudioUnit_XPC.mm:904)
3 AVFAudio 0x00000001bfa4cc04 AUInterfaceBaseV3::Uninitialize() + 60 (AUInterface.mm:524)
4 AVFAudio 0x00000001bfa894bc AVAudioEngineGraph::PerformCommand(AUGraphNodeBaseV3&, AVAudioEngineGraph::ENodeCommand, void*, unsigned int) const + 772 (AVAudioEngineGraph.mm:3317)
5 AVFAudio 0x00000001bfa93550 AVAudioEngineGraph::_Uninitialize(NSError**) + 132 (AVAudioEngineGraph.mm:1469)
6 AVFAudio 0x00000001bfa4b50c AVAudioEngineImpl::Stop(NSError**) + 396 (AVAudioEngine.mm:1081)
7 AVFAudio 0x00000001bfa4b094 -[AVAudioEngine stop] + 48 (AVAudioEngine.mm:193)
8 TextToSpeech 0x00000001c70b3c5c __55-[TTSSynthesisProviderAudioEngine renderSpeechRequest:]_block_invoke + 1756 (TTSSynthesisProviderAudioEngine.m:613)
9 libdispatch.dylib 0x00000001ae4b0740 _dispatch_call_block_and_release + 32 (init.c:1519)
10 libdispatch.dylib 0x00000001ae4b2378 _dispatch_client_callout + 20 (object.m:560)
11 libdispatch.dylib 0x00000001ae4b990c _dispatch_lane_serial_drain + 748 (queue.c:3885)
12 libdispatch.dylib 0x00000001ae4ba470 _dispatch_lane_invoke + 432 (queue.c:3976)
13 libdispatch.dylib 0x00000001ae4c5074 _dispatch_root_queue_drain_deferred_wlh + 288 (queue.c:6913)
14 libdispatch.dylib 0x00000001ae4c48e8 _dispatch_workloop_worker_thread + 404 (queue.c:6507)
...
Thread 9 crashed with ARM Thread State (64-bit):
x0: 0x0000000283309360 x1: 0x0000000000000000 x2: 0x0000000000000000 x3: 0x00000002833093c0
x4: 0x00000002833093c0 x5: 0x0000000101737740 x6: 0x0000000000000013 x7: 0x00000000ffffffff
x8: 0x0000000283309360 x9: 0x3c788942d067009a x10: 0x0000000101547000 x11: 0x0000000000000000
x12: 0x00000000000007fb x13: 0x00000000000007fd x14: 0x000000001ee24020 x15: 0x0000000000000020
x16: 0x0000b1037f729360 x17: 0x000000037f729360 x18: 0x0000000000000000 x19: 0x0000000000000000
x20: 0x00000001016a8de8 x21: 0x0000000283e21d00 x22: 0x0000000283b3f1f8 x23: 0x0000000283098000
x24: 0x00000001bfb4fc35 x25: 0x00000001bfb4fc43 x26: 0x000000028033a688 x27: 0x0000000280c93090
x28: 0x0000000000000000 fp: 0x000000016fc86490 lr: 0x00000001b2da9d80 sp: 0x000000016fc863e0
pc: 0x000000019eeff248 cpsr: 0x1000 esr: 0x92000006 (Data Abort) byte read Translation fault
Posted Last updated
.
Post not yet marked as solved
0 Replies
439 Views
I have a prototype web view (in a WKWebView) that uses webkitSpeechRecognition for getting short snippets of text from speech. I'm not thrilled with the quality of the "recognition": the text generally isn't very accurate. I'm wondering if I'll get any more accuracy by using the "native" SFSpeechRecognizer. It seems to me that webkitSpeechRecognition is likely just a JavaScript wrapper interface for SFSpeechRecognizer, in which case the quality of the speech recognition won't improve. Does anyone know for sure whether this is the case? Does webkitSpeechRecognition on iOS use SFSpeechRecognizer under the hood? Or are they two completely different recognition systems, and one could be more accurate than the other?
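Whether WebKit's recognizer wraps SFSpeechRecognizer isn't something Apple documents; one practical way to compare is to run the same recording through the native API and judge the output side by side. A minimal sketch, assuming a prerecorded file URL of your own:

import Speech

// Transcribe a recorded file with the native recognizer so its output can be
// compared against webkitSpeechRecognition on the same audio.
// `fileURL` is a placeholder for your own recording.
func transcribe(fileURL: URL) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized,
              let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US")),
              recognizer.isAvailable else { return }

        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        recognizer.recognitionTask(with: request) { result, error in
            if let result, result.isFinal {
                print(result.bestTranscription.formattedString)
            }
        }
    }
}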
Posted
by mfpjacobt.
Last updated
.
Post not yet marked as solved
2 Replies
459 Views
The application is crashing on the AXSpeech thread: EXC_BAD_ACCESS KERN_INVALID_ADDRESS 0x000056f023efbeb0
Crashed: AXSpeech
0 libobjc.A.dylib 0x4820 objc_msgSend + 32
1 libsystem_trace.dylib 0x6c34 _os_log_fmt_flatten_object + 116
2 libsystem_trace.dylib 0x5344 _os_log_impl_flatten_and_send + 1884
3 libsystem_trace.dylib 0x4bd0 _os_log + 152
4 libsystem_trace.dylib 0x9c48 _os_log_error_impl + 24
5 TextToSpeech 0xd0a8c _pcre2_xclass_8
6 TextToSpeech 0x3bc04 TTSSpeechUnitTestingMode
7 TextToSpeech 0x3f128 TTSSpeechUnitTestingMode
8 AXCoreUtilities 0xad38 -[NSArray(AXExtras) ax_flatMappedArrayUsingBlock:] + 204
9 TextToSpeech 0x3eb18 TTSSpeechUnitTestingMode
10 TextToSpeech 0x3c948 TTSSpeechUnitTestingMode
11 TextToSpeech 0x48824 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice
12 TextToSpeech 0x49804 AXAVSpeechSynthesisVoiceFromTTSSpeechVoice
13 Foundation 0xf6064 __NSThreadPerformPerform + 264
14 CoreFoundation 0x37acc CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION + 28
15 CoreFoundation 0x36d48 __CFRunLoopDoSource0 + 176
16 CoreFoundation 0x354fc __CFRunLoopDoSources0 + 244
17 CoreFoundation 0x34238 __CFRunLoopRun + 828
18 CoreFoundation 0x33e18 CFRunLoopRunSpecific + 608
19 Foundation 0x2d4cc -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212
20 TextToSpeech 0x24b88 TTSCFAttributedStringCreateStringByBracketingAttributeWithString
21 Foundation 0xb3154 NSThread__start + 732
22 libsystem_pthread.dylib 0x24d4 _pthread_start + 136
23 libsystem_pthread.dylib 0x1a10 thread_start + 8
(attached: com.livingMedia.AajTakiPhone_issue_3ceba855a8ad2d1af83655803dc13f70_crash_session_9081fa41ced440ae9a57c22cb432f312_DNE_0_v2_stacktrace.txt)
Posted
by Rupendra.
Last updated
.
Post not yet marked as solved
1 Reply
849 Views
My app listens for the verbal commands "Roll" and "Skip". It was working well until I used it while listening to a podcast in another app. I am getting a crash with the error: Thread 1: "required condition is false: IsFormatSampleRateAndChannelCountValid(format)". It crashes when I am playing audio from the app Snipd (a podcast app) or the Apple Podcasts app. When I am playing audio from YouTube or Apple Music it does not crash. This is the code for when I start listening for the commands:

// MARK: - Speech Recognition
func startListening() {
    do {
        try configureAudioSession()
        createRecognitionRequest()
        try prepareAudioEngine()
    } catch {
        print("Audio Engine error: \(error.localizedDescription)")
    }
}

private func configureAudioSession() throws {
    let audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(.playAndRecord, mode: .measurement, options: [.interruptSpokenAudioAndMixWithOthers, .duckOthers])
    try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
}

private func createRecognitionRequest() {
    recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
    guard let recognitionRequest = recognitionRequest else { return }
    recognitionRequest.shouldReportPartialResults = true
    recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest, resultHandler: handleRecognitionResult)
}

private func prepareAudioEngine() throws {
    let inputNode = audioEngine.inputNode
    inputNode.removeTap(onBus: 0)
    let inputFormat = inputNode.inputFormat(forBus: 0)
    inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] (buffer, _) in
        self?.recognitionRequest?.append(buffer)
    }
    audioEngine.prepare()
    try audioEngine.start()
    isActuallyListening = true
}

Thanks
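For what it's worth, this crash message usually means the format passed to installTap(onBus:) is invalid, which can happen when the input's reported format momentarily drops to 0 Hz / 0 channels while another app holds the audio session. A sketch of prepareAudioEngine() with a guard added under that assumption (the error domain string is made up for illustration):

private func prepareAudioEngine() throws {
    let inputNode = audioEngine.inputNode
    inputNode.removeTap(onBus: 0)

    // The hardware format can momentarily report 0 Hz / 0 channels while another
    // app (e.g. a podcast player) holds the audio session. Installing a tap with
    // such a format trips IsFormatSampleRateAndChannelCountValid(format).
    let inputFormat = inputNode.inputFormat(forBus: 0)
    guard inputFormat.sampleRate > 0, inputFormat.channelCount > 0 else {
        throw NSError(domain: "SpeechCommands", code: -1,
                      userInfo: [NSLocalizedDescriptionKey: "Input format not ready; retry after the session settles"])
    }

    inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { [weak self] buffer, _ in
        self?.recognitionRequest?.append(buffer)
    }
    audioEngine.prepare()
    try audioEngine.start()
    isActuallyListening = true
}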
Posted Last updated
.
Post not yet marked as solved
0 Replies
278 Views
Hi Apple Team, we have a technical query regarding one feature: audio recognition and live captioning. We are developing an app for the deaf community to help avoid communication barriers. We want to know whether there is any way to recognize the sound coming from other applications on an iPhone and show live captions in our iOS application.
Posted
by akhileshy.
Last updated
.
Post not yet marked as solved
0 Replies
507 Views
As the title suggests, I am using AVAudioEngine for speech recognition input and AVAudioPlayer for sound output. Apple says in this talk https://developer.apple.com/videos/play/wwdc2019/510 that the setVoiceProcessingEnabled function usefully cancels the output from the speaker to the mic. I set voice processing on the input and output nodes. It seems to work, however the volume is low, even when the system volume is turned up. Any solution to this would be much appreciated.
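Lowered playback volume is a common side effect of the voice-processing I/O route; it may also be worth checking the order of operations and the session mode. A rough sketch of the setup I would try, assuming an AVAudioEngine called audioEngine (the mode choice and option flags are assumptions, not from the original post):

import AVFoundation

let audioEngine = AVAudioEngine()

func configureVoiceProcessing() throws {
    // Voice processing has to be toggled before the engine is running.
    try audioEngine.inputNode.setVoiceProcessingEnabled(true)

    let session = AVAudioSession.sharedInstance()
    // .voiceChat is tuned for echo cancellation; .defaultToSpeaker routes playback
    // to the louder bottom speaker instead of the quiet receiver.
    try session.setCategory(.playAndRecord, mode: .voiceChat, options: [.defaultToSpeaker, .allowBluetooth])
    try session.setActive(true)

    audioEngine.prepare()
    try audioEngine.start()
}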
Posted
by sleep9.
Last updated
.
Post not yet marked as solved
2 Replies
809 Views
Hello, I am trying to test a speech-to-text feature in several iPhone simulators, but microphones don't seem to work. The microphone and speech recognition permissions are correctly requested for the feature. My internal and external microphones are detected in the simulators' I/O options, but nothing happens when I launch the recognition. Recognition also doesn't work for speech-to-text in the native Messages keyboard or Siri. The problem is the same in all the simulators, so I believe the issue is that Xcode has not been granted access to the microphone. In Settings > Privacy & Security > Microphone I can't see Xcode (and, possibly a separate issue, I can't see Xcode Source Editor in Extensions either). I've already tried uninstalling and reinstalling Xcode. I use Xcode 15.0.1 under Sonoma 14.1.2. Any help is welcome.
Posted Last updated
.
Post not yet marked as solved
1 Reply
389 Views
I have gotten an error stating: "This app has crashed because it attempted to access privacy-sensitive data without a usage description. The app's Info.plist must contain an NSSpeechRecognitionUsageDescription key with a string value explaining to the user how the app uses this data." But I have already added NSSpeechRecognitionUsageDescription to my Info.plist and the error is still occurring. Does anyone have a solution to this?
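In cases like this the key is often in an Info.plist that isn't the one shipped with the running target (wrong target membership, a generated Info.plist, or an app extension). A small hypothetical diagnostic to confirm what the running bundle actually contains, plus the authorization call that triggers the check:

import Speech

// Confirm the key is present in the bundle the app is actually running from
// (it must be in the app target's Info.plist, not just somewhere in the project).
let usage = Bundle.main.object(forInfoDictionaryKey: "NSSpeechRecognitionUsageDescription")
print("NSSpeechRecognitionUsageDescription:", usage ?? "missing")

// Requesting authorization is what trips the crash when the key is missing.
SFSpeechRecognizer.requestAuthorization { status in
    print("Speech authorization status:", status.rawValue)
}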
Posted Last updated
.
Post not yet marked as solved
0 Replies
279 Views
In my app I have:

let utterance = AVSpeechUtterance(string: recognizedTextView.text)
utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")
utterance.rate = 0.3
synthesizer.speak(utterance)

However, after I upgraded my iOS to 17.1.1 the speed doesn't get adjusted. I tried 0.01 and 0.3 for rate, and there is no difference.
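In case it helps narrow this down, here is a minimal sketch that pins the utterance to its own rate rather than any system or assistive-technology speaking-rate setting, and logs the valid rate range; whether that setting is involved on 17.1.1 is an assumption on my part, not something confirmed:

import AVFoundation

let synthesizer = AVSpeechSynthesizer()

func speak(_ text: String) {
    let utterance = AVSpeechUtterance(string: text)
    utterance.voice = AVSpeechSynthesisVoice(language: "zh-HK")

    // Valid range is AVSpeechUtteranceMinimumSpeechRate...MaximumSpeechRate;
    // log the constants to confirm 0.3 sits inside it on this OS version.
    print(AVSpeechUtteranceMinimumSpeechRate, AVSpeechUtteranceDefaultSpeechRate, AVSpeechUtteranceMaximumSpeechRate)
    utterance.rate = max(AVSpeechUtteranceMinimumSpeechRate, min(0.3, AVSpeechUtteranceMaximumSpeechRate))

    // When true, system assistive-technology speech settings can override per-utterance values.
    utterance.prefersAssistiveTechnologySettings = false

    synthesizer.speak(utterance)
}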
Posted Last updated
.
Post marked as solved
2 Replies
662 Views
libiconv fails to convert UTF-8 to GBK when the text contains an ellipsis (……), only on iOS 17 and with Xcode 15. I updated libiconv because it's old. I have tested both libiconv.tbd and libiconv.2.tbd with the same result. Conditions: only iOS 17; iOS 16 and earlier are OK. The text contains an ellipsis (……), such as ……测试字符串. Converting to gbk or gb18030 fails and returns -1, but gb2312 is OK.

int code_convert(const char *from_charset, const char *to_charset, char *inbuf, size_t inlen, char *outbuf, size_t outlen)
{
    iconv_t cd;
    char **pin = &inbuf;
    char **pout = &outbuf;
    cd = iconv_open(to_charset, from_charset);
    if (cd == 0)
        return -1;
    memset(outbuf, 0, outlen);
    if ((int)iconv(cd, pin, &inlen, pout, &outlen) == -1) {
        iconv_close(cd);
        std::cout << "转换失败" << std::endl;  // "conversion failed"
        return -1;
    }
    iconv_close(cd);
    return 0;
}

int u2g(char *inbuf, size_t inlen, char *outbuf, size_t outlen)
{
    // gb18030, gb2312
    return code_convert("utf-8", "gb2312", inbuf, inlen, outbuf, outlen);
}

std::string UTFtoGBK(const char* utf8)
{
    int length = strlen(utf8);
    char *temp = (char*)malloc(sizeof(char)*length);
    if (u2g((char*)utf8, length, temp, length) >= 0) {
        std::string str_result;
        str_result.append(temp);
        free(temp);
        return str_result;
    } else {
        free(temp);
        return "";
    }
}
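If the goal is simply UTF-8 to GB18030/GBK conversion on iOS, one possible workaround that avoids libiconv is Foundation's built-in CoreFoundation encodings. A hedged Swift sketch follows; GB18030 is a superset of GBK and GB2312, but whether this sidesteps the iOS 17 behavior is untested here:

import Foundation
import CoreFoundation

// Convert a Swift string (UTF-8 internally) to GB18030 bytes without libiconv.
func utf8ToGB18030(_ text: String) -> Data? {
    let cfEncoding = CFStringEncoding(CFStringEncodings.GB_18030_2000.rawValue)
    let nsEncoding = CFStringConvertEncodingToNSStringEncoding(cfEncoding)
    return text.data(using: String.Encoding(rawValue: nsEncoding), allowLossyConversion: false)
}

let bytes = utf8ToGB18030("……测试字符串")
print(bytes?.count ?? -1)   // -1 would mean the conversion failed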
Posted
by yuanjilee.
Last updated
.
Post not yet marked as solved
1 Reply
390 Views
Is it possible to limit the words recognized to a list of specific words provided in contextualStrings or another similar parameter? My impression is that contextualStrings adds additional words that might not be in the dictionary. Instead, I want to limit the words recognized.
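As far as I know there is no parameter that hard-restricts the vocabulary; contextualStrings only biases recognition toward extra phrases. A common workaround, sketched below, is to bias with contextualStrings and then filter the transcription against your own allow-list (the word list here is just an example):

import Speech

// Bias recognition toward the allowed words, then keep only transcription
// segments that match the allow-list.
let allowedWords: Set<String> = ["roll", "skip", "stop"]

func configure(_ request: SFSpeechAudioBufferRecognitionRequest) {
    request.contextualStrings = Array(allowedWords)   // biasing only, not a hard restriction
    request.shouldReportPartialResults = true
}

func filteredWords(from result: SFSpeechRecognitionResult) -> [String] {
    result.bestTranscription.segments
        .map { $0.substring.lowercased() }
        .filter { allowedWords.contains($0) }
}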
Posted
by skcureton.
Last updated
.
Post not yet marked as solved
0 Replies
370 Views
How can I achieve speech-to-text input in tvOS? For example, the user taps a mic button and speaks, and the audio should be converted into text.
Posted
by ShankarKr.
Last updated
.
Post not yet marked as solved
1 Reply
385 Views
I'm developing a project where I want to transcribe live speech from the user on iOS devices. I wanted to test out the Speech framework by downloading the sample code from https://developer.apple.com/documentation/speech/recognizing_speech_in_live_audio. I'm using Xcode 15 and running it on an iPad with iOS 17 installed. I run the app and manage to approve the permissions to use the microphone and live speech transcription, but as soon as I press 'start recording', I get the following error in Xcode, and nothing happens on the iPad screen.
+[SFUtilities issueReadSandboxExtensionForFilePath:error:] issueReadSandboxExtensionForFilePath:error:: Inaccessible file (/var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab) : error=Error Domain=kAFAssistantErrorDomain Code=203 "Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:" UserInfo={NSLocalizedDescription=Failed to access path: /var/mobile/Containers/Data/Application/1F1AB092-95F2-4E5F-A369-475E15114F26/Library/Caches/Vocab method:issueReadSandboxExtensionForFilePath:error:}
Can someone guide me in the right direction to fix this?
Posted Last updated
.
Post not yet marked as solved
12 Replies
5.7k Views
Setting a voice for AVSpeechSynthesizer leads to a heap buffer overflow. Turn on Address Sanitizer in Xcode 14 beta and run the following code. Is anybody else experiencing this problem, and is there any workaround?

let synthesizer = AVSpeechSynthesizer()
var synthVoice: AVSpeechSynthesisVoice?

func speak() {
    let voices = AVSpeechSynthesisVoice.speechVoices()
    for voice in voices {
        if voice.name == "Daniel" {   // select e.g. the Daniel voice
            synthVoice = voice
        }
    }
    let utterance = AVSpeechUtterance(string: "Test 1 2 3")
    if let synthVoice = synthVoice {
        utterance.voice = synthVoice
    }
    synthesizer.speak(utterance)   // AddressSanitizer: heap-buffer-overflow
}
Posted Last updated
.
Post not yet marked as solved
3 Replies
1.2k Views
It's a little bit unclear to me whether I get a list of available voices or only the voices installed on the device. If an app requests a voice which is available but not installed, what happens? Is it up to the app to install the missing voice, or is this done by iOS automatically? Also, for some languages both a male and a female voice are not offered. What's the reason for that?
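In my experience, speechVoices() appears to return only the voices currently installed on the device (users add more under Settings > Accessibility > Spoken Content), and an app cannot trigger a voice download itself; treat that as an observation rather than a documented guarantee. A small sketch to inspect what a given device offers, including gender:

import AVFoundation

// List what's installed and see which genders each language actually offers.
let voices = AVSpeechSynthesisVoice.speechVoices()
for voice in voices {
    print(voice.language, voice.name, voice.quality.rawValue, voice.gender.rawValue)
}

// Asking for a language falls back to whatever default voice exists for it,
// or nil if none is installed.
let french = AVSpeechSynthesisVoice(language: "fr-FR")
print(french?.identifier ?? "no fr-FR voice installed")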
Posted Last updated
.
Post not yet marked as solved
0 Replies
429 Views
I added the word "sactional" to speechRequest.contextualStrings, but when speaking it, the recognizer always autocorrects it to "sectional". I even tried training a custom speech model by adding SFCustomLanguageModelData.PhraseCount(phrase: "sactional", count: 10) and generating a model, but that didn't work either. Is there a better way to make it work?
Posted Last updated
.
Post not yet marked as solved
1 Reply
775 Views
AVSpeechSynthesizer is not working. It was working perfectly before. Below is my Objective-C code.

-(void)playVoiceMemoforMessageEVO:(NSString*)msg {
    [[AVAudioSession sharedInstance] overrideOutputAudioPort:AVAudioSessionPortOverrideSpeaker error:nil];
    AVSpeechSynthesizer *synthesizer = [[AVSpeechSynthesizer alloc] init];
    AVSpeechUtterance *speechutt = [AVSpeechUtterance speechUtteranceWithString:msg];
    speechutt.volume = 90.0f;
    speechutt.rate = 0.50f;
    speechutt.pitchMultiplier = 0.80f;
    [speechutt setRate:0.3f];
    speechutt.voice = [AVSpeechSynthesisVoice voiceWithLanguage:@"en-us"];
    [synthesizer speakUtterance:speechutt];
}

Please help me solve this issue.
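One common cause of this exact symptom is that the synthesizer is a local variable, so it is released as soon as the method returns and never speaks; keeping it in a strong property usually fixes it. A sketch of that shape (shown in Swift for brevity, but the same change applies to the Objective-C version); note also that volume is documented as 0.0 to 1.0, so 90.0f is out of range:

import AVFoundation

final class VoiceMemoPlayer {
    // Keep a strong reference; a synthesizer created inside the method is
    // deallocated before it gets a chance to speak.
    private let synthesizer = AVSpeechSynthesizer()

    func playVoiceMemo(for message: String) {
        try? AVAudioSession.sharedInstance().overrideOutputAudioPort(.speaker)

        let utterance = AVSpeechUtterance(string: message)
        utterance.rate = 0.3
        utterance.pitchMultiplier = 0.8
        utterance.volume = 1.0                      // valid range is 0.0...1.0, not 90.0
        utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
        synthesizer.speak(utterance)
    }
}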
Posted
by mubarak.
Last updated
.
Post not yet marked as solved
0 Replies
291 Views
As per Apple's documentation, the didCancel delegate method should be called after a stopSpeaking(at:) call; instead the didFinish method is being called. In reality, as far as I've checked, it works perfectly on iOS 13.2.2 but not after iOS 15. Is there anything I'm missing in the configuration? It did work perfectly without any extra setup on earlier versions.
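A small probe like the following, with the delegate set before speaking and stopSpeaking(at:) called mid-utterance, makes it easy to log which callback each iOS version actually delivers; the one-second delay is just an arbitrary choice for the test:

import AVFoundation

final class SpeechDelegateProbe: NSObject, AVSpeechSynthesizerDelegate {
    private let synthesizer = AVSpeechSynthesizer()

    func run() {
        synthesizer.delegate = self
        synthesizer.speak(AVSpeechUtterance(string: "A long sentence used to test cancellation behaviour."))
        DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
            // Stopping mid-utterance: per the docs, didCancel should fire here.
            self.synthesizer.stopSpeaking(at: .immediate)
        }
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {
        print("didCancel")
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("didFinish")
    }
}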
Posted Last updated
.