Fix SAPI4 failing to load some voices (#17726)

The SAPI4 synth driver fails to load some voices, either because the voices do not support certain parameters, or because they do not expect the client to create multiple instances of ITTSCentral objects (feature flag TTSFEATURE_SINGLEINSTANCE).

The synth driver tries to detect whether each parameter is supported when loading the voice. However, removeSetting compares against the name attribute of each setting, which is now called id. This can cause errors when loading a voice that does not support all the parameters.

Description of user facing changes

Some SAPI4 voices will no longer fail to load.

Description of development approach

- In removeSetting, `if s.name == name` is changed to `if s.id == name`.
- Before creating a new ITTSCentral object, and when terminate is called, both _ttsCentral and _ttsAttrs are set to None to release the previous ITTSCentral object.
- The exception thrown from _ttsCentral.UnRegister is caught and logged instead of propagated. Some voices do not handle this call well, and the whole ITTSCentral object is released anyway.
- Some voices keep the paused state even after resetting, meaning that they still pause the audio after _ttsCentral.AudioReset and silence the output. A variable _paused tracks the paused state; if the voice is paused, it is unpaused before resetting.
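
The removeSetting fix can be illustrated with a minimal standalone sketch. `FakeSetting` and `remove_setting` are hypothetical stand-ins, not NVDA's actual classes; the sketch only assumes, as the message above describes, that each setting object exposes an `id` attribute and no longer has a `name` attribute.

```python
from dataclasses import dataclass


@dataclass
class FakeSetting:
    """Hypothetical stand-in for a synth setting that exposes `id` but not `name`."""

    id: str
    displayName: str


def remove_setting(settings: list, name: str) -> bool:
    """Remove the first setting whose `id` equals `name`.

    Returns True if a setting was removed, False if none matched.
    """
    for i, s in enumerate(settings):
        # Comparing s.name here would raise AttributeError on FakeSetting,
        # which mirrors the failure mode described in the commit message.
        if s.id == name:
            del settings[i]
            return True
    return False


settings = [
    FakeSetting("rate", "Rate"),
    FakeSetting("pitch", "Pitch"),
    FakeSetting("volume", "Volume"),
]
removed = remove_setting(settings, "pitch")
remaining_ids = [s.id for s in settings]
```

Returning as soon as a match is deleted avoids continuing to iterate over a list that has just been mutated.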
nvaccess · Feb 25, 2025 · ba0a057
1 parent cde7967

Showing 2 changed files with 23 additions and 2 deletions.
diff --git a/source/synthDrivers/sapi4.py b/source/synthDrivers/sapi4.py
@@ -188,10 +188,13 @@ def __init__(self):
 		self._rateDelta = 0
 		self._pitchDelta = 0
 		self._volume = 100
+		self._paused = False
 		self.voice = str(self._enginesList[0].gModeID)
 
 	def terminate(self):
 		self._bufSink._allowDelete = True
+		self._ttsCentral = None
+		self._ttsAttrs = None
 
 	def speak(self, speechSequence: SpeechSequence):
 		textList = []
@@ -268,6 +271,12 @@ def cancel(self):
 			# cancel all pending bookmarks
 			self._bookmarkLists.clear()
 			self._bookmarks = None
+			if self._paused:
+				# Unpause the voice before resetting,
+				# because some voices keep the pausing state
+				# even after resetting.
+				self._ttsCentral.AudioResume()
+				self._paused = False
 			self._ttsCentral.AudioReset()
 		except COMError:
 			log.error("Error cancelling speech", exc_info=True)
@@ -282,11 +291,12 @@ def pause(self, switch: bool):
 				log.debugWarning("Error pausing speech", exc_info=True)
 		else:
 			self._ttsCentral.AudioResume()
+		self._paused = switch
 
 	def removeSetting(self, name):
 		# Putting it here because currently no other synths make use of it. OrderedDict, where you are?
 		for i, s in enumerate(self.supportedSettings):
-			if s.name == name:
+			if s.id == name:
 				del self.supportedSettings[i]
 				return
 
@@ -305,7 +315,17 @@ def _set_voice(self, val):
 		self._ttsAudio = CoCreateInstance(CLSID_MMAudioDest, IAudioMultiMediaDevice)
 		self._ttsAudio.DeviceNumSet(_mmDeviceEndpointIdToWaveOutId(config.conf["audio"]["outputDevice"]))
 		if self._ttsCentral:
-			self._ttsCentral.UnRegister(self._sinkRegKey)
+			try:
+				# Some SAPI4 synthesizers may fail this call.
+				self._ttsCentral.UnRegister(self._sinkRegKey)
+			except COMError:
+				log.debugWarning("Error unregistering ITTSCentral sink", exc_info=True)
+			# Some SAPI4 synthesizers assume that only one instance of ITTSCentral
+			# will be created by the client, and will stop working if more are created.
+			# Here we make sure that the previous _ttsCentral is released
+			# before the next _ttsCentral is created.
+			self._ttsCentral = None
+			self._ttsAttrs = None
 		self._ttsCentral = POINTER(ITTSCentralW)()
 		self._ttsEngines.Select(self._currentMode.gModeID, byref(self._ttsCentral), self._ttsAudio)
 		self._ttsCentral.Register(self._sinkPtr, ITTSNotifySinkW._iid_, byref(self._sinkRegKey))
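
The pause handling in the hunks above can be sketched in isolation. `FakeEngine` and `Driver` are hypothetical stand-ins rather than the real comtypes interfaces; the engine is written to stay paused across a reset, which is the behaviour the commit attributes to some SAPI4 voices.

```python
class FakeEngine:
    """Hypothetical engine that stays paused across AudioReset."""

    def __init__(self):
        self.paused = False
        self.calls = []

    def AudioPause(self):
        self.paused = True
        self.calls.append("AudioPause")

    def AudioResume(self):
        self.paused = False
        self.calls.append("AudioResume")

    def AudioReset(self):
        # Deliberately does NOT clear the paused flag.
        self.calls.append("AudioReset")


class Driver:
    """Hypothetical driver that tracks the paused state itself."""

    def __init__(self, engine):
        self._engine = engine
        self._paused = False

    def pause(self, switch: bool):
        if switch:
            self._engine.AudioPause()
        else:
            self._engine.AudioResume()
        # Remember the requested state so cancel() can undo it.
        self._paused = switch

    def cancel(self):
        if self._paused:
            # Resume first, because the reset alone would leave the engine paused.
            self._engine.AudioResume()
            self._paused = False
        self._engine.AudioReset()


engine = FakeEngine()
driver = Driver(engine)
driver.pause(True)
driver.cancel()
```

Without the tracked flag, the reset would leave this fake engine paused and any later speech would be silent.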

diff --git a/user_docs/en/changes.md b/user_docs/en/changes.md
@@ -126,6 +126,7 @@ In any document, if the cursor is on the last line, it will be moved to the end
 * When anchor links point to the same object as the virtual caret is placed, NVDA no longer fails to scroll to the link destination. (#17669, @nvdaes)
 * Voice parameters, such as rate and volume, will no longer be reset to default when using the synth settings ring to change between voices in the SAPI5 and SAPI4 synthesizer. (#17693, #2320, @gexgd0419)
 * The NVDA Highlighter Window icon is no longer fixed in the taskbar after restarting Explorer. (#17696, @hwf1324)
+* Fixed an issue where some SAPI4 voices (e.g. IBM TTS Chinese) cannot be loaded. (#17726, @gexgd0419)
 
 ### Changes for Developers