OtosakuStreamingASR is a lightweight, real-time speech recognition engine for iOS, built with Swift and Core ML. It streams audio through a fast Conformer-based model with CTC decoding, optimized for on-device inference, and is designed for developers who need efficient audio transcription on mobile.
```swift
import OtosakuStreamingASR

let asr = OtosakuStreamingASR()
try asr.prepareModel(from: modelURL)

asr.subscribe { text in
    print("🗣 Recognized: \(text)")
}

// Raw audio chunk: [Double] in range [-1.0, 1.0], strictly 2559 samples per chunk (80 ms at 16 kHz)
try asr.predictChunk(rawChunk: yourRawAudioChunk)

try asr.stop() // Finalize and decode the remaining buffer
asr.reset()    // Reset internal model state
```
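Feeding live microphone audio means resampling to 16 kHz mono and slicing the stream into exactly 2559-sample chunks before calling `predictChunk`. The sketch below is one way to do that with `AVAudioEngine` and `AVAudioConverter`; the capture and resampling code is an assumption on top of standard AVFoundation APIs (it is not part of this library), while the `asr` calls mirror the snippet above.

```swift
import AVFoundation
import OtosakuStreamingASR

/// Minimal sketch: capture microphone audio, resample to 16 kHz mono,
/// and feed fixed 2559-sample chunks to the recognizer.
/// The class name and structure are illustrative, not part of the library.
final class MicrophoneTranscriber {
    private let engine = AVAudioEngine()
    private let asr = OtosakuStreamingASR()
    private var sampleBuffer: [Double] = []
    private let chunkSize = 2559  // 80 ms at 16 kHz, as required by predictChunk
    private let targetFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                             sampleRate: 16_000,
                                             channels: 1,
                                             interleaved: false)!

    func start(modelURL: URL) throws {
        // modelURL is assumed to point at the prepared Core ML model.
        try asr.prepareModel(from: modelURL)
        asr.subscribe { text in
            print("🗣 Recognized: \(text)")
        }

        try AVAudioSession.sharedInstance().setCategory(.record, mode: .measurement)
        try AVAudioSession.sharedInstance().setActive(true)

        let input = engine.inputNode
        let inputFormat = input.outputFormat(forBus: 0)
        let converter = AVAudioConverter(from: inputFormat, to: targetFormat)!

        input.installTap(onBus: 0, bufferSize: 4096, format: inputFormat) { [weak self] buffer, _ in
            guard let self = self else { return }

            // Resample the hardware buffer to 16 kHz mono Float32.
            let ratio = self.targetFormat.sampleRate / inputFormat.sampleRate
            let capacity = AVAudioFrameCount(Double(buffer.frameLength) * ratio) + 1
            guard let converted = AVAudioPCMBuffer(pcmFormat: self.targetFormat,
                                                   frameCapacity: capacity) else { return }
            var consumed = false
            var error: NSError?
            converter.convert(to: converted, error: &error) { _, outStatus in
                if consumed {
                    outStatus.pointee = .noDataNow
                    return nil
                }
                consumed = true
                outStatus.pointee = .haveData
                return buffer
            }
            guard error == nil, let channel = converted.floatChannelData?[0] else { return }

            // Accumulate samples as Double in [-1.0, 1.0] and emit full 2559-sample chunks.
            let frames = Int(converted.frameLength)
            self.sampleBuffer.append(contentsOf: (0..<frames).map { Double(channel[$0]) })
            while self.sampleBuffer.count >= self.chunkSize {
                let chunk = Array(self.sampleBuffer.prefix(self.chunkSize))
                self.sampleBuffer.removeFirst(self.chunkSize)
                try? self.asr.predictChunk(rawChunk: chunk)
            }
        }

        engine.prepare()
        try engine.start()
    }

    func stop() throws {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        try asr.stop()  // Finalize and decode whatever is left in the buffer
    }
}
```

Exact chunk length matters: `predictChunk` expects strictly 2559 samples, so buffer the converted audio and slice it yourself rather than forwarding whatever frame count the tap happens to deliver.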
🧠 Model Details
- Architecture: Fast Conformer (Cache-Aware Streaming)
- Language: 🇷🇺 Russian (fine-tuned from English)
- Training: 250 hours of Russian speech (30 epochs)
- WER (Word Error Rate):
  - Russian (fine-tuned): 11%
  - English (before fine-tuning): 6.5% on LibriSpeech test-other