02-HarmonyOS5-SpeechRecognizer-Case

1 zhousg 0 6/11/2025, 9:06:42 AM
Case Description: This is a real-time speech-to-text case implemented on top of the AI basic voice services. It captures audio through the microphone and converts it into text in real time.

import { speechRecognizer } from '@kit.CoreSpeechKit' import { abilityAccessCtrl } from '@kit.AbilityKit' import { promptAction } from '@kit.ArkUI'

/**
 * Real-time speech-to-text demo component.
 *
 * Captures microphone audio while the button is long-pressed and streams the
 * recognized text into the Text view via the CoreSpeechKit ASR engine.
 */
@Entry
@ComponentV2
struct SpeechRecognizer {
  // True while a recognition session is in progress.
  @Local isRecording: boolean = false
  // Latest recognized text shown in the UI (full text so far, not a delta).
  @Local text: string = ''
  // Cached result of the microphone permission request.
  hasPermissions: boolean = false
  // Engine created per session in startRecord and released in closeRecord.
  asrEngine?: speechRecognizer.SpeechRecognitionEngine
  // Single session id shared by startListening/finish so the two calls
  // cannot drift apart (was hard-coded '10000' in two places).
  private readonly sessionId: string = '10000'

  aboutToAppear(): void {
    // Request microphone permission up front so startRecord can rely on the
    // cached hasPermissions flag.
    this.requestPermissions()
  }

  async requestPermissions() {
    const atManager = abilityAccessCtrl.createAtManager();
    const res = await atManager.requestPermissionsFromUser(getContext(), ['ohos.permission.MICROPHONE'])
    // Granted only if every requested permission was approved.
    this.hasPermissions =
      res.authResults.every(grantStatus => grantStatus === abilityAccessCtrl.GrantStatus.PERMISSION_GRANTED)
  }

  // Start microphone recognition (long-press onAction handler).
  async startRecord() {
    if (canIUse('SystemCapability.AI.SpeechRecognizer')) {
      if (!this.hasPermissions) {
        return promptAction.showToast({ message: 'Microphone not authorized' })
      }
      if (this.isRecording) {
        return promptAction.showToast({ message: 'Recording...' })
      }
      this.isRecording = true
      // A fresh engine per session pairs with the shutdown() in closeRecord.
      this.asrEngine = await speechRecognizer.createEngine({
        language: 'zh-CN',
        online: 1
      })
      const _this = this
      this.asrEngine.setListener({
        onStart(sessionId: string, eventMessage: string) {
        },
        onEvent(sessionId: string, eventCode: number, eventMessage: string) {
        },
        onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
          _this.text = result.result
          if (result.isLast) {
            _this.isRecording = false
          }
        },
        onComplete(sessionId: string, eventMessage: string) {
        },
        onError(sessionId: string, errorCode: number, errorMessage: string) {
          // Fix: the original left this empty, so a failed session kept
          // isRecording stuck at true (blocking every later startRecord)
          // and swallowed the error silently.
          _this.isRecording = false
          promptAction.showToast({ message: `Recognition error ${errorCode}: ${errorMessage}` })
        }
      })
      const audioParam: speechRecognizer.AudioInfo = {
        audioType: 'pcm',
        sampleRate: 16000,
        soundChannel: 1,
        sampleBit: 16
      }
      // vadBegin/vadEnd are voice-activity-detection silence windows in ms;
      // maxAudioDuration caps one session at 20 s — TODO confirm against
      // the CoreSpeechKit extraParams documentation.
      const extraParam: Record<string, Object> = {
        "recognitionMode": 0,
        "vadBegin": 2000,
        "vadEnd": 3000,
        "maxAudioDuration": 20000
      }
      const recognizerParams: speechRecognizer.StartParams = {
        sessionId: this.sessionId,
        audioInfo: audioParam,
        extraParams: extraParam
      }
      this.asrEngine.startListening(recognizerParams)
    }
  }

  // Stop recognition (long-press end/cancel handler).
  async closeRecord() {
    if (canIUse('SystemCapability.AI.SpeechRecognizer')) {
      // Fix: the original called finish() and then cancel(). finish() ends
      // audio input and requests the final result, while cancel() discards
      // the pending session — calling both aborted the final result, so the
      // contradictory cancel() is dropped.
      this.asrEngine?.finish(this.sessionId)
      this.asrEngine?.shutdown()
      // Fix: clear the reference to the destroyed engine and reset the
      // recording flag (onResult's isLast path never fires for a session
      // that ends without a final result).
      this.asrEngine = undefined
      this.isRecording = false
    }
  }

  build() {
    Column() {
      Row() {
        Text(this.text)
          .width('100%')
          .lineHeight(32)
      }
      .alignItems(VerticalAlign.Top)
      .width('100%')
      .layoutWeight(1)

      // While the press is held the label prompts the user to speak.
      Button(this.isRecording ? 'Start Speaking' : 'Press and Speak')
        .width('100%')
        .gesture(LongPressGesture()
          .onAction(() => {
            this.startRecord()
          })
          .onActionEnd(() => {
            this.closeRecord()
          })
          .onActionCancel(() => {
            this.closeRecord()
          }))

    }
    .padding(15)
    .height('100%')
    .width('100%')
  }
}

Comments (0)

No comments yet