[Feature request]: Low latency decoder for mediatek chips #82

Open
iverdugoacl opened this issue Jan 8, 2025 · 14 comments
Labels
enhancement New feature or request

Comments

@iverdugoacl

Is your feature request related to a problem? Please describe.

Hello! Is it possible to use a low-latency decoder for MediaTek chipsets in Moonlight? According to this post, the low-latency feature is available: moonlight-stream#1241 (comment)

Describe the solution you'd like

To have access to an experimental low-latency decoding feature for MediaTek chipsets

Describe alternatives you've considered

Without an alternative

Screenshots

Not applicable

@iverdugoacl iverdugoacl added the enhancement New feature or request label Jan 8, 2025
@ClassicOldSong
Owner

I think that's hardware dependent: maybe newer MTK chips like the 9000 series have this decoder, but older ones might not.

@farika

farika commented Jan 8, 2025

I think that's hardware dependent: maybe newer MTK chips like the 9000 series have this decoder, but older ones might not.

We should probably ask owners of older MediaTek chips for their decoder lists, but at least this could let recent MTK chips play in good conditions. Parsec gives you low latency directly in all games, even when using the C2.mtk.avc or hevc.decoder and not the low-latency one. On Moonlight it's more complicated: you have to start the game before the stream and, above all, not go through the desktop, otherwise the decode time remains high. What's more, in Xbox games I manage to get down to 4 ms, whereas that's not the case in Steam games. If Parsec can do it, I'm sure Moonlight can also close this two-year-old bug :).

@ClassicOldSong
Owner

Parsec's report differs from Moonlight's, they're not actually the same thing. My own tests show that Parsec has worse performance than Moonlight, and the rest are server side things.

You can try the latest pre-release and see if Warp Drive mode can improve anything.

@farika

farika commented Jan 8, 2025

Working well for me with the pre-release version + warp mode on a Galaxy Tab S10 Ultra (Dimensity 9300+), without the desktop bug or the Steam bug (tested Marvel Rivals on Steam and Age on Xbox). Decoding time is a little higher than with Parsec, but you already mentioned that the reports differ. The low-latency decoder is used.
[Attached images: 1000000204, 1000000205]
If only services such as Xcloud and GFN could achieve the same result... They stutter with 15-20 ms! You should advise them :(

@ClassicOldSong
Owner

Hmmmm, you mean Artemis can already use the MTK low-latency decoder and also got rid of the other issues? I didn't think Warp Drive mode could solve so many things...

@farika

farika commented Jan 8, 2025

Artemis could already use the low-latency decoder without warp mode. But I had a weird problem with getting 4-5 ms:

  • I had to launch the game before starting the stream.
  • Above all, I couldn't go through the desktop; otherwise I had to turn the tablet screen off, turn it back on, enter the PIN again, and then restart the stream.
  • I only got 4-5 ms in Xbox games, not in Steam games (high latency).

All that weird stuff is now solved, which is why I tested with Marvel Rivals on Steam: before, I couldn't get low latency there, even when using the right decoder.
The low-latency decoder is also used for H264. I can't test AV1 because I don't have the encoder.

I don't know how Artemis manages to ask the tablet which decoder to use and switch to the low-latency one. For WebRTC streams (Xcloud / GFN), Chromium browsers (Chrome / Edge) go directly to c2.mtk.hevc.decoder, and so does the native GeForce Now application, with bad results, of course. I'd have to look through the logcat.

@farika

farika commented Jan 20, 2025

"Also be aware that Artemis/Moonlight's status overlay doesn't show latency correctly, trust your feeling."

Do you think this can be fixed, so that there is a real benchmark behind the different options such as Warp Drive?

@ClassicOldSong
Owner

Even with the false value it still shows a difference on my devices. Currently there's no good method to benchmark latency, but you can use the frame skipping test from TestUFO, stream by mirroring your physical display and use a third device to do high speed recording to compare how the white square advances on both devices.
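To turn such a recording into a number, one rough approach is to count how many recorded frames separate the white square reaching the same position on the physical display and on the streaming device. The sketch below assumes a constant recording frame rate; the function name and the example figures are purely illustrative, not measurements from this thread.

```python
def latency_ms(source_frame, stream_frame, recording_fps):
    """Latency implied by a high-speed recording.

    source_frame:  index of the recorded frame where the square reaches a
                   given position on the physical display
    stream_frame:  index of the recorded frame where it reaches the same
                   position on the streaming device
    recording_fps: frame rate of the high-speed camera (assumed constant)
    """
    if recording_fps <= 0:
        raise ValueError("recording_fps must be positive")
    return (stream_frame - source_frame) * 1000.0 / recording_fps


# Illustrative numbers only: a 240 fps recording where the streamed
# square lags the physical display by 12 recorded frames.
print(latency_ms(100, 112, 240))
```

The resolution of this method is one recorded frame, so a 240 fps camera cannot distinguish differences smaller than about 4.2 ms.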

@farika

farika commented Jan 20, 2025

What I find strange is that if I limit Artemis to 30 FPS, I can't get below 16 ms of decoding time, whereas if I set it to 120 FPS, I'm under 5 ms. Shouldn't it be the other way round, since the more frames there are to decode, the more it loads the system?
It also happens for me on Windows, even at 120 FPS. If I don't move my windows very much, the decoding time is high, but if I move them quickly it drops to 4 ms. Maybe this is normal behaviour?

I see no difference between warp mode and warp 2.

If you want to see how the low-latency decoder is chosen on my S10:
01-20 14:15:11.669 25830 26003 I moonlight-common-c: Initializing control stream...
01-20 14:15:11.669 25830 26003 I moonlight-common-c: done
01-20 14:15:11.669 25830 26003 I moonlight-common-c: Initializing video stream...
01-20 14:15:11.670 25830 26003 I moonlight-common-c: done
01-20 14:15:11.670 25830 26003 I moonlight-common-c: Initializing input stream...
01-20 14:15:11.670 25830 26003 I moonlight-common-c: done
01-20 14:15:11.670 25830 26003 I moonlight-common-c: Starting control stream...
01-20 14:15:11.676 1053 1053 I Layer : id=1511 removeFromCurrentState DimTransitionLayer for Surface(name=ActivityRecord{4aca74a u0 com.limelight.noir/com.limelight.Game t102})/@0xabe08a3#1511 (154)
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: Display 4627039422300187648 HWC layers:
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: DEVICE | 0xb400006f775968b0 | 0102 | RGBA_8888 | 0.0 0.0 2960.0 57.0 | 0 1791 2960 1848 | bbq-wrapper#1509
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: DEVICE | 0xb400006f775bda50 | 0102 | RGBA_8888 | 0.0 57.0 2960.0 1848.0 | 0 0 2960 1791 | com.limelight.noir/com.limelight.AppView$_25830#1477
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: SOLID_COLOR | | 0002 | Unknown | 0.0 0.0 0.0 0.0 | 0 95 2960 1760 | Background for SurfaceView[com.limelight.noir/com.limelight.Game]@0#1500
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: CLIENT | 0xb400006f775cea90 | 0100 | RGBA_8888 | 0.0 0.0 2960.0 1844.0 | 0 4 2960 1848 | com.limelight.noir/com.limelight.Game$_25830#1496
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: CLIENT | | 0000 | Unknown | 0.0 0.0 0.0 0.0 | 0 0 2960 1848 | Dim Layer for - Task=102#1505
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: CLIENT | 0xb400006f775b6990 | 0100 | RGBA_8888 | 0.0 0.0 1739.0 395.0 | 610 688 2349 1083 | Établissement de la connexion$_25830#1495
01-20 14:15:11.681 1053 1053 D SurfaceFlinger: CLIENT | 0xb400006f775cd330 | 0100 | RGBA_8888 | 0.0 0.0 67.0 1
01-20 14:15:11.682 1053 1053 D SurfaceFlinger: 61.0 | 2893 489 2960 650 | com.sec.android.app.launcher/com.sam[...]e.edge.CocktailBarService$_2668#1393
01-20 14:15:11.682 1053 1053 D SurfaceFlinger: CLIENT | 0xb400006f775b2630 | 0100 | RGBA_8888 | 0.0 0.0 2960.0 84.0 | 0 1764 2960 1848 | TaskbarWindow$_2668#153
01-20 14:15:11.682 1053 1053 D SurfaceFlinger:
01-20 14:15:11.682 25830 26003 I moonlight-common-c: done
01-20 14:15:11.682 25830 26003 I moonlight-common-c: Starting video stream...
01-20 14:15:11.683 25830 26003 I com.limelight.LimeLog: Adaptive playback supported (FEATURE_AdaptivePlayback)
01-20 14:15:11.683 25830 26003 I com.limelight.LimeLog: Decoder supports fused IDR frames (FEATURE_AdaptivePlayback)
01-20 14:15:11.683 25830 26003 I com.limelight.LimeLog: Decoder configuration try: 0
01-20 14:15:11.683 25830 26003 I com.limelight.LimeLog: Low latency decoding mode supported (FEATURE_LowLatency)
01-20 14:15:11.687 25830 26014 I CCodec : state->set(ALLOCATING)
01-20 14:15:11.688 25830 26015 I CCodec : allocate(c2.mtk.hevc.decoder.lowlatency)
01-20 14:15:11.689 25830 26015 I Codec2Client: Available Codec2 services: "default" "default0" "software"
01-20 14:15:11.694 25830 26015 I CCodec : setting up 'default' as default (vendor) store
01-20 14:15:11.697 1008 1588 D C2MtkComponentStore: trytofind: name=c2.mtk.hevc.decoder.lowlatency
01-20 14:15:11.697 1008 1588 D C2MtkComponentStore: find: name=c2.mtk.hevc.decoder.lowlatency, key=c2.mtk.hevc.decoder.lowlatency
01-20 14:15:11.697 1008 1588 V C2MtkVdec: in CreateCodec2HevcLowLatencyFactory
01-20 14:15:11.698 1008 1588 V C2MtkVdec: in CreateCodec2InitFactory
01-20 14:15:11.698 1008 1588 D C2MtkComponentStore: onNumberOfVideoInstancesChanged: Decoder 1, Encoder 0
01-20 14:15:11.698 1008 1588 V C2MtkVdec: Number of instances increased to 1(c2.mtk.hevc.decoder.lowlatency)
01-20 14:15:11.698 1008 1588 D C2MtkVdec: [0xB40000760CBA31B0] IntfImpl: name=c2.mtk.hevc.decoder.lowlatency, kind=1, domain=1, mediaType=video/hevc, driverIntf=v4l2
01-20 14:15:11.700 1008 1588 D C2MtkVdec: [0xB40000760CBA31B0] Max supported level: 24838
01-20 14:15:11.700 1008 1588 D C2MtkVdec: [0xB40000760CBA31B0] Supported resolution: max(width 8192, height 4352), min(width 16, height 16)
01-20 14:15:11.700 1008 1588 D C2MtkVdec: [0xB40000760CBA31B0] Alignment required for output buffer: width 64, height 64
01-20 14:15:11.701 1008 1588 D C2MtkVdec: (video/hevc): MaxInputSizeSetter: 2097152 -> 2097152 bytes, width 320, height 240
01-20 14:15:11.701 1008 1588 V C2MtkVdec: Pixel format 0x0 not supported
01-20 14:15:11.701 1008 1588 D C2MtkVdec: CropSetter: old 319,239 : new 319,239
01-20 14:15:11.701 1008 1588 V C2MtkVdec: OutputBufferAllocConfigSetter: old 1 : new 1
01-20 14:15:11.702 1008 1588 D C2MtkVdecExt: [0xB4000075BCBA5D38] ExtIntfImpl: name=c2.mtk.hevc.decoder.lowlatency, kind=1, domain=1, mediaType=video/hevc
01-20 14:15:11.702 25830 25830 I InsetsController: cancelAnimation: types=statusBars, animType=1, host=com.limelight.noir/com.limelight.Game, from=android.view.InsetsController.notifyFinished:1783 android.view.InsetsAnimationThreadControlRunner$1.lambda$notifyFinished$0:85 android.view.InsetsAnimationThreadControlRunner$1.$r8$lambda$RAf1SfIREsj9-wH5FOigMy6eLkM:0
01-20 14:15:11.702 1008 1588 E C2MtkVdec: [0xB40000775CBC1FE8] Lock on weak_ptr (mIntf) returns null.
01-20 14:15:11.702 1008 1588 V C2MtkVdecExt: VdecVppIndependentDisabledSetter, (0 1 1)
01-20 14:15:11.702 1008 1588 V C2MtkVdecExt: VdecMlvecDriverVersionSetter number:65536
01-20 14:15:11.702 1008 1588 D C2MtkComponentStore: onRuntimeInfoUpdated[0xB40000760CBA31B0]: new instance added.
01-20 14:15:11.702 1008 1588 D C2MtkComponentStore: onRuntimeInfoUpdated[0xB40000760CBA31B0]: invalid, cUid 0, cPid 0, psDisabled 0, isCpuDemanding 0, cluster , minFreq
01-20 14:15:11.702 1008 1588 D C2MtkComponentStore: onRuntimeInfoUpdated[0xB40000760CBA31B0]: normal mode. isHeavyLoading 0, numOfValInst 0, psDisable 0
01-20 14:15:11.702 1008 1588 D C2MtkComponent: [0xB40000762CBABAF0] setWorkflow: PROCESSED->COMPLETED
01-20 14:15:11.702 1008 1588 D C2MtkComponent: [0xB40000762CBABAF0] setWorkflow: UNPROCESSED->PROCESSED
01-20 14:15:11.702 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.preproc thread stared.
01-20 14:15:11.702 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.preprocasyncs1 thread stared.
01-20 14:15:11.702 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.preprocasyncs2 thread stared.
01-20 14:15:11.703 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.postproc thread stared.
01-20 14:15:11.703 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.postprocasyncs1 thread stared.
01-20 14:15:11.703 1008 1588 D C2MtkComponent: [0xB40000762CBABAB0] c2.mtk.hevc.decoder.lowlatency.postprocasyncs2 thread stared.
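For anyone who wants to check their own device the same way, here is a minimal sketch that scans a saved logcat dump for the `CCodec : allocate(...)` line seen above and reports which decoder was allocated. The helper names are made up for illustration, and it assumes the log uses the same line format as this dump.

```python
import re

# Matches the Codec2 allocation line, e.g.
# "I CCodec : allocate(c2.mtk.hevc.decoder.lowlatency)"
ALLOCATE_RE = re.compile(r"CCodec\s*:\s*allocate\(([^)]+)\)")


def allocated_codecs(log_text):
    """Return the codec names from every 'CCodec : allocate(...)' line, in order."""
    return ALLOCATE_RE.findall(log_text)


def uses_low_latency(log_text):
    """True if any allocated codec name ends with '.lowlatency'."""
    return any(name.endswith(".lowlatency") for name in allocated_codecs(log_text))


# Three lines copied from the log above.
sample = """\
01-20 14:15:11.683 25830 26003 I com.limelight.LimeLog: Low latency decoding mode supported (FEATURE_LowLatency)
01-20 14:15:11.687 25830 26014 I CCodec : state->set(ALLOCATING)
01-20 14:15:11.688 25830 26015 I CCodec : allocate(c2.mtk.hevc.decoder.lowlatency)
"""

print(allocated_codecs(sample))
print(uses_low_latency(sample))
```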

@ClassicOldSong
Owner

Actually, no: a lower framerate does lower system load when the bitrate is also lower. The delay is basically frame queue and sync delay, which cannot be reduced when streaming at a lower fps.
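A quick bit of arithmetic makes this more concrete. The sketch below only computes the time between consecutive frames at a few frame rates; the idea that queue and sync waits scale with this interval is a simplifying assumption, not a description of how Moonlight measures its decode time.

```python
def frame_interval_ms(fps):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 / fps


for fps in (30, 60, 120):
    print(f"{fps} fps -> {frame_interval_ms(fps):.1f} ms per frame")
```

At 30 fps a new frame arrives only every 33.3 ms, against 8.3 ms at 120 fps, so both of the figures reported above (about 16 ms and under 5 ms) are below one frame interval at their respective frame rates.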

@farika

farika commented Jan 20, 2025

OK, so if I reach less than 5 ms with a mediatek with 120fps, the problem seems to have been solved?

This has nothing to do with Moonlight, but I'm trying to understand how it works and why it doesn't work over WebRTC with the browsers these services use. Why does Artemis manage to select the low-latency decoder when browsers go straight to the classic c2.mtk decoder, with the consequences that we know? I'm not sure browsers use MediaCodecInfo the way native applications do...
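To illustrate the kind of choice being discussed, here is a hypothetical sketch of a client preferring a `.lowlatency` variant when the device lists one. This is not Artemis's actual logic: on Android the decoder list would come from `MediaCodecList`, and low-latency support can be queried through `MediaCodecInfo.CodecCapabilities.FEATURE_LowLatency` (the feature named in the log above). The function and its name-matching rule are assumptions for illustration only.

```python
def pick_decoder(names, mime_token):
    """Pick a decoder whose name contains mime_token, preferring a
    '.lowlatency' variant. Returns None if nothing matches.

    Hypothetical name-based rule, for illustration only.
    """
    candidates = [
        n for n in names
        if mime_token in n.lower() and "decoder" in n.lower()
    ]
    low_latency = [n for n in candidates if n.lower().endswith(".lowlatency")]
    if low_latency:
        return low_latency[0]
    return candidates[0] if candidates else None


# Decoder names taken from this thread.
decoders = ["c2.mtk.hevc.decoder", "c2.mtk.hevc.decoder.lowlatency"]
print(pick_decoder(decoders, "hevc"))
```

A client that simply takes the first matching decoder would end up on `c2.mtk.hevc.decoder` with this list, which matches the behaviour described for the browsers.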

@ClassicOldSong
Owner

I think it's more or less resolved, or at least worked around. Warp mode isn't anything super fancy, just a hack that really works...

@farika

farika commented Jan 20, 2025

Except that those who don't have the low-latency decoder, or can't switch to it, can't get down to a proper decode time, warp mode or not.

@ClassicOldSong
Owner

Yeah, it's still mostly an issue from MTK rather than applications.


3 participants