# Sdcb.FFmpeg: My Open-Source C# Wrapper Library for FFmpeg
Foreword:

This was my session at .NET Conf China 2022 in December 2022. Project address: https://github.com/sdcb/Sdcb.FFmpeg

The slides can be downloaded here: https://io.starworks.cc:88/cv-public/2022/.NET玩转音视频操作FFmpeg.pptx

The session recording can be watched here (starting at 3:19:00): https://bbs.csdn.net/topics/609897502
FFmpeg is a well-known piece of audio/video processing software that I use all the time at work and in daily life. But I am also a .NET programmer, and when I experimented with calling FFmpeg from C#, these were the options:
- Out-of-process invocation, for example:
  - FFmpeg.NET
  - MediaToolkit
  - Xabe.FFmpeg
- P/Invoke against the C API, for example:
  - FFmpeg.AutoGen
  - EmguFFmpeg
  - Sdcb.FFmpeg
The command-line approach has the following pros and cons (a minimal sketch of what it looks like follows the list):

- Pros: easy to learn, quick to get started, and no conflict with the GPL license
- Con: it relies on inter-process interop, managing state through redirected standard streams
- Con: input and output go through files, which makes fine-grained control hard
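To make the process-interop point concrete, here is a minimal sketch (not from the talk) of driving the ffmpeg CLI from C# with `System.Diagnostics.Process`. It assumes `ffmpeg` is on PATH and `input.mp4` exists; note that "state" is only available by scraping free-form text from the redirected stderr:

```csharp
using System;
using System.Diagnostics;

// Out-of-process invocation: all I/O goes through files on disk.
var psi = new ProcessStartInfo
{
    FileName = "ffmpeg",
    Arguments = "-i input.mp4 -c:v libx264 -b:v 600k output.mp4",
    RedirectStandardError = true, // ffmpeg writes its progress to stderr
    UseShellExecute = false,
};
using Process p = Process.Start(psi)!;

// "State management" = parsing text lines such as "frame= 120 fps= 30 ..."
while (p.StandardError.ReadLine() is string line)
{
    Console.WriteLine(line);
}
p.WaitForExit();
Console.WriteLine($"ffmpeg exited with code {p.ExitCode}");
```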
P/Invoke against the C API solves several of these problems nicely, with the following pros and cons (a taste of the pointer-style code follows the list):

- Input and output can live in memory, with control over every single frame
- Performance is more predictable, since the cross-process overhead is gone
- Con: code against the C API is fairly complex
- Con: the industry mostly uses FFmpeg.AutoGen, which mixes C pointers into C#; writing it can be even more cumbersome than the C API itself
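For a taste of what that looks like, here is a hedged sketch of opening a media file through raw FFmpeg.AutoGen-style bindings (requires `AllowUnsafeBlocks`). The function names mirror the FFmpeg C API one-to-one, and every resource is a raw pointer you must free yourself:

```csharp
using System;
using FFmpeg.AutoGen;

unsafe
{
    AVFormatContext* fc = null;
    if (ffmpeg.avformat_open_input(&fc, "input.mp4", null, null) != 0)
        throw new InvalidOperationException("open failed");
    try
    {
        if (ffmpeg.avformat_find_stream_info(fc, null) < 0)
            throw new InvalidOperationException("no stream info");
        Console.WriteLine($"streams: {fc->nb_streams}, duration: {fc->duration}");
    }
    finally
    {
        ffmpeg.avformat_close_input(&fc); // forget this and you leak native memory
    }
}
```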
## What did I do?
Constrained by these difficulties, I took FFmpeg.AutoGen, the open-source project the industry generally uses, as my starting point and built Sdcb.FFmpeg myself. It has the following advantages (a small usage illustration follows the list):

- Keeps the full ability to call the C API directly, and keeps cross-platform support
- Removed and completely rewrote the `ClangMacroParser` dependency, so it can parse more macros than the original
- Switched native library loading from manual `LoadLibrary` calls to automatic `[DllImport]`, which lets .NET Core load the dlls straight from NuGet packages, more in line with .NET community conventions
- Removed all large binary dependencies, and their history, from the repository; they are now downloaded automatically, which shrinks the repository considerably
- Simplified enum names, e.g. `AVCodecID.AV_CODEC_ID_H264` -> `AVCodecID.H264`
- Turned many C macros into C# enums, e.g. `ffmpeg.AV_DICT_MATCH_CASE` -> `AV_DICT_READ.MatchCase`
- On top of the low-level bindings, it provides mid-level (class) wrappers and high-level (helper) wrappers, such as `CodecContext` and `MediaDictionary`
- I publish NuGet packages of the native dynamic libraries, so programs can run directly without installing any external dependency
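Here is a small, hedged illustration of the last few points, reusing only API shapes that appear in the examples later in this post (the exact namespace imports may differ):

```csharp
using Sdcb.FFmpeg.Codecs;
using Sdcb.FFmpeg.Raw;
using Sdcb.FFmpeg.Utils;

// Simplified enum name: AVCodecID.H264 instead of AVCodecID.AV_CODEC_ID_H264.
Codec decoder = Codec.FindDecoderById(AVCodecID.H264);

// MediaDictionary is the high-level wrapper around FFmpeg's AVDictionary;
// it supports plain C# collection-initializer syntax:
var options = new MediaDictionary
{
    ["preset"] = "veryfast",
    ["tune"] = "zerolatency",
};
```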
## NuGet package list

- FFmpeg 5.x:
  - Sdcb.FFmpeg
  - Sdcb.FFmpeg.runtime.windows-x64
- FFmpeg 4.4.x:
  - Sdcb.FFmpeg
  - Sdcb.FFmpeg.runtime.windows-x64
## How to use it on Linux/macOS?

On Linux you don't need these NuGet packages at all. There are many Linux distributions, and most of them already package a library as common as FFmpeg. On Ubuntu 22.04, for example, you can install the FFmpeg 5.x shared libraries with the following commands:
```bash
apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
add-apt-repository ppa:savoury1/ffmpeg5 -y
apt update
apt install ffmpeg -y
```
For FFmpeg 4.x, the shared libraries can be installed with:
```bash
apt update
apt install software-properties-common
add-apt-repository ppa:savoury1/ffmpeg4 -y
apt update
apt install ffmpeg -y
```
On macOS, the shared libraries can be installed with:
```bash
brew install ffmpeg
```
Native NuGet packages tend to be tied to a particular libc, so they don't generalize well, and Linux usually has better ways to obtain these libraries anyway; that's why I don't publish runtime NuGet packages for Linux.

Don't get the wrong idea, though: Sdcb.FFmpeg is tested on Linux as well and runs fine there. GitHub Actions test link: https://github.com/sdcb/Sdcb.FFmpeg/actions
## Why did I start from scratch?

I didn't actually plan to start from scratch. At first I was inspired by EmguFFmpeg, a project by Yu Hongwei, an expert from Beijing. I found FFmpeg.AutoGen genuinely hard to use, but figured that as long as I depended on FFmpeg.AutoGen and added a bit of wrapping, I could save a lot of maintenance work. So through 2020-2021 I kept developing and maintaining the open-source project Sdcb.FFmpegAPIWrapper, which was built entirely on top of FFmpeg.AutoGen. That project was essentially complete at the time (I just never did much promotion, samples, or tutorials).

As the project went deeper, however, I felt more and more that depending directly on FFmpeg.AutoGen made the code too "heavy". The same thing ended up with two spellings, a raw one and a "high-level" one (for example, `AVCodecID.AV_CODEC_ID_H264` and `AVCodecID.H264` existed side by side), and users would most likely get lost. After a long stretch of hesitation I finally made up my mind to rework FFmpeg.AutoGen itself. The whole rework took about a year, and the result is what you see today.
## Six examples demonstrating Sdcb.FFmpeg

### Example 1: generating a video from pure code

You can think of this example as FFmpeg's "Hello World". It requires the following NuGet packages:

- Sdcb.FFmpeg 5.1.2
- Sdcb.FFmpeg.runtime.windows-x64
The following namespaces need to be imported:
- Sdcb.FFmpeg.Codecs
- Sdcb.FFmpeg.Formats
- Sdcb.FFmpeg.Raw
- Sdcb.FFmpeg.Toolboxs.Extensions
- Sdcb.FFmpeg.Toolboxs.Generators
- Sdcb.FFmpeg.Utils
Full code below (click to expand):
```csharp
// this example is based on Sdcb.FFmpeg 5.1.2
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

using FormatContext fc = FormatContext.AllocOutput(formatName: "mp4");
fc.VideoCodec = Codec.CommonEncoders.Libx264;
MediaStream vstream = fc.NewStream(fc.VideoCodec);
using CodecContext vcodec = new CodecContext(fc.VideoCodec)
{
    Width = 800,
    Height = 600,
    TimeBase = new AVRational(1, 30),
    PixelFormat = AVPixelFormat.Yuv420p,
    Flags = AV_CODEC_FLAG.GlobalHeader,
};
vcodec.Open(fc.VideoCodec);
vstream.Codecpar!.CopyFrom(vcodec);
vstream.TimeBase = vcodec.TimeBase;

string outputPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "muxing.mp4");
fc.DumpFormat(streamIndex: 0, outputPath, isOutput: true);

using IOContext io = IOContext.OpenWrite(outputPath);
fc.Pb = io;

fc.WriteHeader();
VideoFrameGenerator.Yuv420pSequence(vcodec.Width, vcodec.Height, 600)
    .ConvertFrames(vcodec)
    .EncodeAllFrames(fc, null, vcodec)
    .WriteAll(fc);
fc.WriteTrailer();
```
After running it you should find a file named `muxing.mp4` on your desktop, generated by the code above. The video looks like this:
Worth mentioning: I wrote `VideoFrameGenerator.Yuv420pSequence`, which takes just a few parameters and returns an `IEnumerable<Frame>` (or, in other examples, an `IEnumerable<Packet>`). This is a very common pattern throughout my project; it shows off the concise, expressive side of C# while still guaranteeing resource management and memory release.
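If you wonder what such a generator-style method looks like inside, here is a minimal sketch of the pattern, modeled on the `BgraFrameExtensions.ToBgraFrame` code in example 4 below. The body and the namespace import are illustrative, not the actual `Yuv420pSequence` source:

```csharp
using System.Collections.Generic;
using Sdcb.FFmpeg.Utils;

public static class FrameSequenceSketch
{
    // One reusable Frame is lazily refilled per iteration. Because the
    // iterator wraps it in `using`, the native memory is released even if
    // the consumer stops enumerating early.
    public static IEnumerable<Frame> GenerateFrames(int width, int height, int count)
    {
        using Frame frame = new Frame();
        for (int i = 0; i < count; ++i)
        {
            // ... fill `frame` with the pixels of picture i here ...
            yield return frame;
        }
    }
}
```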
### Example 2: compressing a video

This example shows how to compress a video down to the following parameters, which also happen to be the parameters under which the WeChat Windows desktop client won't re-compress your video:

- Codec: H264
- Video bitrate: under 600 kbps
- Video resolution: unrestricted, but a long edge of 960 is recommended
- Audio codec: AAC
- Audio bitrate: 48 kbps
Required NuGet packages:
- Sdcb.FFmpeg 5.1.2
- Sdcb.FFmpeg.runtime.windows-x64
Required namespaces:
- Sdcb.FFmpeg.Codecs
- Sdcb.FFmpeg.Common
- Sdcb.FFmpeg.Filters
- Sdcb.FFmpeg.Formats
- Sdcb.FFmpeg.Raw
- Sdcb.FFmpeg.Toolboxs
- Sdcb.FFmpeg.Toolboxs.Extensions
- Sdcb.FFmpeg.Toolboxs.FilterTools
- Sdcb.FFmpeg.Toolboxs.Generators
- Sdcb.FFmpeg.Utils
- static Sdcb.FFmpeg.Raw.ffmpeg
- System.Collections.Concurrent
- System.Runtime.CompilerServices
- System.Threading.Tasks
Full code below (click to expand):
```csharp
void Main()
{
    FFmpegLogger.LogLevel = LogLevel.Error;
    FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

    Task.Run(() => A7r3VideoToWechat(@"Y:\a7r3\2022-12-12\C0060.MP4")).Wait();
}

void A7r3VideoToWechat(string mp4Path)
{
    using FormatContext inFc = FormatContext.OpenInputUrl(mp4Path);
    inFc.LoadStreamInfo();

    // prepare input stream/codec
    MediaStream inAudioStream = inFc.GetAudioStream();
    using CodecContext audioDecoder = new(Codec.FindDecoderById(inAudioStream.Codecpar!.CodecId));
    audioDecoder.FillParameters(inAudioStream.Codecpar);
    audioDecoder.Open();
    audioDecoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioDecoder.Channels);

    MediaStream inVideoStream = inFc.GetVideoStream();
    using CodecContext videoDecoder = new(Codec.FindDecoderByName("h264_cuvid"));
    videoDecoder.FillParameters(inVideoStream.Codecpar!);
    videoDecoder.Open();

    // dest file
    string destFile = Path.Combine(Path.GetDirectoryName(mp4Path)!, Path.GetFileNameWithoutExtension(mp4Path) + "_wechat.mp4");
    using FormatContext outFc = FormatContext.AllocOutput(fileName: destFile);

    // dest encoder and streams
    outFc.AudioCodec = Codec.CommonEncoders.AAC;
    MediaStream outAudioStream = outFc.NewStream(outFc.AudioCodec);
    using CodecContext audioEncoder = new(outFc.AudioCodec)
    {
        Channels = 1,
        SampleFormat = outFc.AudioCodec.Value.NegociateSampleFormat(AVSampleFormat.Fltp),
        SampleRate = outFc.AudioCodec.Value.NegociateSampleRates(48000),
        BitRate = 48000
    };
    audioEncoder.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(audioEncoder.Channels);
    audioEncoder.TimeBase = new AVRational(1, audioEncoder.SampleRate);
    audioEncoder.Open(outFc.AudioCodec);
    outAudioStream.Codecpar!.CopyFrom(audioEncoder);

    outFc.VideoCodec = Codec.FindEncoderByName("libx264");
    MediaStream outVideoStream = outFc.NewStream(outFc.VideoCodec);
    using VideoFilterContext vfilter = VideoFilterContext.Create(inVideoStream, "scale=1920:-1");
    using CodecContext videoEncoder = new(outFc.VideoCodec)
    {
        Flags = AV_CODEC_FLAG.GlobalHeader,
        ThreadCount = Environment.ProcessorCount,
        ThreadType = ffmpeg.FF_THREAD_FRAME,
        BitRate = 595_000
    };
    vfilter.ConfigureEncoder(videoEncoder);
    var dict = new MediaDictionary
    {
        //["qp"] = "30",
        ["tune"] = "zerolatency",
        ["preset"] = "veryfast"
    };
    videoEncoder.Open(outFc.VideoCodec, dict);
    //dict.Dump();
    outVideoStream.Codecpar!.CopyFrom(videoEncoder);
    outVideoStream.TimeBase = videoEncoder.TimeBase;

    // begin write
    using IOContext io = IOContext.OpenWrite(destFile);
    outFc.Pb = io;
    outFc.WriteHeader();

    MediaThreadQueue<Frame> decodingQueue = inFc
        .ReadPackets(inVideoStream.Index, inAudioStream.Index)
        .DecodeAllPackets(inFc, audioDecoder, videoDecoder)
        .ToThreadQueue(cancellationToken: QueryCancelToken, boundedCapacity: 64);

    MediaThreadQueue<Packet> encodingQueue = decodingQueue.GetConsumingEnumerable()
        .ApplyVideoFilters(vfilter)
        .ConvertAllFrames(audioEncoder, videoEncoder)
        .AudioFifo(audioEncoder)
        .EncodeAllFrames(outFc, audioEncoder, videoEncoder)
        .ToThreadQueue(cancellationToken: QueryCancelToken);

    CancellationTokenSource end = new();
    QueryCancelToken.Register(() => end.Cancel());
    Dictionary<int, PtsDts> ptsDts = new();
    Task.Run(async () =>
    {
        double totalDuration = Math.Max(inVideoStream.GetDurationInSeconds(), inAudioStream.GetDurationInSeconds());
        try
        {
            while (!end.IsCancellationRequested)
            {
                Log();
                await Task.Delay(1000, end.Token);
            }
        }
        finally
        {
            Log();
        }

        void Log() => Console.WriteLine($"{GetStatusText()}, dec/enc queue: {decodingQueue.Count}/{encodingQueue.Count}");
        string GetStatusText() => $"{(outVideoStream.TimeBase * ptsDts.GetValueOrDefault(outVideoStream.Index, PtsDts.Default).Dts).ToDouble():F2} of {totalDuration:F2}";
    });

    encodingQueue.GetConsumingEnumerable()
        .RecordPtsDts(ptsDts)
        .WriteAll(outFc);
    end.Cancel();
    outFc.WriteTrailer();
}
```
It runs like this (compressing a 500+ MB video down to 5 MB):
Worth mentioning here are `MediaThreadQueue<Frame>` and `MediaThreadQueue<Packet>`. Internally, both are built on C#'s `BlockingCollection` plus extra threads, which can improve throughput while keeping performance steady.
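As a rough illustration of the idea (not the actual Sdcb.FFmpeg source), a thread queue of this shape can be built on `BlockingCollection` in a few lines: a producer task drains the upstream `IEnumerable` into a bounded collection while the consumer enumerates `GetConsumingEnumerable`, so decoding and encoding run on different threads with built-in backpressure:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of a BlockingCollection-based media thread queue.
public class ThreadQueueSketch<T>
{
    private readonly BlockingCollection<T> _queue;

    public ThreadQueueSketch(IEnumerable<T> source, int boundedCapacity = 64, CancellationToken ct = default)
    {
        _queue = new BlockingCollection<T>(boundedCapacity);
        // Producer: drain the upstream pipeline on its own thread.
        Task.Run(() =>
        {
            try
            {
                foreach (T item in source)
                {
                    _queue.Add(item, ct); // blocks when full (backpressure)
                }
            }
            finally
            {
                _queue.CompleteAdding(); // lets the consumer finish cleanly
            }
        }, ct);
    }

    public int Count => _queue.Count;

    // Consumer side: blocks until items arrive, ends when the producer completes.
    public IEnumerable<T> GetConsumingEnumerable(CancellationToken ct = default)
        => _queue.GetConsumingEnumerable(ct);
}
```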
### Example 3: creating a GIF (sticker memes?)

Note: I built a demo website to show off this feature; click the "生成" (generate) button and you can get sticker memes like this one:

I uploaded the complete Visual Studio code example to GitHub; you can download it here: https://github.com/sdcb/ffmpeg-wjz-sorry-generator

It consists of these steps and key points:

- Decode the video
- Convert every frame to the BGRA pixel format
- Use Direct2D to read and draw the subtitles
- Feed every frame through a video filter that converts it to the PAL8 pixel format (a sketch of this step follows below)
- Encode the PAL8 frames as a GIF

Note that this demo uses Direct2D, built on the open-source project Vortice.Windows.
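Here is a hedged sketch of the decode/filter/encode backbone only (steps 2 and 3, the BGRA conversion and the Direct2D subtitle drawing, are omitted). It reuses only Sdcb.FFmpeg API shapes that appear in the other examples of this post; the parameter names and the `gifEncoder` are assumptions, not the sorry-generator's actual code:

```csharp
using Sdcb.FFmpeg.Codecs;
using Sdcb.FFmpeg.Filters;
using Sdcb.FFmpeg.Formats;
using Sdcb.FFmpeg.Toolboxs.Extensions;

static void DecodeFilterEncodeGif(
    FormatContext inFc, MediaStream inVideoStream,
    CodecContext videoDecoder, CodecContext gifEncoder, FormatContext outFc)
{
    // Standard single-pass gif palette chain: split the stream, build a
    // palette with palettegen, then remap frames to PAL8 with paletteuse.
    using VideoFilterContext pal8 = VideoFilterContext.Create(
        inVideoStream, "split [a][b]; [a] palettegen [p]; [b][p] paletteuse");

    inFc.ReadPackets(inVideoStream.Index)
        .DecodePackets(videoDecoder)              // step 1: decode the video
        .ApplyVideoFilters(pal8)                  // step 4: frames come out PAL8
        .EncodeAllFrames(outFc, null, gifEncoder) // step 5: encode frames as gif
        .WriteAll(outFc);
}
```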
### Example 4: real desktop streaming (remote desktop?)

This streams one computer's screen to another computer over the network in real time, at a fairly low network cost. Use cases include live video calls, screen casting, and remote desktop control.

The code comes in two parts: the desktop capture-encode-send side and the remote receive-decode-display side. On the wire, the sender first writes a 16-byte header (codec id, width, height, pixel format) and then a stream of length-prefixed encoded packets, as you can see in the `RdpCodecParameter` record below.

#### Desktop capture-encode-send side: full source

Required NuGet packages:

- Sdcb.FFmpeg 4.4.3
- Sdcb.FFmpeg.runtime.windows-x64 4.4.3
- Sdcb.ScreenCapture

Full source below (click to expand):
// This example was initially written based on Sdcb.FFmpeg 4.4.3 & Sdcb.ScreenCapture void Main() { StartService(QueryCancelToken); } void StartService(CancellationToken cancellationToken = default) { var tcpListener = new TcpListener(IPAddress.Any, 5555); cancellationToken.Register(() => tcpListener.Stop()); tcpListener.Start(); while (!cancellationToken.IsCancellationRequested) { TcpClient client = tcpListener.AcceptTcpClient(); Task.Run(() => ServeClient(client, cancellationToken)); } } void ServeClient(TcpClient tcpClient, CancellationToken cancellationToken = default) { try { using var _ = tcpClient; using NetworkStream stream = tcpClient.GetStream(); using BinaryWriter writer = new(stream); RectI screenSize = ScreenCapture.GetScreenSize(screenId: 0); RdpCodecParameter rcp = new(AVCodecID.H264, screenSize.Width, screenSize.Height, AVPixelFormat.Bgr0); using CodecContext cc = new(Codec.CommonEncoders.Libx264RGB) { Width = rcp.Width, Height = rcp.Height, PixelFormat = rcp.PixelFormat, TimeBase = new AVRational(1, 20), }; cc.Open(null, new MediaDictionary { ["crf"] = "30", ["tune"] = "zerolatency", ["preset"] = "veryfast" }); writer.Write(rcp.ToArray()); using Frame source = new(); foreach (Packet packet in ScreenCapture .CaptureScreenFrames(screenId: 0) .ToBgraFrame() .ConvertFrames(cc) .EncodeFrames(cc)) { if (cancellationToken.IsCancellationRequested) { break; } writer.Write(packet.Data.Length); writer.Write(packet.Data.AsSpan()); } } catch (IOException ex) { // Unable to write data to the transport connection: 远程主机强迫关闭了一个现有的连接。. // Unable to write data to the transport connection: 你的主机中的软件中止了一个已建立的连接。 ex.Dump(); } } public class Filo<T> : IDisposable { private T? Item { get; set; } private ManualResetEventSlim Notify { get; } = new ManualResetEventSlim(); public void Update(T item) { Item = item; Notify.Set(); } public IEnumerable<T> Consume(CancellationToken cancellationToken = default) { while (!cancellationToken.IsCancellationRequested) { Notify.Wait(cancellationToken); yield return Item!; } } public void Dispose() => Notify.Dispose(); } public static class BgraFrameExtensions { public static IEnumerable<Frame> ToBgraFrame(this IEnumerable<LockedBgraFrame> bgras) { using Frame frame = new Frame(); foreach (LockedBgraFrame bgra in bgras) { frame.Width = bgra.Width; frame.Height = bgra.Height; frame.Format = (int)AVPixelFormat.Bgra; frame.Data[0] = bgra.DataPointer; frame.Linesize[0] = bgra.RowPitch; yield return frame; } } } record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat) { public byte[] ToArray() { byte[] data = new byte[16]; Span<byte> span = data.AsSpan(); BinaryPrimitives.WriteInt32LittleEndian(span, (int)CodecId); BinaryPrimitives.WriteInt32LittleEndian(span[4..], Width); BinaryPrimitives.WriteInt32LittleEndian(span[8..], Height); BinaryPrimitives.WriteInt32LittleEndian(span[12..], (int)PixelFormat); return data; } }
Worth mentioning: the Sdcb.ScreenCapture NuGet package is also my work. It is built on DXGI, achieves zero-copy capture, and can record the screen at 60 fps with very low CPU usage. I'll leave a placeholder here to introduce that open-source project another time; its GitHub address is: https://github.com/sdcb/Sdcb.ScreenCapture
#### Remote receive-decode-display side: full source

Required NuGet packages:

- Sdcb.FFmpeg 4.4.3
- Sdcb.FFmpeg.runtime.windows-x64 4.4.3
- FlysEngine.Desktop

Click to expand:
// This example was initially written based on Sdcb.FFmpeg 4.4.3 & FlysEngine.Desktop #nullable enable ManagedBgraFrame? managedFrame = null; bool cancel = false; unsafe void Main() { using RenderWindow w = new(); w.FormClosed += delegate { cancel = true; }; Task decodingTask = Task.Run(() => DecodeThread(() => (3840, 2160))); w.Draw += (_, ctx) => { ctx.Clear(Colors.CornflowerBlue); if (managedFrame == null) return; ManagedBgraFrame frame = managedFrame.Value; fixed (byte* ptr = frame.Data) { //new System.Drawing.Bitmap(frame.Width, frame.Height, frame.RowPitch, System.Drawing.Imaging.PixelFormat.Format32bppPArgb, (IntPtr)ptr).DumpUnscaled(); BitmapProperties1 props = new(new PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied)); using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(frame.Width, frame.Height), (IntPtr)ptr, frame.RowPitch, props); ctx.UnitMode = UnitMode.Dips; ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.NearestNeighbor); } }; RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None)); } async Task DecodeThread(Func<(int width, int height)> sizeAccessor) { using TcpClient client = new TcpClient(); await client.ConnectAsync(IPAddress.Loopback, 5555); using NetworkStream stream = client.GetStream(); using BinaryReader reader = new(stream); RdpCodecParameter rcp = RdpCodecParameter.FromSpan(reader.ReadBytes(16)); using CodecContext cc = new(Codec.FindDecoderById(rcp.CodecId)) { Width = rcp.Width, Height = rcp.Height, PixelFormat = rcp.PixelFormat, }; cc.Open(null); foreach (var frame in reader .ReadPackets() .DecodePackets(cc) .ConvertVideoFrames(sizeAccessor, AVPixelFormat.Bgra) .ToManaged() ) { if (cancel) break; managedFrame = frame; } } public static class FramesExtensions { public static IEnumerable<ManagedBgraFrame> ToManaged(this IEnumerable<Frame> bgraFrames, bool unref = true) { foreach (Frame frame in bgraFrames) { int rowPitch = frame.Linesize[0]; int length = rowPitch * frame.Height; byte[] buffer = new byte[length]; Marshal.Copy(frame.Data._0, buffer, 0, length); ManagedBgraFrame managed = new(buffer, length, length / frame.Height); if (unref) frame.Unref(); yield return managed; } } } public record struct ManagedBgraFrame(byte[] Data, int Length, int RowPitch) { public int Width => RowPitch / BytePerPixel; public int Height => Length / RowPitch; public const int BytePerPixel = 4; } public static class ReadPacketExtensions { public static IEnumerable<Packet> ReadPackets(this BinaryReader reader) { using Packet packet = new(); while (true) { int packetSize = reader.ReadInt32(); if (packetSize == 0) yield break; byte[] data = reader.ReadBytes(packetSize); GCHandle dataHandle = GCHandle.Alloc(data, GCHandleType.Pinned); try { packet.Data = new DataPointer(dataHandle.AddrOfPinnedObject(), packetSize); yield return packet; } finally { dataHandle.Free(); } } } } record RdpCodecParameter(AVCodecID CodecId, int Width, int Height, AVPixelFormat PixelFormat) { public static RdpCodecParameter FromSpan(ReadOnlySpan<byte> data) { return new RdpCodecParameter( CodecId: (AVCodecID)BinaryPrimitives.ReadInt32LittleEndian(data), Width: BinaryPrimitives.ReadInt32LittleEndian(data[4..]), Height: BinaryPrimitives.ReadInt32LittleEndian(data[8..]), PixelFormat: (AVPixelFormat)BinaryPrimitives.ReadInt32LittleEndian(data[12..])); } }
With both sides running, it looks like this:

The latency is around 0.28 seconds for my 4K display, encoded with libx264 and transmitted as yuv420p. That should satisfy practical needs such as online meeting presentations, screen-cast live streaming, and remote control (at 1080p the latency should be even lower).

Note that this source code uses FlysEngine, my own open-source Direct2D wrapper engine. You don't need to care about its details (just install the NuGet package), but if you happen to be interested, here's another placeholder for a future introduction. For now it's enough to know that it is only a thin wrapper over D3D11, DXGI, Direct2D, WIC, and DirectWrite.
### Example 5: receiving and displaying RTSP camera video

This program depends on the following NuGet packages:

- FlysEngine.Desktop
- Sdcb.FFmpeg 4.4.3
- Sdcb.FFmpeg.runtime.windows-x64 4.4.3

Full code (click to expand):
```csharp
#nullable enable

FFmpegBmp? ffBmp = null;
FFmpegBmp? lastFFbmp = null;
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

CancellationTokenSource cts = new();
using RenderWindow w = new();
Task.Run(() => DecodeRTSP(Util.GetPassword("home-rtsp-ipc"), cts.Token));
w.Draw += (_, ctx) =>
{
    if (ffBmp == null) return;
    if (lastFFbmp == ffBmp) return;

    GCHandle handle = GCHandle.Alloc(ffBmp.Data, GCHandleType.Pinned);
    try
    {
        using ID2D1Bitmap bmp = ctx.CreateBitmap(new SizeI(ffBmp.Width, ffBmp.Height),
            handle.AddrOfPinnedObject(), ffBmp.RowPitch,
            new BitmapProperties(new Vortice.DCommon.PixelFormat(Format.B8G8R8A8_UNorm, Vortice.DCommon.AlphaMode.Premultiplied)));
        lastFFbmp = ffBmp;

        Size clientSize = ctx.Size;
        float top = (clientSize.Height - ffBmp.Height) / 2;
        ctx.Transform = Matrix3x2.CreateTranslation(0, top);
        ctx.DrawBitmap(bmp, 1.0f, InterpolationMode.Linear);
    }
    finally
    {
        handle.Free();
    }
};
w.FormClosing += delegate { cts.Cancel(); };
RenderLoop.Run(w, () => w.Render(1, Vortice.DXGI.PresentFlags.None));

void DecodeRTSP(string url, CancellationToken cancellationToken = default)
{
    using FormatContext fc = FormatContext.OpenInputUrl(url);
    fc.LoadStreamInfo();
    MediaStream videoStream = fc.GetVideoStream();

    using CodecContext videoDecoder = new CodecContext(Codec.FindDecoderByName("hevc_qsv"));
    videoDecoder.FillParameters(videoStream.Codecpar!);
    videoDecoder.Open();

    foreach (Frame frame in fc
        .ReadPackets(videoStream.Index)
        .DecodePackets(videoDecoder)
        .ConvertVideoFrames(() => new(w.ClientSize.Width, w.ClientSize.Width * videoDecoder.Height / videoDecoder.Width), AVPixelFormat.Bgr0))
    {
        if (cancellationToken.IsCancellationRequested) break;

        try
        {
            byte[] data = new byte[frame.Linesize[0] * frame.Height];
            Marshal.Copy(frame.Data._0, data, 0, data.Length);
            ffBmp = new FFmpegBmp(frame.Width, frame.Height, frame.Linesize[0], data);
        }
        finally
        {
            frame.Unref();
        }
    }
}

public record FFmpegBmp(int Width, int Height, int RowPitch, byte[] Data);
```
The camera at my family home in the countryside speaks RTSP; this is the code above running against it:
### Example 6: reading an RTSP stream and saving it to mp4/mov files

This example depends on the following NuGet packages:

- Sdcb.FFmpeg 4.4.3
- Sdcb.FFmpeg.runtime.windows-x64 4.4.3

Full code example (click to expand):
```csharp
// The example was initially written using Sdcb.FFmpeg 4.4.3
FFmpegLogger.LogWriter = (level, msg) => Console.Write(msg);

using FormatContext inFc = FormatContext.OpenInputUrl(Util.GetPassword("home-rtsp-ipc"));
inFc.LoadStreamInfo();
MediaStream inAudioStream = inFc.GetAudioStream();
MediaStream inVideoStream = inFc.GetVideoStream();

long gpts_v = 0, gpts_a = 0, gdts_v = 0, gdts_a = 0;
while (!QueryCancelToken.IsCancellationRequested)
{
    using FormatContext outFc = FormatContext.AllocOutput(formatName: "mov");
    string dir = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "rtsp", DateTime.Now.ToString("yyyy-MM-dd"));
    Directory.CreateDirectory(dir);
    using IOContext io = IOContext.OpenWrite(Path.Combine(dir, $"{DateTime.Now:HHmmss}.mov"));
    outFc.Pb = io;

    MediaStream videoStream = outFc.NewStream(Codec.FindEncoderById(inVideoStream.Codecpar!.CodecId));
    videoStream.Codecpar!.CopyFrom(inVideoStream.Codecpar);
    videoStream.TimeBase = inVideoStream.RFrameRate.Inverse();
    videoStream.SampleAspectRatio = inVideoStream.SampleAspectRatio;

    MediaStream audioStream = outFc.NewStream(Codec.FindEncoderById(inAudioStream.Codecpar!.CodecId));
    audioStream.Codecpar!.CopyFrom(inAudioStream.Codecpar);
    audioStream.TimeBase = inAudioStream.TimeBase;
    audioStream.Codecpar.ChannelLayout = (ulong)ffmpeg.av_get_default_channel_layout(inAudioStream.Codecpar.Channels);

    outFc.WriteHeader();
    FilterPackets(inFc.ReadPackets(inAudioStream.Index, inVideoStream.Index), videoFrameCount: 60 * 20)
        .WriteAll(outFc);
    outFc.WriteTrailer();

    IEnumerable<Packet> FilterPackets(IEnumerable<Packet> packets, int videoFrameCount)
    {
        long pts_v = gpts_v, pts_a = gpts_a, dts_v = gdts_v, dts_a = gdts_a;
        long[] buffer = new long[200];
        long ithreshold = -1;
        int videoFrame = 0;

        foreach (Packet pkt in packets)
        {
            pkt.StreamIndex = pkt.StreamIndex == inAudioStream.Index ? audioStream.Index : videoStream.Index;
            if (pkt.StreamIndex == inAudioStream.Index)
            {
                // audio
                (gpts_a, gdts_a, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_a, pkt.Dts - dts_a);
                pkt.RescaleTimestamp(inAudioStream.TimeBase, audioStream.TimeBase);
            }
            else
            {
                // video
                if (videoFrame < buffer.Length)
                {
                    buffer[videoFrame] = pkt.Data.Length;
                    ithreshold = -1;
                }
                else if (videoFrame == buffer.Length)
                {
                    ithreshold = buffer.Order().ToArray()[buffer.Length / 2] * 4;
                }

                if (videoFrame >= videoFrameCount && pkt.Data.Length > ithreshold)
                {
                    break;
                }
                (gpts_v, gdts_v, pkt.Pts, pkt.Dts) = (pkt.Pts, pkt.Dts, pkt.Pts - pts_v, pkt.Dts - dts_v);
                pkt.RescaleTimestamp(inVideoStream.TimeBase, videoStream.TimeBase);
                videoFrame++;
            }
            yield return pkt;
        }
    }
}
```
This program can run around the clock. While it runs, the complete video and audio recorded by the RTSP camera is saved into this folder on the desktop, roughly one file per 1.5 minutes (as shown). Each pass of the `while` loop writes one file: `FilterPackets` rebases the timestamps so every file starts near zero, and once the target frame count is reached it cuts at the next unusually large packet (likely a key frame, detected as more than four times the median size of the first 200 video packets).

With that, it might even stand a chance of replacing a dedicated video recorder~
## Summary and outlook

I believe there is a difference between getting something built and getting it built well. In the past, this kind of thing in C# was merely "usable", which is fundamentally different from what the geek players in the node.js or python worlds enjoy. With this open-source project I hope to push toward ".NET as a first-class citizen".
Maintaining open source is not easy. If you like this project, please give it a thumbs-up and a star: https://github.com/sdcb/Sdcb.FFmpeg
I'd also like to set a goal for myself: in the future I hope to wrap FlyCV, libyuv, x264, and libaom-av1, and maybe one day even get the chance to build a .NET version of FFmpeg.
If you like this, please follow my WeChat official account: 【DotNet骚操作】