Audio and Video Fundamentals

Reposted from: https://ffmpeg.xianwaizhiyin.net/base-knowledge/base-knowledge.html

Color spaces

RGB

YUV

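Encoding (compression) formats

JPEG: a standard for compressing single still images. Specification: ISO/IEC 10918-1.

H.262: a video coding standard. Specification: ISO/IEC 13818-2.

H.263: a video coding standard.

H.264: a video coding standard that, as of 2022, is still very widely deployed.

VP9: a video coding format developed by Google.

AVS: China's video compression standard.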

....

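Container (mux) formats

A container format combines several streams, for example video and audio, into a single file. Combining the streams is called multiplexing, or muxing.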

AVI

FLV

MP4

MKV

MOV

WEBM

....

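GStreamer

GStreamer processes media by passing data through a pipeline of linked elements.

General-purpose commands

Check the camera's current output format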

v4l2-ctl --all -d /dev/video0 | grep "Pixel Format"

General-purpose playback

# camera outputs MJPEG
gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg ! decodebin ! videoconvert ! autovideosink

# camera outputs raw YUYV (the default)
gst-launch-1.0 v4l2src device=/dev/video0 ! decodebin ! videoconvert ! autovideosink
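The choice between the two preview commands can be scripted from the "Pixel Format" line. The following is a minimal sketch: the function name `preview_pipeline` is hypothetical, and the quoted `'MJPG'` / `'YUYV'` spelling of the v4l2-ctl line is an assumption about its output, so adjust the patterns if your version prints something different.

```shell
#!/bin/sh
# preview_pipeline DEVICE PIXEL_FORMAT_LINE
# Prints the gst-launch-1.0 pipeline description matching the camera format.
# Assumption: v4l2-ctl prints the FourCC in single quotes, e.g. 'MJPG'.
preview_pipeline() {
    case "$2" in
        *"'MJPG'"*)
            # Compressed MJPEG: state the caps, then let decodebin decode it
            printf '%s\n' "v4l2src device=$1 ! image/jpeg ! decodebin ! videoconvert ! autovideosink" ;;
        *"'YUYV'"*)
            # Raw YUYV: no caps filter needed
            printf '%s\n' "v4l2src device=$1 ! decodebin ! videoconvert ! autovideosink" ;;
        *)
            printf '%s\n' "unsupported pixel format: $2" >&2
            return 1 ;;
    esac
}

# Example with a literal sample line instead of a live camera
preview_pipeline /dev/video0 "Pixel Format      : 'MJPG' (Motion-JPEG)"
```

On a real device, the second argument would come from the v4l2-ctl command shown earlier, and the printed description would be passed to gst-launch-1.0.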

Recording video

  1. Capture MJPEG frames from a USB camera and save them as an AVI file

gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=480,framerate=30/1 ! avimux ! filesink location = ~/Desktop/test.avi

  2. Capture MJPEG frames from a USB camera and save them as a sequence of JPEG images

gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=480,framerate=30/1 ! multifilesink location = ~/Desktop/img/video_%04d.jpg

  3. Capture MJPEG frames from a USB camera, decode them to raw video, encode as H.264, and save as an MPEG-TS file

# mpegtsmux writes an MPEG-TS stream, so the file is named .ts rather than .mp4
gst-launch-1.0 v4l2src device=/dev/video0 ! image/jpeg,width=1280,height=480,framerate=30/1 ! mppjpegdec ! mpph264enc ! mpegtsmux ! filesink location = ~/Desktop/test.ts
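The three recording commands above share the same source and caps filter and differ only in what follows. As a rough sketch of that structure, the hypothetical helper `record_pipeline` below builds the pipeline description from a target name; the target names `avi`, `jpg` and `ts` are made up for this illustration.

```shell
#!/bin/sh
# record_pipeline TARGET WIDTH HEIGHT FPS OUTPUT
# Prints a gst-launch-1.0 pipeline description; it does not run anything.
record_pipeline() {
    # Common head: MJPEG frames from the camera at the requested size and rate
    src="v4l2src device=/dev/video0 ! image/jpeg,width=$2,height=$3,framerate=$4/1"
    case "$1" in
        avi) printf '%s\n' "$src ! avimux ! filesink location=$5" ;;
        jpg) printf '%s\n' "$src ! multifilesink location=$5" ;;
        # H.264 in MPEG-TS, using the Rockchip decoder and encoder from the third command
        ts)  printf '%s\n' "$src ! mppjpegdec ! mpph264enc ! mpegtsmux ! filesink location=$5" ;;
        *)   printf '%s\n' "unknown target: $1" >&2
             return 1 ;;
    esac
}

record_pipeline avi 1280 480 30 /tmp/test.avi
```

The output paths here are placeholders; any writable location works.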

Playing video

  1. Determine the file's container and codec

sudo apt install mediainfo
mediainfo ./test.mp4

  2. Pick the matching demuxer

MPEG-TS tsdemux

MPEG-4 qtdemux

AVI avidemux

  3. Pick the matching parser

JPEG jpegparse

AVC/H264 h264parse

  4. Pick the matching decoder

JPEG jpegdec/mppjpegdec

AVC/H264 avdec_h264/mppvideodec

  5. Pick a sink

fakesink autovideosink filesink...

Examples

  1. MPEG-TS container, H.264 video

gst-launch-1.0 filesrc location=~/Desktop/test.ts ! tsdemux ! h264parse ! mppvideodec ! autovideosink

  2. MP4 container, H.264 video

gst-launch-1.0 filesrc location=~/Desktop/test.mp4 ! qtdemux ! h264parse ! mppvideodec ! autovideosink

  3. AVI container, MJPEG video

gst-launch-1.0 filesrc location=~/Desktop/test.avi ! avidemux ! queue ! mppjpegdec ! autovideosink
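The demuxer, parser, decoder and sink steps can be sketched as a small lookup. This is only an illustration: the function name `play_pipeline` and the short keys (`ts`, `mp4`, `avi`, `h264`, `mjpeg`) are hypothetical, and the decoders are the Rockchip ones used in the examples. Unlike the AVI example, the sketch always inserts the parser from step 3.

```shell
#!/bin/sh
# play_pipeline CONTAINER CODEC FILE
# Prints a gst-launch-1.0 pipeline description: demuxer, parser, decoder, sink.
play_pipeline() {
    # Step 2: container -> demuxer
    case "$1" in
        ts)  demux=tsdemux ;;
        mp4) demux=qtdemux ;;
        avi) demux=avidemux ;;
        *)   printf '%s\n' "unknown container: $1" >&2; return 1 ;;
    esac
    # Steps 3 and 4: codec -> parser and decoder
    case "$2" in
        h264)  decode="h264parse ! mppvideodec" ;;
        mjpeg) decode="jpegparse ! mppjpegdec" ;;
        *)     printf '%s\n' "unknown codec: $2" >&2; return 1 ;;
    esac
    # Step 5: display with autovideosink
    printf '%s\n' "filesrc location=$3 ! $demux ! $decode ! autovideosink"
}

play_pipeline mp4 h264 /tmp/test.mp4
```

On a machine without the Rockchip plugins, a software decoder would have to be substituted for `mppvideodec` and `mppjpegdec`.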

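Inspecting plugins with gst-inspect-1.0

sink: the pad on which an element receives data (its input)

src: the pad on which an element produces data (its output)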
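To see at a glance which media types an element accepts or produces, the pad-template section of the gst-inspect-1.0 output can be filtered. The sketch below uses a hypothetical helper, `pad_caps`, and assumes the layout shown in the dumps later in this document: a `SRC template:` or `SINK template:` line, followed by space-indented media-type lines such as `video/x-raw`.

```shell
#!/bin/sh
# pad_caps DIRECTION   (DIRECTION is SRC or SINK)
# Reads gst-inspect-1.0 text on stdin and prints the media types listed
# under the pad templates of the requested direction, one per line.
pad_caps() {
    awk -v want="$1" '
        # A template header switches collection on or off
        /^ *(SRC|SINK) template:/ { active = ($1 == want); next }
        # A media-type line is a lone "type/subtype" token; property lines
        # such as "framerate: [ 0/1, ... ]" do not match this pattern
        active && /^ *[a-z]+\/[A-Za-z0-9._()+:-]+ *$/ { print $1 }
    '
}

# Typical use (needs GStreamer installed):
#   gst-inspect-1.0 jpegdec | pad_caps SINK
```

For the jpegdec dump shown below, `pad_caps SINK` would list `image/jpeg` and `pad_caps SRC` would list `video/x-raw`.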

Summary of commonly used plugins

Media types:

image/

jpeg

png

x-jpc

video/

x-raw: raw, uncompressed video in which every pixel is stored

x-divx

x-msmpeg

x-h263

x-h264

x-dv

x-wmv

x-vp8

x-huffyuv

mpeg

Decoders

jpegdec

sink (input): image/jpeg

src (output): video/x-raw

SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: { (string)I420, (string)RGB, (string)BGR, (string)RGBx, (string)xRGB, (string)BGRx, (string)xBGR, (string)GRAY8 }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      image/jpeg

Encoders

jpegenc

sink (input): video/x-raw

src (output): image/jpeg

SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: { (string)I420, (string)YV12, (string)YUY2, (string)UYVY, (string)Y41B, (string)Y42B, (string)YVYU, (string)Y444, (string)NV21, (string)NV12, (string)RGB, (string)BGR, (string)RGBx, (string)xRGB, (string)BGRx, (string)xBGR, (string)GRAY8 }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SRC template: 'src'
    Availability: Always
    Capabilities:
      image/jpeg
                  width: [ 16, 65535 ]
                 height: [ 16, 65535 ]
              framerate: [ 0/1, 2147483647/1 ]
             sof-marker: { (int)0, (int)1, (int)2, (int)4, (int)9 }

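Parsers

A parser is an element that handles one specific media or codec format. It analyzes the incoming stream so that downstream elements can decode or process it correctly.

In other words, it takes a stream of a general type and works out the concrete format it contains.

jpegparse

Parse JPEG images into single-frame buffers

Splits a stream of JPEG data into buffers that each hold exactly one frame

sink (input): image/jpeg

src (output): image/jpeg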

SRC template: 'src'
    Availability: Always
    Capabilities:
      image/jpeg
                 format: { (string)I420, (string)Y41B, (string)UYVY, (string)YV12 }
                  width: [ 0, 2147483647 ]
                 height: [ 0, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
                 parsed: true
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      image/jpeg

h264parse

sink (input): video/x-h264

src (output): video/x-h264

SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-h264
                 parsed: true
          stream-format: { (string)avc, (string)avc3, (string)byte-stream }
              alignment: { (string)au, (string)nal }
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-h264

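Muxers and their matching demuxers

A muxer plugin combines media streams of different types into a single container of a given format (MP4, MKV, AVI and so on).

It is usually the second-to-last element in a pipeline, placed right before the sink that writes the output.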
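Each muxer covered in the subsections that follow has a fixed output media type and a matching demuxer for reading the file back. As a small sketch, the hypothetical helper `muxer_info` records those pairs as they are listed in this document.

```shell
#!/bin/sh
# muxer_info MUXER
# Prints "<output media type> <matching demuxer>" for the muxers in this document.
muxer_info() {
    case "$1" in
        avimux)    printf '%s\n' "video/x-msvideo avidemux" ;;
        mp4mux)    printf '%s\n' "video/quicktime qtdemux" ;;
        mpegtsmux) printf '%s\n' "video/mpegts tsdemux" ;;
        *)         printf '%s\n' "unknown muxer: $1" >&2
                   return 1 ;;
    esac
}

muxer_info mp4mux
```

A script that records with one of these muxers could call `muxer_info` to pick the demuxer for the playback pipeline, instead of hard-coding both ends.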

avimux

sink (input): image, video, audio

src (output): video/x-msvideo

Matching demuxer: avidemux

SINK template: 'video_%u'
    Availability: On request
    Capabilities:
      video/x-raw
                 format: { (string)YUY2, (string)I420, (string)BGR, (string)BGRx, (string)BGRA, (string)GRAY8, (string)UYVY, (string)v210 }
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
      image/jpeg
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-divx
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
            divxversion: [ 3, 5 ]
      video/x-msmpeg
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
          msmpegversion: [ 41, 43 ]
      video/mpeg
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
            mpegversion: { (int)1, (int)2, (int)4 }
           systemstream: false
      video/x-h263
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-h264
          stream-format: byte-stream
              alignment: au
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-dv
                  width: 720
                 height: { (int)576, (int)480 }
              framerate: [ 0/1, 2147483647/1 ]
           systemstream: false
      video/x-huffyuv
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-wmv
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
             wmvversion: [ 1, 3 ]
      image/x-jpc
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-vp8
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      image/png
                  width: [ 16, 4096 ]
                 height: [ 16, 4096 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'audio_%u'
    Availability: On request
    Capabilities:
      audio/x-raw
                 format: { (string)U8, (string)S16LE, (string)S24LE, (string)S32LE }
                   rate: [ 1000, 96000 ]
               channels: [ 1, 65535 ]
      audio/mpeg
            mpegversion: 1
                  layer: [ 1, 3 ]
                   rate: [ 1000, 96000 ]
               channels: [ 1, 2 ]
      audio/mpeg
            mpegversion: 4
          stream-format: raw
                   rate: [ 1000, 96000 ]
               channels: [ 1, 2 ]
      audio/x-ac3
                   rate: [ 1000, 96000 ]
               channels: [ 1, 6 ]
      audio/x-alaw
                   rate: [ 1000, 48000 ]
               channels: [ 1, 2 ]
      audio/x-mulaw
                   rate: [ 1000, 48000 ]
               channels: [ 1, 2 ]
      audio/x-wma
                   rate: [ 1000, 96000 ]
               channels: [ 1, 2 ]
             wmaversion: [ 1, 2 ]
  
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-msvideo

mp4mux

sink (input): video, audio, subtitle

src (output): video/quicktime

Matching demuxer: qtdemux

SINK template: 'subtitle_%u'
    Availability: On request
    Capabilities:
      text/x-raw
                 format: utf8
    Type: GstQTMuxPad
    Pad Properties:
      emit-signals        : Send signals to signal data consumption
                            flags: readable, writable
                            Boolean. Default: false
      trak-timescale      : Timescale to use for this pad's trak (units per second, 0 is automatic)
                            flags: readable, writable
                            Unsigned Integer. Range: 0 - 4294967295 Default: 0 
  
  SINK template: 'video_%u'
    Availability: On request
    Capabilities:
      video/mpeg
            mpegversion: 4
           systemstream: false
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
      video/x-divx
            divxversion: 5
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
      video/x-h264
          stream-format: avc
              alignment: au
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
      video/x-h265
          stream-format: { (string)hvc1, (string)hev1 }
              alignment: au
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
      video/x-mp4-part
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
      video/x-av1
                  width: [ 16, 2147483647 ]
                 height: [ 16, 2147483647 ]
    Type: GstQTMuxPad
    Pad Properties:
      emit-signals        : Send signals to signal data consumption
                            flags: readable, writable
                            Boolean. Default: false
      trak-timescale      : Timescale to use for this pad's trak (units per second, 0 is automatic)
                            flags: readable, writable
                            Unsigned Integer. Range: 0 - 4294967295 Default: 0 
  
  SINK template: 'audio_%u'
    Availability: On request
    Capabilities:
      audio/mpeg
            mpegversion: 1
                  layer: [ 1, 3 ]
               channels: [ 1, 2 ]
                   rate: [ 1, 2147483647 ]
      audio/mpeg
            mpegversion: 4
          stream-format: raw
               channels: [ 1, 8 ]
                   rate: [ 1, 2147483647 ]
      audio/x-ac3
               channels: [ 1, 6 ]
                   rate: [ 1, 2147483647 ]
      audio/x-alac
               channels: [ 1, 2 ]
                   rate: [ 1, 2147483647 ]
      audio/x-opus
        channel-mapping-family: [ 0, 255 ]
               channels: [ 1, 8 ]
                   rate: [ 1, 2147483647 ]
    Type: GstQTMuxPad
    Pad Properties:
      emit-signals        : Send signals to signal data consumption
                            flags: readable, writable
                            Boolean. Default: false
      trak-timescale      : Timescale to use for this pad's trak (units per second, 0 is automatic)
                            flags: readable, writable
                            Unsigned Integer. Range: 0 - 4294967295 Default: 0 
  
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/quicktime
                variant: iso
    Type: GstAggregatorPad
    Pad Properties:
      emit-signals        : Send signals to signal data consumption
                            flags: readable, writable
                            Boolean. Default: false

mpegtsmux

sink (input): video, audio

src (output): video/mpegts

Matching demuxer: tsdemux

SRC template: 'src'
    Availability: Always
    Capabilities:
      video/mpegts
           systemstream: true
             packetsize: { (int)188, (int)192 }
    Type: GstAggregatorPad
    Pad Properties:
      emit-signals        : Send signals to signal data consumption
                            flags: readable, writable
                            Boolean. Default: false
  
  SINK template: 'sink_%d'
    Availability: On request
    Capabilities:
      video/mpeg
                 parsed: true
            mpegversion: { (int)1, (int)2, (int)4 }
           systemstream: false
      video/x-dirac
      image/x-jpc
      video/x-h264
          stream-format: byte-stream
              alignment: { (string)au, (string)nal }
      video/x-h265
          stream-format: byte-stream
              alignment: { (string)au, (string)nal }
      audio/mpeg
                 parsed: true
            mpegversion: 1
      audio/mpeg
                 framed: true
            mpegversion: { (int)2, (int)4 }
          stream-format: { (string)adts, (string)raw }
      audio/x-lpcm
                  width: { (int)16, (int)20, (int)24 }
                   rate: { (int)48000, (int)96000 }

Display sinks

xvimagesink

sink (input): video/x-raw

SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw
              framerate: [ 0/1, 2147483647/1 ]
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]

Hardware acceleration on Rockchip (RK) SoCs

mppjpegdec

Decodes JPEG into raw video; typically used ahead of image processing

sink (input): image/jpeg

src (output): video/x-raw, video/x-raw(memory:DMABuf)

SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx, (string)BGR16, (string)RGB16, (string)ABGR, (string)ARGB, (string)BGRA, (string)RGBA, (string)xBGR, (string)xRGB, (string)BGRx, (string)RGBx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:DMABuf)
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx, (string)BGR16, (string)RGB16, (string)ABGR, (string)ARGB, (string)BGRA, (string)RGBA, (string)xBGR, (string)xRGB, (string)BGRx, (string)RGBx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      image/jpeg
                 parsed: true

mppvideodec

sink (input): video/x-h264, video/x-h265, video/mpeg, video/x-vp8, video/x-vp9

src (output): video/x-raw, video/x-raw(memory:DMABuf)

SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-h264
          stream-format: { (string)avc, (string)avc3, (string)byte-stream }
              alignment: { (string)au }
                 parsed: true
      video/x-h265
          stream-format: { (string)hvc1, (string)hev1, (string)byte-stream }
              alignment: { (string)au }
                 parsed: true
      video/mpeg
            mpegversion: { (int)1, (int)2, (int)4 }
                 parsed: true
           systemstream: false
      video/x-vp8
      video/x-vp9
  
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:DMABuf)
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:DMABuf)
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
               arm-afbc: 1
      video/x-raw(memory:DMABuf)
                 format: { (string)NV12, (string)NV16, (string)NV12_10LE40, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
               arm-afbc: 1

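mpph264enc

sink (input): video/x-raw

src (output): video/x-h264, stream-format: byte-stream

P.S. The byte-stream output cannot be written into every container as-is. According to the caps listed above, mp4mux only accepts stream-format avc, so put h264parse between the encoder and mp4mux to convert it; avimux and mpegtsmux accept byte-stream directly.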
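Whether a conversion step is needed after mpph264enc depends on the muxer's sink caps listed earlier: mp4mux accepts only stream-format avc, while avimux and mpegtsmux list byte-stream. The sketch below encodes that rule in a hypothetical helper, `h264_mux_chain`; it reflects the caps dumps in this document, not every GStreamer version.

```shell
#!/bin/sh
# h264_mux_chain MUXER
# Prints the elements to place between mpph264enc (byte-stream output)
# and the file sink, based on the sink caps shown in this document.
h264_mux_chain() {
    case "$1" in
        mp4mux)
            # mp4mux wants stream-format=avc; h264parse converts byte-stream to avc
            printf '%s\n' "h264parse ! mp4mux" ;;
        mpegtsmux|avimux)
            # These list stream-format=byte-stream on their sink pads
            printf '%s\n' "$1" ;;
        *)
            printf '%s\n' "unknown muxer: $1" >&2
            return 1 ;;
    esac
}

h264_mux_chain mp4mux
```

When recording to MP4 from a live source, gst-launch-1.0 is normally run with `-e` so that an end-of-stream is sent on Ctrl-C and mp4mux can finalize the file.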

SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-h264
                  width: [ 96, 2147483647 ]
                 height: [ 64, 2147483647 ]
          stream-format: { (string)byte-stream }
              alignment: { (string)au }
                profile: { (string)baseline, (string)main, (string)high }
  
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-raw
                 format: { (string)NV12, (string)I420, (string)YUY2, (string)UYVY, (string)BGR16, (string)RGB16, (string)ABGR, (string)ARGB, (string)BGRA, (string)RGBA, (string)xBGR, (string)xRGB, (string)BGRx, (string)RGBx, (string)NV12, (string)NV21, (string)I420, (string)YV12, (string)NV16, (string)NV61, (string)BGR16, (string)RGB, (string)BGR, (string)RGBA, (string)BGRA, (string)RGBx, (string)BGRx }
                  width: [ 96, 2147483647 ]
                 height: [ 64, 2147483647 ]
