VCMDecodingState

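VCMDecodingState is used to decide whether a NALU can be decoded continuously; the criteria differ from codec to codec. It supports three codecs: VP8, VP9 and H264. Here are some of the member variables it defines: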

uint16_t sequence_num_;
uint32_t time_stamp_;
int picture_id_;
int temporal_id_;
int tl0_pic_id_;
bool full_sync_;  // Sync flag when temporal layers are used.

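picture_id, temporal_id and tl0_pic_id are carried in VP8 and VP9 streams. They describe the relationship between NALUs and whether they can be decoded continuously. H264 does not carry this information; how it is handled can be seen in the member function ContinuousFrame. This article only looks at the H264 path.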

Member function ContinuousFrame

bool VCMDecodingState::ContinuousFrame(const VCMFrameBuffer* frame) const {
  // Check continuity based on the following hierarchy:
  // - Temporal layers (stop here if out of sync).
  // - Picture Id when available.
  // - Sequence numbers.
  // Return true when in initial state.
  // Note that when a method is not applicable it will return false.
  assert(frame != NULL);
  // A key frame is always considered continuous as it doesn't refer to any
  // frames and therefore won't introduce any errors even if prior frames are
  // missing.
  if (frame->FrameType() == VideoFrameType::kVideoFrameKey &&
      HaveSpsAndPps(frame->GetNaluInfos())) {
    return true;
  }
  // When in the initial state we always require a key frame to start decoding.
  if (in_initial_state_)
    return false;
  if (ContinuousLayer(frame->TemporalId(), frame->Tl0PicId()))
    return true;
  // tl0picId is either not used, or should remain unchanged.
  if (frame->Tl0PicId() != tl0_pic_id_)
    return false;
  // Base layers are not continuous or temporal layers are inactive.
  // In the presence of temporal layers, check for Picture ID/sequence number
  // continuity if sync can be restored by this frame.
  if (!full_sync_ && !frame->LayerSync())
    return false;
  if (UsingPictureId(frame)) {
    if (UsingFlexibleMode(frame)) {
      return ContinuousFrameRefs(frame);
    } else {
      return ContinuousPictureId(frame->PictureId());
    }
  } else {
    return ContinuousSeqNum(static_cast<uint16_t>(frame->GetLowSeqNum())) &&
           HaveSpsAndPps(frame->GetNaluInfos());
  }
}

For an H264 NALU, the picture id is kNoPictureId, Tl0PicId is kNoTl0PicIdx and TemporalId is kNoTemporalIdx, so the checks on picture id and temporal id can all be ignored. The logic that actually runs for H264 is this statement:

return ContinuousSeqNum(static_cast<uint16_t>(frame->GetLowSeqNum())) &&
           HaveSpsAndPps(frame->GetNaluInfos());

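In other words, decoding continuity between frames is judged by the seqnum and by whether the SPS and PPS are available.
If two NALUs are consecutive, the lowest seqnum of the later NALU equals the highest seqnum of the earlier NALU plus 1. The member function ContinuousSeqNum implements exactly this check.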
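
The "highest seqnum plus 1" rule has one subtlety: RTP sequence numbers are 16 bits wide and wrap from 65535 back to 0. The following is a minimal sketch of the comparison, not the WebRTC source itself; SeqState, IsNextSeqNum and SeqNumExamples are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the decoding state: it only remembers the
// highest RTP sequence number of the last decoded frame.
struct SeqState {
  uint16_t last_high_seq_num;
};

// Returns true when `low_seq_num` (the lowest sequence number of the
// candidate frame) directly follows the last decoded frame. The cast keeps
// the addition in 16 bits, so 65535 + 1 wraps around to 0.
bool IsNextSeqNum(const SeqState& state, uint16_t low_seq_num) {
  return low_seq_num == static_cast<uint16_t>(state.last_high_seq_num + 1);
}

// Small usage example covering the normal case, a gap, and the wraparound.
void SeqNumExamples() {
  assert(IsNextSeqNum(SeqState{100}, 101));   // directly follows
  assert(!IsNextSeqNum(SeqState{100}, 102));  // packet 101 is missing
  assert(IsNextSeqNum(SeqState{65535}, 0));   // wraparound
}
```

Without the cast, 65535 + 1 would be evaluated as the int 65536 and never compare equal to a 16-bit sequence number, so a stream would look discontinuous at every wraparound.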

Member function HaveSpsAndPps

It does two things:

  1. Checks whether the NALUs belong to the same GOP
  2. Checks whether the GOP has an SPS and a PPS
bool VCMDecodingState::HaveSpsAndPps(const std::vector<NaluInfo>& nalus) const {
  std::set<int> new_sps;
  std::map<int, int> new_pps;
  for (const NaluInfo& nalu : nalus) {
    // Check if this nalu actually contains sps/pps information or dependencies.
    if (nalu.sps_id == -1 && nalu.pps_id == -1)
      continue;
    switch (nalu.type) {
      case H264::NaluType::kPps:
        if (nalu.pps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received pps without pps id.";
        } else if (nalu.sps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received pps without sps id.";
        } else {
          new_pps[nalu.pps_id] = nalu.sps_id;
        }
        break;
      case H264::NaluType::kSps:
        if (nalu.sps_id < 0) {
          RTC_LOG(LS_WARNING) << "Received sps without sps id.";
        } else {
          new_sps.insert(nalu.sps_id);
        }
        break;
      default: {
        int needed_sps = -1;
        auto pps_it = new_pps.find(nalu.pps_id);
        if (pps_it != new_pps.end()) {
          needed_sps = pps_it->second;
        } else {
          auto pps_it2 = received_pps_.find(nalu.pps_id);
          if (pps_it2 == received_pps_.end()) {
            return false;
          }
          needed_sps = pps_it2->second;
        }
        if (new_sps.find(needed_sps) == new_sps.end() &&
            received_sps_.find(needed_sps) == received_sps_.end()) {
          return false;
        }
        break;
      }
    }
  }
  return true;
}

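Whether NALUs belong to the same GOP is judged from sps_id and pps_id:

  1. pps_id is pic_parameter_set_id, the id of the current PPS. A PPS is referenced by slices in the bitstream; a slice references a PPS by storing that PPS's id in its slice header.

  2. sps_id is seq_parameter_set_id, the id of the current SPS. It is referenced by the PPS, which carries the id of the SPS it refers to.

So for the NALUs within one GOP, the pps id in every slice should be the same, and the sps id in the PPS should match the id in the SPS. If two NALUs have consecutive seqnums, belong to the same GOP, and the SPS and PPS are present, the frames are considered continuously decodable.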
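
The reference chain described above (slice to PPS, PPS to SPS) can be sketched as a two-step lookup. This is a simplified illustration and not the WebRTC implementation: ParameterSets, SliceCanResolveParameterSets and ParameterSetExamples are hypothetical names, and the sketch looks at one slice at a time.

```cpp
#include <cassert>
#include <map>
#include <set>

// Hypothetical, simplified record of the parameter sets received so far.
struct ParameterSets {
  std::set<int> sps_ids;          // seq_parameter_set_id of every SPS seen
  std::map<int, int> pps_to_sps;  // pic_parameter_set_id -> referenced SPS id
};

// A slice names a PPS in its header, and that PPS names an SPS. The slice
// can only be decoded if both links of the chain resolve.
bool SliceCanResolveParameterSets(const ParameterSets& sets,
                                  int slice_pps_id) {
  auto pps_it = sets.pps_to_sps.find(slice_pps_id);
  if (pps_it == sets.pps_to_sps.end())
    return false;  // The PPS the slice refers to was never received.
  return sets.sps_ids.count(pps_it->second) > 0;
}

void ParameterSetExamples() {
  ParameterSets sets;
  sets.sps_ids.insert(0);
  sets.pps_to_sps[0] = 0;  // PPS 0 refers to SPS 0, which was received.
  sets.pps_to_sps[1] = 3;  // PPS 1 refers to SPS 3, which is missing.
  assert(SliceCanResolveParameterSets(sets, 0));
  assert(!SliceCanResolveParameterSets(sets, 1));  // SPS missing
  assert(!SliceCanResolveParameterSets(sets, 2));  // PPS missing
}
```

HaveSpsAndPps follows the same idea, but it consults both the parameter sets carried in the current frame (new_sps, new_pps) and the ones recorded earlier (received_sps_, received_pps_).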

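How VCMJitterBuffer handles continuous decodability

Now that we know the criteria for deciding whether H264 NALUs can be decoded continuously, let's go back to the InsertPacket method of VCMJitterBuffer and look at its logic for this. Three member functions are involved: FindAndInsertContinuousFramesWithState, FindAndInsertContinuousFrames and IsContinuous.

  • FindAndInsertContinuousFramesWithState: using the information about the most recent decodable NALU (recorded in a VCMDecodingState), it searches the incomplete frame list for NALUs in the same GOP, removes them from the incomplete frame list and inserts them into the decodable frame list.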
void VCMJitterBuffer::FindAndInsertContinuousFramesWithState(
    const VCMDecodingState& original_decoded_state) 
{  // Find NALUs within the same GOP.

  // Copy original_decoded_state so we can move the state forward with each
  // decodable frame we find.
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(original_decoded_state);

  // When temporal layers are available, we search for a complete or decodable
  // frame until we hit one of the following:
  // 1. Continuous base or sync layer.
  // 2. The end of the list was reached.
  // For H264 the temporal-layer handling can be ignored.
  for (FrameList::iterator it = incomplete_frames_.begin();it != incomplete_frames_.end();)
  {
    VCMFrameBuffer* frame = it->second;
    if (IsNewerTimestamp(original_decoded_state.time_stamp(),frame->Timestamp())) 
    {
      ++it;
      continue;
    }
    
    if (IsContinuousInState(*frame, decoding_state)) 
    {
      decodable_frames_.InsertFrame(frame);
      incomplete_frames_.erase(it++);
      decoding_state.SetState(frame);
    } else if (frame->TemporalId() <= 0) {
      break;
    } else {
      ++it;
    }
  }
}
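
The key idea in the loop above is that the state moves forward each time a frame is accepted, so one pass can pull a whole run of frames out of the incomplete list. Below is a reduced sketch of that pattern under loud assumptions: frames are keyed by a plain timestamp with no wraparound handling, continuity is judged by seqnum only (no SPS/PPS or temporal-layer checks), and PendingFrame, PendingFrames, MoveContinuousFrames and MoveExamples are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a buffered frame: only its seqnum range.
struct PendingFrame {
  uint16_t low_seq_num;
  uint16_t high_seq_num;
};
// Keyed by timestamp so iteration goes from oldest to newest.
using PendingFrames = std::map<uint32_t, PendingFrame>;

// Moves every frame that continues from `last_high_seq_num` out of
// `incomplete` and into `decodable`, advancing the state after each move.
// Frames that do not continue are skipped. Returns the number moved.
int MoveContinuousFrames(uint16_t last_high_seq_num,
                         PendingFrames& incomplete,
                         PendingFrames& decodable) {
  int moved = 0;
  for (auto it = incomplete.begin(); it != incomplete.end();) {
    const PendingFrame frame = it->second;  // copy before erasing
    if (frame.low_seq_num == static_cast<uint16_t>(last_high_seq_num + 1)) {
      decodable.insert(*it);
      it = incomplete.erase(it);  // erase returns the next valid iterator
      last_high_seq_num = frame.high_seq_num;
      ++moved;
    } else {
      ++it;
    }
  }
  return moved;
}

void MoveExamples() {
  PendingFrames incomplete, decodable;
  incomplete[1000] = {11, 12};
  incomplete[2000] = {13, 15};
  incomplete[3000] = {17, 18};  // seqnum 16 is missing before this frame
  assert(MoveContinuousFrames(10, incomplete, decodable) == 2);
  assert(decodable.count(1000) == 1 && decodable.count(2000) == 1);
  assert(incomplete.size() == 1 && incomplete.count(3000) == 1);
}
```

Note how the erase is written: the original code uses erase(it++), while the sketch uses the iterator returned by erase. Both avoid touching an invalidated iterator.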

  • FindAndInsertContinuousFrames: starting from a given NALU, it searches the incomplete frame list for NALUs in the same GOP.
void VCMJitterBuffer::FindAndInsertContinuousFrames(
    const VCMFrameBuffer& new_frame) {
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(last_decoded_state_);
  decoding_state.SetState(&new_frame);
  FindAndInsertContinuousFramesWithState(decoding_state);
}

  • IsContinuous: decides whether a NALU can be decoded continuously.
bool VCMJitterBuffer::IsContinuous(const VCMFrameBuffer& frame) const
{
  if (IsContinuousInState(frame, last_decoded_state_)) 
  {  // Continuous with the previous NALU represented by last_decoded_state_.
    return true;
  }
  
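  // One more case: the frame's seqnum is not continuous with the NALU
  // represented by last_decoded_state_, but both belong to the same GOP,
  // so walk the decodable frame list and check against each entry.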
  VCMDecodingState decoding_state;
  decoding_state.CopyFrom(last_decoded_state_);
  for (FrameList::const_iterator it = decodable_frames_.begin();
       it != decodable_frames_.end(); ++it) {
    VCMFrameBuffer* decodable_frame = it->second;
    if (IsNewerTimestamp(decodable_frame->Timestamp(), frame.Timestamp())) {
      break;
    }
    decoding_state.SetState(decodable_frame);
    if (IsContinuousInState(frame, decoding_state)) {
      return true;
    }
  }
  return false;
}

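Two cases have to be considered when deciding whether a NALU can be decoded continuously:

  1. The NALU is in the same GOP as the previous NALU represented by last_decoded_state_, and their seqnums are continuous.
  2. The NALU is in the same GOP but its seqnum is not continuous with last_decoded_state_. In that case, walk the decodable frame list and look for a NALU in the same GOP whose seqnum it continues from.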
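
The two cases can be sketched as follows. This is an illustration only: it compares seqnums and leaves out the timestamp cut-off and the SPS/PPS check that the real IsContinuous goes through, and SeqRange, Follows, IsContinuousSketch and IsContinuousExamples are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a frame: only its RTP sequence number range.
struct SeqRange {
  uint16_t low_seq_num;
  uint16_t high_seq_num;
};

// True when `frame` starts right after `prev_high_seq_num` (16-bit wrap).
bool Follows(uint16_t prev_high_seq_num, const SeqRange& frame) {
  return frame.low_seq_num == static_cast<uint16_t>(prev_high_seq_num + 1);
}

bool IsContinuousSketch(uint16_t last_decoded_high_seq_num,
                        const std::vector<SeqRange>& decodable_frames,
                        const SeqRange& frame) {
  // Case 1: the frame directly follows the last decoded frame.
  if (Follows(last_decoded_high_seq_num, frame))
    return true;
  // Case 2: the frame follows one of the frames already queued as decodable.
  for (const SeqRange& decodable : decodable_frames) {
    if (Follows(decodable.high_seq_num, frame))
      return true;
  }
  return false;
}

void IsContinuousExamples() {
  const std::vector<SeqRange> decodable = {{11, 13}, {14, 15}};
  assert(IsContinuousSketch(10, {}, {11, 13}));           // case 1
  assert(IsContinuousSketch(10, decodable, {16, 18}));    // case 2
  assert(!IsContinuousSketch(10, decodable, {20, 21}));   // gap: 16..19 lost
}
```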

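Insertion into VCMJitterBuffer therefore comes down to handling RTP packets and handling NALUs and GOPs, and these two articles have covered both fairly thoroughly. Later articles will turn to how the NALUs are handled afterwards.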