Weighted Boxes Fusion

study

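Whereas NMS keeps only the highest-scoring box and discards the rest, Weighted Boxes Fusion (WBF) fuses the boxes instead. It uses the information from every overlapping box to compute confidence-weighted average coordinates, which yields a more accurate bounding box. This is especially effective for multi-model ensembles and for scenes with heavily overlapping objects.

Algorithm

WBF groups boxes whose overlap exceeds a threshold into the same cluster and uses their confidence scores as weights to refine the final coordinates.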

Algorithm: Weighted Boxes Fusion (WBF)
Input: B = {b_1, ..., b_N}, S = {s_1, ..., s_N}, Thresh_iou
Output: F (fused boxes)

  C ← {}                                      // initialize the cluster list
  for b_i ∈ SortedByScore(B) do
    matched ← False
    for c_j ∈ C do
      if IoU(b_i, c_avg) ≥ Thresh_iou then    // c_avg: weighted-average box of c_j
        Add b_i to cluster c_j                // add the box to an existing cluster
        Update c_avg                          // recompute the weighted-average coordinates
        matched ← True
        break
      end if
    end for
    if not matched then
      C ← C ∪ {{b_i}}                         // create a new cluster
    end if
  end for
  F ← ComputeFinalScores(C)
  return F
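The clustering loop above can be sketched in Java roughly as follows. This is a minimal single-class illustration, not the code used in the project: the class `SimpleWbf`, its `Cluster` helper, and the `{x1, y1, x2, y2}` box layout are assumptions, and because the pseudocode does not define `ComputeFinalScores`, the sketch simply reports the mean confidence of each cluster.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative single-class sketch of the WBF clustering loop (hypothetical names). */
final class SimpleWbf {

  /** Running confidence-weighted sums for one cluster. */
  static final class Cluster {
    final float[] weightedSum = new float[4]; // sum of s_i * box_i
    float scoreSum = 0f;                      // sum of s_i (assumed positive)
    int count = 0;

    void add(float[] box, float score) {
      for (int k = 0; k < 4; k++) {
        weightedSum[k] += score * box[k];
      }
      scoreSum += score;
      count++;
    }

    /** c_avg: the confidence-weighted average box of this cluster. */
    float[] average() {
      float[] avg = new float[4];
      for (int k = 0; k < 4; k++) {
        avg[k] = weightedSum[k] / scoreSum;
      }
      return avg;
    }
  }

  /** IoU of two boxes given as {x1, y1, x2, y2}. */
  static float iou(float[] a, float[] b) {
    float interW = Math.max(0f, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
    float interH = Math.max(0f, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
    float inter = interW * interH;
    float union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter;
    return union > 0f ? inter / union : 0f;
  }

  /** Returns one {x1, y1, x2, y2, score} row per cluster, in cluster creation order. */
  static List<float[]> fuse(float[][] boxes, float[] scores, float iouThreshold) {
    // Visit boxes from highest to lowest confidence.
    List<Integer> order = new ArrayList<>();
    for (int i = 0; i < boxes.length; i++) {
      order.add(i);
    }
    order.sort((p, q) -> Float.compare(scores[q], scores[p]));

    List<Cluster> clusters = new ArrayList<>();
    for (int i : order) {
      Cluster match = null;
      for (Cluster c : clusters) {
        if (iou(boxes[i], c.average()) >= iouThreshold) {
          match = c; // first cluster whose average box overlaps enough
          break;
        }
      }
      if (match == null) {
        match = new Cluster(); // no match: start a new cluster
        clusters.add(match);
      }
      match.add(boxes[i], scores[i]); // adding also updates the running average
    }

    List<float[]> fused = new ArrayList<>();
    for (Cluster c : clusters) {
      float[] avg = c.average();
      // Assumption: final score = mean confidence of the cluster members.
      fused.add(new float[] {avg[0], avg[1], avg[2], avg[3], c.scoreSum / c.count});
    }
    return fused;
  }
}
```

Keeping running sums in each cluster means the average box can be refreshed after every insertion without revisiting earlier members.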

Fusion formula

For a cluster of $K$ boxes, the final coordinates $X_{final}$ are computed as follows (the higher a box's confidence, the more say it has):

$$
X_{final} = \frac{\sum_{i=1}^{K} s_{i} \cdot X_{i}}{\sum_{i=1}^{K} s_{i}}
$$
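As a quick numerical check of the formula, suppose (purely for illustration) that a cluster holds two boxes whose left edges are $X_1 = 100$ and $X_2 = 110$, with confidences $s_1 = 0.9$ and $s_2 = 0.6$:

$$
X_{final} = \frac{0.9 \cdot 100 + 0.6 \cdot 110}{0.9 + 0.6} = \frac{156}{1.5} = 104
$$

The fused edge lands closer to the more confident box than the unweighted mean of 105 would.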

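Application

In Kibo RPC 2025, limited data early on meant that no single model was accurate enough, so we adopted a model-ensemble strategy: we trained several models targeting different scales and scenarios, then fused their results with WBF. Later, single-model performance improved significantly and was sufficient for most cases, but we kept WBF in the pipeline.

Revised algorithm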

Algorithm: Revised Weighted Boxes Fusion
Input: D, T_IoU, T_cont, T_conf
Output: F (fused boxes)

  Sort D by score desc; F ← ∅; U ← false
  for each d_i ∈ D where U[i] is false do                          // find an unused box
    G ← {d_i}; U[i] ← true                                         // start a new group
    for each d_j ∈ D where j > i ∧ ¬U[j] do                        // look ahead for overlaps
      if IoU(d_i, d_j) > T_IoU ∨ Containment(d_j, d_i) > T_cont then
        G ← G ∪ {d_j}; U[j] ← true                                 // add the box to the group and mark it used
      end if
    end for
    W_sum ← Σ_{d ∈ G} (d.conf × d.weight)                          // sum of weighted scores
    Box_avg ← Σ_{d ∈ G} (d.box × d.conf × d.weight) / W_sum        // weighted coordinate fusion
    Conf_avg ← W_sum / Σ_{d ∈ G} d.weight                          // normalized confidence
    (w, h) ← GetDimensions(Box_avg)                                // size of the fused box
    if Conf_avg < T_conf ∨ w < 5% ∨ h < 5% then
      continue                                                     // drop low-confidence or abnormally small boxes
    end if
    d_best ← SortByWeightedScore(G)[0]                             // highest weighted score decides the class
    F ← F ∪ {(d_best.class, Conf_avg, Box_avg)}
  end for
  return F

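Revised vs. standard

Compared with standard WBF, the revised version makes the following changes to handle extreme scale differences and filter noise:

Matching
- Standard: relies only on IoU > Threshold to decide whether boxes overlap
- Revised: adds an OR Containment > Threshold condition
- Purpose: handle the case where a large box fully encloses a small one, where the large IoU denominator pushes the value below the threshold and the boxes fail to fuse
Noise filtering
- Standard: accepts every fused result without any check
- Revised: adds a geometric filter (discard the box if its width or height is below 5%)
- Purpose: remove invalid, abnormally flat or thin boxes that fusion may produce
Weighting
- Standard: uses only the confidence of each predicted box as its weight
- Revised: introduces a per-model weight ( $Weight \times Conf$ )
- Purpose: let developers weight models differently (for example, giving a close-range, high-resolution model more say) so that the more reliable models dominate the result
Class decision
- Standard: usually reuses the class of the first box, or averages the class scores
- Revised: after fusing the coordinates, re-sorts the cluster and takes the class of the member with the highest weighted score
- Purpose: in scenes where classes are easily confused, let the most confident prediction source decide the final label, avoiding the mediocre errors that majority voting can cause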

Code

  /**
   * Applies Revised Weighted Boxes Fusion (WBF) to combine overlapping detections.
   * This method is a modified version of the original WBF algorithm.
   * Instead of merging boxes class by class, it first merges overlapping boxes regardless of class,
   * then takes the class of the group member with the highest weighted confidence
   * (confidence * modelWeight).
   *
   * @param detections List of Detection objects to be fused. The list is sorted in place.
   * @param iouThreshold IoU threshold for merging boxes.
   * @param containmentThreshold Containment threshold for merging boxes.
   * @param confThreshold Confidence threshold for filtering fused boxes.
   * @return List of fused Detection objects.
   */
  private List<Detection> wbf(List<Detection> detections, float iouThreshold, float containmentThreshold, float confThreshold) {
    if (detections.isEmpty()) {
      Log.i(TAG, "No detections to process.");
      return detections;
    }

    // Sort detections by confidence in descending order (mutates the caller's list)
    Collections.sort(detections, new Comparator<Detection>() {
      @Override
      public int compare(Detection d1, Detection d2) {
        return Float.compare(d2.confidence, d1.confidence);
      }
    });

    List<Detection> fused = new ArrayList<>();
    boolean[] used = new boolean[detections.size()];

    // Iterate through the detections; each unused one seeds a new group
    for (int i = 0; i < detections.size(); i++) {
      if (used[i]) continue;

      // Create a new group for the current detection
      List<Detection> group = new ArrayList<>();
      group.add(detections.get(i));
      used[i] = true;

      // Collect lower-scored, still-unused detections that overlap the seed
      for (int j = i + 1; j < detections.size(); j++) {
        if (used[j]) continue;

        Detection di = detections.get(i);
        Detection dj = detections.get(j);
        float iou = calculateIoU(di.box, dj.box);
        float containment = calculateContainment(dj.box, di.box);

        // Either criterion is enough to treat the two boxes as the same object
        if (iou > iouThreshold || containment > containmentThreshold) {
          group.add(dj);
          used[j] = true;
        }
      }
 
      // Compute weighted box
      float confidenceSum = 0f;
      float weightSum = 0f;
      float x1 = 0f, y1 = 0f, x2 = 0f, y2 = 0f;
 
      for (Detection d : group) {
        confidenceSum += d.confidence * d.modelWeight;
        weightSum += d.modelWeight;
        x1 += d.box[0] * d.confidence * d.modelWeight;
        y1 += d.box[1] * d.confidence * d.modelWeight;
        x2 += d.box[2] * d.confidence * d.modelWeight;
        y2 += d.box[3] * d.confidence * d.modelWeight;
      }
 
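      // Skip degenerate groups (all weights or confidences zero) to avoid dividing by zero
      if (weightSum <= 0f || confidenceSum <= 0f) {
        continue;
      }

      float confidenceAvg = confidenceSum / weightSum;
      if (confidenceAvg < confThreshold) {
        continue;
      }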
 
      x1 /= confidenceSum;
      y1 /= confidenceSum;
      x2 /= confidenceSum;
      y2 /= confidenceSum;
 
      float minX = Math.min(x1, x2);
      float minY = Math.min(y1, y2);
      float maxX = Math.max(x1, x2);
      float maxY = Math.max(y1, y2);
 
      // Check if the bounding box is too small
      if ((maxX - minX) < 0.05 || (maxY - minY) < 0.05) {
        continue;
      }
 
      // Sort by weighted confidence (confidence * modelWeight), descending, so group.get(0) decides the class
      Collections.sort(group, new Comparator<Detection>() {
        @Override
        public int compare(Detection a, Detection b) {
          float weightedConfidenceA = a.confidence * a.modelWeight;
          float weightedConfidenceB = b.confidence * b.modelWeight;
          return Float.compare(weightedConfidenceB, weightedConfidenceA);
        }
      });
 
      Detection fusedDetection = new Detection(
        new float[]{minX, minY, maxX, maxY},
        confidenceAvg,
        group.get(0).classId,
        group.get(0).className
      );
      fused.add(fusedDetection);
    }
 
    return fused;
  }
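The method above relies on a `Detection` type that is not shown here. The sketch below lists the fields the method reads and the constructor it calls, to make those assumptions explicit. Everything in it is hypothetical rather than the actual class: in particular, the default `modelWeight` of 1.0 and the normalized `[0, 1]` coordinate range (inferred from the `0.05` size check) are guesses.

```java
/** Hypothetical container with the fields that wbf() reads; not the author's class. */
final class Detection {
  final float[] box;        // {x1, y1, x2, y2}; assumed normalized to [0, 1]
  final float confidence;   // detector score for this box
  final int classId;
  final String className;
  float modelWeight = 1.0f; // assumed default; set per source model before fusing

  Detection(float[] box, float confidence, int classId, String className) {
    this.box = box;
    this.confidence = confidence;
    this.classId = classId;
    this.className = className;
  }
}
```

With this shape, a fused detection built by the four-argument constructor keeps the default weight, which matters only if fused results are fed back into another fusion pass.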

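The blind spot of IoU

Because IoU is the ratio of intersection to union, when a small box lies entirely inside a large box and their areas differ greatly, the IoU value is low because the denominator is large, so the containment relationship goes undetected.

IoU definition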

$$
IoU = \frac{\text{Intersection}}{\text{Union}}
$$

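Numerator (intersection): the area where the two boxes overlap
Denominator (union): the total area covered by the two boxes (the sum of both areas minus the overlap)
Problem scenario
- A very small box lies entirely inside a very large box
- Intersection: the area of the small box ( $Area = 400$ )
- Union: the area of the large box ( $Area = 10000$ )
- Result: $\frac{400}{10000} = 0.04$, a very small value
- This skews the algorithm's judgment (NMS and WBF alike): the IoU of the two boxes is only 0.04, far below the threshold, so it concludes they are two entirely different objects and keeps both.
- In reality, both boxes refer to the same object; one prediction is simply larger and the other tighter. Relying on IoU alone means these overlapping boxes are neither fused nor removed correctly.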
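The numbers in this scenario can be reproduced with a small IoU helper. This is a sketch under assumptions: the class name `IouExample` is hypothetical, boxes are taken to be `{x1, y1, x2, y2}` arrays, and the body of the `calculateIoU` helper actually used by the fusion code is not shown in this note, so this version may differ from it.

```java
/** Illustrative IoU helper for axis-aligned boxes given as {x1, y1, x2, y2}. */
final class IouExample {

  static float calculateIoU(float[] a, float[] b) {
    // Width and height of the overlap rectangle, clamped to zero when the boxes are disjoint
    float interW = Math.max(0f, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
    float interH = Math.max(0f, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
    float inter = interW * interH;

    float areaA = (a[2] - a[0]) * (a[3] - a[1]);
    float areaB = (b[2] - b[0]) * (b[3] - b[1]);
    float union = areaA + areaB - inter; // subtract the overlap so it is counted once

    return union > 0f ? inter / union : 0f;
  }
}
```

For a 100 x 100 box `{0, 0, 100, 100}` and a 20 x 20 box `{40, 40, 60, 60}` inside it, the intersection is 400 and the union is 10000, so the helper returns 0.04.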

Containment

To fix IoU's numerical weakness when a large box encloses a small one, we introduce the Containment metric.

Definition

$$
Containment = \frac{\text{Intersection}}{\min(Area_{1}, Area_{2})}
$$

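Numerator (intersection): same as for IoU, the area where the two boxes overlap
Denominator (minimum area): this is the key difference. Instead of dividing by the large union, we divide by the area of the smaller box
Back to the earlier example
- The small box ( $Area = 400$ ) lies entirely inside the large box ( $Area = 10000$ )
- Intersection: still the area of the small box ( $400$ )
- Denominator: the smaller of the two areas, that is, the small box ( $400$ )
- Result: $\frac{400}{400} = 1.0$ (100%)
The implementation combines the two with OR logic
- If either IoU > T_iou or Containment > T_cont holds, the boxes are treated as overlapping
- This covers both ordinary overlap and containment between boxes of very different sizes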
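A minimal Java sketch of the Containment formula and the OR rule follows. The class `BoxOverlap` and the `overlaps` helper are hypothetical names, and boxes are assumed to be `{x1, y1, x2, y2}` arrays. The sketch implements the symmetric min-area definition given above; the `calculateContainment` helper used in the fusion code takes its arguments in a specific order and its body is not shown, so it may be implemented differently.

```java
/** Illustrative overlap test combining IoU and Containment with OR logic. */
final class BoxOverlap {

  static float area(float[] b) {
    return Math.max(0f, b[2] - b[0]) * Math.max(0f, b[3] - b[1]);
  }

  static float intersection(float[] a, float[] b) {
    float w = Math.max(0f, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
    float h = Math.max(0f, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
    return w * h;
  }

  static float calculateIoU(float[] a, float[] b) {
    float inter = intersection(a, b);
    float union = area(a) + area(b) - inter;
    return union > 0f ? inter / union : 0f;
  }

  /** Intersection divided by the area of the smaller box. */
  static float calculateContainment(float[] a, float[] b) {
    float minArea = Math.min(area(a), area(b));
    return minArea > 0f ? intersection(a, b) / minArea : 0f;
  }

  /** True if either criterion exceeds its threshold. */
  static boolean overlaps(float[] a, float[] b, float iouThreshold, float containmentThreshold) {
    return calculateIoU(a, b) > iouThreshold
        || calculateContainment(a, b) > containmentThreshold;
  }
}
```

With the nested boxes from the example, `calculateContainment` gives 400 / 400 = 1.0, so `overlaps` returns true for any containment threshold below 1.0 even though the IoU is only 0.04.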
