1. Strong results on object detection and instance segmentation on COCO
2. Strong results on semantic segmentation on ADE20K
3. Better model efficiency
In addition, hybrid models that combine CNNs with self-attention are an active line of research.
Prior to ViT, the focus was on augmenting a ConvNet with self-attention/non-local modules [8, 55, 66, 79] to capture long-range dependencies.
The original ViT [20] first studied a hybrid configuration, and a large body of follow-up works focused on reintroducing convolutional priors to ViT, either in an explicit [15, 16, 21, 82, 86, 88] or implicit [45] fashion.
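To make the hybrid idea concrete, here is a minimal PyTorch sketch of one such block: a depthwise convolution supplies the local inductive bias, and a self-attention layer over the flattened feature map captures long-range dependencies. This is an illustrative toy (class name, dimensions, and layer choices are my own assumptions), not the design of any specific paper cited above.

```python
import torch
import torch.nn as nn

class HybridBlock(nn.Module):
    """Toy hybrid block: depthwise conv (local prior) + self-attention (global).
    Illustrative sketch only, not a reproduction of any cited architecture."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        # Depthwise 3x3 conv: per-channel local mixing, a convolutional prior
        self.conv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        x = x + self.conv(x)                  # residual local mixing
        b, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)    # (B, H*W, C) token sequence
        q = self.norm(seq)
        attn_out, _ = self.attn(q, q, q)      # global self-attention over tokens
        seq = seq + attn_out                  # residual global mixing
        return seq.transpose(1, 2).reshape(b, c, h, w)

if __name__ == "__main__":
    x = torch.randn(2, 64, 8, 8)
    y = HybridBlock()(x)
    print(tuple(y.shape))
```

The ordering here (conv first, attention second) mirrors the common intuition that early layers benefit from locality while later mixing can be global; real hybrid designs vary widely in where and how they interleave the two.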
Recent trends
Conclusion
The ConvNet architecture is still competitive!