Auto-Seed VL2 (2024)

By generating seeds in embedding space rather than pixel space, we avoid the compounding errors of full image generation. The hypernetwork’s meta-learning objective ensures that seeds are discriminative for the original task and compatible with the continually updated VLM.
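The compatibility requirement above can be illustrated with a consistency-style penalty that measures how far seed embeddings drift as the VLM encoder is updated. This is a toy sketch under assumed definitions: the quadratic form, the linear stand-in encoders, and all names (`consistency_loss`, `W_old`, `W_new`) are illustrative, not the paper's actual formulation.

```python
import numpy as np

def consistency_loss(seeds, encode_old, encode_new):
    """Hypothetical consistency penalty: mean squared drift of the
    seed embeddings between the old and the updated encoder."""
    z_old, z_new = encode_old(seeds), encode_new(seeds)
    return float(np.mean(np.sum((z_old - z_new) ** 2, axis=1)))

# Toy linear "encoders" standing in for the VLM embedding head
# before and after a continual-learning update.
rng = np.random.default_rng(0)
W_old = rng.normal(size=(4, 4))
W_new = W_old + 0.01 * rng.normal(size=(4, 4))

seeds = rng.normal(size=(8, 4))  # seeds live in embedding space, not pixel space
loss = consistency_loss(seeds, lambda s: s @ W_old, lambda s: s @ W_new)
```

A small loss indicates the stored seeds remain compatible with the updated encoder; minimizing it during adaptation discourages drift.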

| Configuration | Avg Acc | Drop |
|---|---|---|
| Full Auto-Seed VL2 | 82.2 | — |
| w/o consistency loss ($\mathcal{L}_{\text{consist}}$) | 75.4 | -6.8 |
| w/o gradient-conditioned generation (random seeds) | 68.9 | -13.3 |
| w/o meta-update of $G_\phi$ | 74.1 | -8.1 |
| w/o seed pruning (full memory) | 82.0 | -0.2 (ns) |


4.1 Overall Architecture

Auto-Seed VL2 maintains a set of auto-generated seeds $\mathcal{S}$ that grows slowly over tasks. Each task is processed in three phases: (1) seed replay, (2) online adaptation, and (3) seed update.
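The three phases can be sketched as a per-task loop. Everything here is a toy illustration: the scalar `model_state`, the learning rate, the recency-based pruning, and the `generate_seeds` stand-in for the gradient-conditioned hypernetwork are all assumptions, not the paper's API.

```python
def run_task(task_batches, model_state, seed_bank, generate_seeds, max_seeds=8):
    """Toy per-task loop for the three phases described above."""
    # Phases 1+2: replay stored seeds while adapting online to new batches.
    for batch in task_batches:
        mixed = batch + seed_bank                                    # seed replay
        model_state += 0.1 * (sum(mixed) / len(mixed) - model_state)  # online step
    # Phase 3: generate new seeds for this task, then prune to bound memory.
    seed_bank = seed_bank + generate_seeds(task_batches)
    seed_bank = seed_bank[-max_seeds:]  # crude pruning: keep the most recent seeds
    return model_state, seed_bank

# Two toy "tasks"; the seed generator just summarizes each task's first batch.
state, bank = 0.0, []
for task in ([1.0, 2.0], [3.0, 4.0]):
    state, bank = run_task([task], state, bank,
                           lambda b: [sum(b[0]) / len(b[0])])
```

The pruning step is what keeps $\mathcal{S}$ growing slowly rather than linearly in the number of tasks.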
