An interesting new paper on LLM-JEPA from Hai Huang, Yann LeCun, and Randall Balestriero. 💡 Previously I wrote about the JEPA approach applied to videos (V-JEPA and V-JEPA 2) and to time series (CHARM). Now the JEPA approach has finally been applied to LLMs!

This work bridges a major gap between AI for vision and AI for language, offering a potential leap forward in how we train language models. Instead of just predicting the next word, LLM-JEPA teaches models to capture underlying meaning by predicting abstract representations (as the JEPA approach does): for instance, grasping the essence of a code snippet from its natural language description. The paper introduces a hybrid objective that combines standard next-token prediction with a Joint Embedding Predictive Architecture (JEPA) loss, a technique that has been highly successful in computer vision.

The empirical results are compelling: LLM-JEPA consistently boosts performance, accelerates parameter-efficient fine-tuning (PEFT), and shows remarkable resistance to overfitting. The method doesn't just improve scores; it produces more structured and transferable representations. While the current computational overhead remains a challenge, this paper opens a promising new direction beyond traditional LLM training. 🚀

Review: https://lnkd.in/eC4Jte_r
Paper: https://lnkd.in/erZJadb3
Code: https://lnkd.in/ethXT7sX
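The hybrid objective described above can be sketched roughly as follows. This is a minimal toy illustration, not the paper's exact formulation: the linear predictor, the cosine-based distance, and all names here are assumptions for the sake of the example.

```python
import numpy as np

def jepa_term(pred, target):
    # Cosine distance between predicted and target embeddings (illustrative
    # choice of metric; the paper's exact distance may differ).
    cos = np.sum(pred * target, axis=-1) / (
        np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1))
    return float(np.mean(1.0 - cos))

def llm_jepa_loss(ntp_loss, text_emb, code_emb, W, lam=1.0):
    # Hybrid objective: next-token-prediction loss plus a JEPA-style term
    # that predicts the code-view embedding from the text-view embedding.
    pred = text_emb @ W  # toy linear predictor (assumed, not from the paper)
    return ntp_loss + lam * jepa_term(pred, code_emb)

# Toy usage with random embeddings standing in for the two views.
rng = np.random.default_rng(0)
d = 16
W = rng.standard_normal((d, d)) / np.sqrt(d)
text_emb = rng.standard_normal((4, d))
code_emb = rng.standard_normal((4, d))
total = llm_jepa_loss(2.0, text_emb, code_emb, W)
```

Since the cosine-distance term lies in [0, 2], the JEPA loss acts as a bounded regularizer added on top of the usual cross-entropy, which is consistent with the paper's framing of JEPA as a regularization signal rather than a replacement objective.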
LLM-JEPA for Transferable AI Representations
Summary
LLM-JEPA for transferable AI representations is an emerging approach that improves how large language models (LLMs) learn and understand information by combining traditional word prediction with a method called Joint Embedding Predictive Architecture (JEPA). This technique helps AI systems develop deeper, more structured language understanding that can be reused across tasks, making them more robust and adaptable.
- Explore new training methods: Try combining standard next-word prediction with JEPA-style learning to encourage more meaningful, reusable representations in your language models.
- Reduce resource barriers: Look into recent JEPA advancements that cut computational costs, making advanced AI more accessible for smaller teams and research groups.
- Monitor evolving benchmarks: Stay alert to ongoing developments in JEPA-based LLMs, as researchers are testing them on a wider range of problems and sharing tools for community experimentation.
🚀 v2 of our paper "LLM-JEPA" is out on arXiv!

🔍 What's new?
✅ Significantly lower computational overhead: reduced from 200% to 25% using a simple yet effective random JEPA-loss dropout.
✅ Broader applications: extended beyond symmetric 2-view datasets to NQ-Open (Natural Questions, open-domain) and HellaSwag (sentence completion), and tested on reasoning models such as Qwen3 and DeepSeek-R1-Distilled.
✅ Rigorous ablations: the JEPA loss design outperforms alternatives including L2, MSE, prepended [PRED] tokens, Code→Text, and InfoNCE variants.

🧩 What is LLM-JEPA?
If you're seeing this for the first time: LLM-JEPA introduces the Joint Embedding Predictive Architecture (JEPA), a self-supervised learning paradigm proven in vision, as a regularization loss for LLMs. Combined with next-token prediction, it enables models to:
🎯 Boost fine-tuning accuracy
🧠 Resist overfitting
🌱 Work in pretraining via paraphrase-based JEPA
🌀 Induce structured latent representations unseen in either base or normally fine-tuned models

🧪 The v1 workshop version (accepted to NeurIPS 2025 UniReps + DL4C) received valuable feedback highlighting high compute cost, limited applications, and missing ablations, all fully addressed in this release. Huge thanks to the UniReps and DL4C reviewers for their constructive and insightful comments that helped shape v2.

It's been a privilege to collaborate with Yann LeCun (NYU) and Randall Balestriero (Brown); few experiences are more inspiring than working alongside the pioneers of modern deep and self-supervised learning.

The code is open-sourced, and we warmly invite others to experiment with it and help explore this emerging frontier between JEPA and LLMs.
💻 Code: https://lnkd.in/eUX2b8iE
📄 Paper: https://lnkd.in/ers8_yzm

Together with Yann and Randall, we're already exploring new variants and applications, and we look forward to sharing more soon. Stay tuned!
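The random JEPA-loss dropout mentioned above can be sketched as a simple stochastic gate: compute the expensive JEPA term only on a random fraction of training steps and skip the extra encoder passes otherwise. The dropout rate and the 1/p rescaling below are illustrative assumptions, not values or details taken from the paper.

```python
import random

def maybe_jepa_loss(rng, ntp_loss, jepa_loss_fn, p=0.25, lam=1.0):
    # Compute the (expensive) JEPA term only with probability p per step.
    # The 1/p rescaling keeps the expected total loss unchanged; whether
    # the paper rescales this way is an assumption of this sketch.
    if rng.random() < p:
        return ntp_loss + (lam / p) * jepa_loss_fn()
    return ntp_loss

# Toy usage: count how often the expensive term is actually evaluated.
rng = random.Random(0)
calls = [0]
def fake_jepa():
    calls[0] += 1
    return 0.5  # stand-in for the embedding-prediction loss
losses = [maybe_jepa_loss(rng, 2.0, fake_jepa, p=0.25) for _ in range(1000)]
```

With a gate like this, the extra forward passes needed for the JEPA views run on only ~p of the steps, which is one plausible way a 200% overhead could shrink to roughly 25%.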