A great report from the Ada Lovelace Institute on what foundation models are, how they are used in the public sector, what they could be used for in the future, and what the risks are. As always with an Ada report, it is very comprehensive, and it has some nice clarifications of terminology for those just starting to think about these issues. Key takeaways are:
💡 "Foundation models rely on large-scale data and compute for training. Their capabilities centre on text, image and data analysis or data generation."
💡 "Prominent examples include chatbots like ChatGPT or Claude, and image generators like Midjourney."
💡 "Potential uses include document analysis, decision support, policy drafting and public knowledge access, according to those working in or with the public sector."
💡 "Public services should carefully consider the counterfactuals to implementing foundation models. This means comparing proposed use cases with more mature and tested alternatives that might be more effective or provide better value for money."
💡 "Evaluating these alternatives should be guided by the principles of public life."
💡 "Risks include biases, privacy breaches, misinformation, security threats, overreliance, workforce harms and unequal access."
💡 "It is vital to mitigate these risks through monitoring, internal and independent oversight, and engaging with those affected by the technologies."
💡 "Existing guidance and impact assessments provide baseline governance for using foundation models but may need enhancement. Small pilots, independent auditing and public involvement can also minimise risks."
💡 "Government should invest in skills and address technical dependencies."
💡 "Government could consider options like funding domestic data centres and updates to procurement guidelines for AI systems."
💡 "As foundation models’ capabilities evolve and market dynamics change, there will be new opportunities for public-interest-driven innovation, but new risks also need to be anticipated to ensure effective governance."
#aiethics #chatgpt #responsibleai #aigovernance
Khoa Lam, Jeffery Recker, Abhi Sanka, Ravit Dotan, PhD, Ryan Carrier, FHCA, Luke Vilain
https://lnkd.in/gYS_BjSD
Understanding Foundation Models and Their Potential
Explore top LinkedIn content from expert professionals.
Summary
Foundation models are advanced AI systems trained on vast amounts of data, enabling them to perform a wide range of tasks, from text or image generation to complex data analysis. These models, such as ChatGPT and image generators like Midjourney, hold the potential to revolutionize industries, drive innovation, and tackle global challenges, but they also present risks like bias, misinformation, and privacy concerns.
- Understand their versatility: Foundation models can adapt to diverse applications, such as drafting policies, analyzing genetic data, or even supporting scientific discovery, making them highly valuable across sectors.
- Mitigate associated risks: It’s essential to establish robust oversight, ethical guidelines, and small-scale pilot implementations to address challenges such as biases, misuse, and privacy vulnerabilities.
- Explore new opportunities: Leverage the ability of these models to accelerate innovation in areas like healthcare, environmental research, and technology while ensuring proper governance and public involvement.
-
Was Plato right about LLMs? 🤔 A new paper explores the convergence of AI representations.

A recent paper titled "The Platonic Representation Hypothesis" by Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola suggests that representations learned by different AI models are converging towards a shared ideal, despite variations in training data, objectives, and architectures.

The authors analyzed several vision and language models and found growing alignment between their learned representations, as measured by the similarity structure of their vector embeddings. Larger and more capable models showed greater alignment, both within and across modalities. They hypothesize these models converge on an ideal representation of reality, a concept they liken to Plato's theory of forms and term the "platonic representation." This convergence is driven by increasing model size, data diversity, and the need to perform well on many tasks.

The implications are significant: more efficient transfer learning, better multimodal integration, and progress toward artificial general intelligence. If this hypothesis holds, we may see AI models that can seamlessly process and reason over multiple data types within a unified world representation.

The paper was submitted to arXiv on May 13, 2024. As the AI community digs into these findings, it will be fascinating to see if this "platonic representation" hypothesis gains further support. What do you think - are we witnessing the emergence of a shared language of intelligence across AI models?

#AI #MachineLearning #FoundationalModels #AGI
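To make "alignment between learned representations" concrete, one simple way to quantify it is to check how often two embedding spaces agree on which samples are nearest neighbours. The paper's own metric is a mutual nearest-neighbour measure; the sketch below is a simplified NumPy illustration of the same idea, not the authors' implementation, and the synthetic "models" are assumptions for demonstration only.

```python
import numpy as np

def knn_indices(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Return the indices of the k nearest neighbours (excluding self) for each row."""
    sq = np.sum(embeddings ** 2, axis=1)
    dists = sq[:, None] + sq[None, :] - 2 * embeddings @ embeddings.T
    np.fill_diagonal(dists, np.inf)          # exclude each point from its own neighbour set
    return np.argsort(dists, axis=1)[:, :k]  # indices of the k closest points

def knn_alignment(emb_a: np.ndarray, emb_b: np.ndarray, k: int = 10) -> float:
    """Average overlap of k-nearest-neighbour sets computed in two embedding spaces.

    emb_a and emb_b embed the SAME inputs with two different models; returns a value in [0, 1].
    """
    nn_a, nn_b = knn_indices(emb_a, k), knn_indices(emb_b, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nn_a, nn_b)]
    return float(np.mean(overlaps))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(200, 32))
    # Two toy "models": different random projections of the same underlying structure
    emb_a = base @ rng.normal(size=(32, 64))
    emb_b = base @ rng.normal(size=(32, 48))
    unrelated = rng.normal(size=(200, 48))
    print("related models:  ", knn_alignment(emb_a, emb_b))     # noticeably above chance
    print("unrelated models:", knn_alignment(emb_a, unrelated))  # roughly k / n (chance level)
```

Higher overlap means the two spaces impose a more similar similarity structure on the same inputs, which is the kind of signal the paper reports growing with model scale.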
-
🧬 ChatGPT, meet Biology

ChatGPT and similar large language models (LLMs) have shown that by training on vast amounts of text written by humans, they can learn how to understand language, reply with language and even write creatively. However, their potential extends beyond language to any field with foundational building blocks that yield complex outcomes. One such field is genetics.

Just as LLMs learn to understand and generate language, they can also learn from genetic data. Imagine foundation models analyzing gene sequences to predict traits, design genes for desired characteristics, or even creatively assemble new gene sequences. "Design a gene sequence to create a cat with green eyes and blue fur?"

Currently, genetic research progresses slowly, often involving experiments designed to test hypotheses that may ultimately be disproven after months. But what if we could accelerate this with AI?

Enter scGPT, a new foundation model for single-cell biology developed from single-cell sequencing data of over 33 million cells. This model not only enhances our understanding of gene and cell function but also holds promise for improving tasks like cell type annotation, integration of multiple datasets, and predicting cellular responses to changes.

Using foundation models in biology could potentially expedite the development of groundbreaking treatments: foundation models could generate candidate gene sequences, which we would then test in the lab. We're just scratching the surface, but the journey looks promising.

Read more about this innovative approach in the linked paper below. Thank you Haotian Cui and Chloe Xueqi Wang for moving us forward in this area! https://lnkd.in/guXb23Ws

#Genetics #AI #ChatGPT #FoundationModels #BiologicalResearch #Innovation
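The analogy between language modelling and learning from expression data can be made concrete with a toy example. The sketch below trains a tiny network to reconstruct randomly masked gene-expression values, the same self-supervised "mask, then predict" signal that masked language models use. It is an assumed illustration on synthetic data, not scGPT's actual transformer architecture, tokenisation, or training recipe.

```python
import torch
import torch.nn as nn

# Toy "expression matrix": 512 cells x 100 genes driven by a few latent programmes,
# standing in for real single-cell data.
torch.manual_seed(0)
n_cells, n_genes, n_programmes = 512, 100, 5
programmes = torch.randn(n_programmes, n_genes)
cell_states = torch.rand(n_cells, n_programmes)
expression = cell_states @ programmes + 0.1 * torch.randn(n_cells, n_genes)

# A small MLP in place of a transformer; the training signal is what matters here.
model = nn.Sequential(
    nn.Linear(n_genes, 64), nn.ReLU(),
    nn.Linear(64, n_genes),
)
optim = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(300):
    mask = torch.rand(n_cells, n_genes) < 0.15      # hide ~15% of genes per cell
    corrupted = expression.masked_fill(mask, 0.0)    # the model never sees the masked values
    pred = model(corrupted)
    loss = ((pred - expression)[mask] ** 2).mean()   # reconstruct only what was hidden
    optim.zero_grad()
    loss.backward()
    optim.step()

print(f"masked-gene reconstruction loss after training: {loss.item():.3f}")
```

The point of the sketch is only that "predict the masked token" carries over naturally from words to expression values, which is the intuition behind foundation models for single-cell biology.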
-
Scientists use microscopes, telescopes, and particle accelerators to make sense of our reality. Maybe neural nets are the place where the next breakthroughs hide in plain sight 🔬🔭💡🤖

Miles Cranmer talks about how the future of science itself might be transformed through AI: https://buff.ly/3y24uzG

Traditionally, individual observations inspire researchers to inductively develop theories, from which hypotheses are derived and tested until a consensus is reached. However, it's now evident that 'Foundation Models' are achieving remarkable predictive performance across various domains, such as fluid dynamics and complex simulations. Who knew that training on cat videos could aid in 3-body predictions? These models automatically uncover powerful patterns, mapping causal structures that our theories struggle to capture as successfully.

Thus, instead of the conventional approach of starting with observations and seeking appropriate formulas, the proposed idea is to delve into the well-performing trained networks, extract the patterns (i.e., formulas), and then proceed deductively. The challenge then becomes interpreting these formulas and deciphering their significance.

While the ultimate success of this approach remains to be seen, I find the concept incredibly intriguing. What are your thoughts on this potential paradigm shift in scientific discovery?

#NextGreatScientificTheory #NeuralNetworks #FoundationModels #HighDimensionalData #CausalStructures #DeductiveReasoning #PatternExtraction #NovelPatterns #ScientificDiscovery #FutureOfScience #InnovationAcceleration #TransformativeTech #DataDrivenInsights
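The "fit a network, then extract a formula from it" workflow can be sketched end to end on a toy problem. Cranmer's own work uses dedicated symbolic regression tools (e.g. PySR); the version below substitutes a sparse polynomial fit over the trained network's predictions, so treat it as a simplified, assumed stand-in for that extraction step, not his pipeline.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Hidden "law of nature" the network will learn from noisy observations.
X = rng.uniform(-2, 2, size=(2000, 2))
y = 3.0 * X[:, 0] ** 2 - 1.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.05, 2000)

# Step 1: fit a black-box model (a small MLP standing in for a large foundation model).
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X, y)

# Step 2: probe the trained network on fresh inputs and fit a sparse symbolic form
# (here, a Lasso over polynomial terms) to its predictions rather than to the raw data.
grid = rng.uniform(-2, 2, size=(5000, 2))
net_predictions = net.predict(grid)
poly = PolynomialFeatures(degree=3, include_bias=False)
terms = poly.fit_transform(grid)
symbolic = Lasso(alpha=0.01).fit(terms, net_predictions)

# Step 3: read off the recovered formula; only a few terms should survive.
for name, coef in zip(poly.get_feature_names_out(["x0", "x1"]), symbolic.coef_):
    if abs(coef) > 0.1:
        print(f"{coef:+.2f} * {name}")
```

With the synthetic law 3*x0^2 - 1.5*x0*x1, the surviving terms should roughly match the original coefficients, which is the "read the formula out of the trained network" idea in miniature.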
-
LLM Architectures, Demystified

Understanding how large language models work should not require a PhD. So check out this clear, visual breakdown of the 6 core LLM architectures that power today's most advanced AI systems. Whether you're building, investing, or just curious about the models behind the AI revolution, this will give you a solid mental map.

🔍 What you'll learn in the carousel:
🈸 Encoder-Only: Ideal for language understanding tasks like classification and sentiment analysis. Think BERT and RoBERTa.
🈴 Decoder-Only: The foundation of autoregressive models like GPT, optimized for text generation.
💹 Encoder-Decoder: A flexible architecture behind models like T5 and BART, perfect for translation, summarization, and question answering.
🛗 Mixture of Experts (MoE): Used in models like Mixtral, this architecture activates only a subset of the model's parameters at inference, offering scale with efficiency.
♐️ State Space Models (SSM): Architectures like Mamba enable fast inference and long context retention, moving beyond attention bottlenecks.
🔀 Hybrid Architectures: Combinations like Jamba bring together transformers, state space models, and MoE to capture the best of each approach.

I hope builders, product leaders, and AI enthusiasts can use this guide to understand what's happening under the hood.

👉 Swipe through the carousel
🔁 Share with someone trying to grasp LLM fundamentals
💬 Let me know which architecture you find most promising

#llm #aiagents #artificialintelligence
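Of these, the Mixture of Experts idea is perhaps the easiest to demystify in code: a learned router sends each token to only a couple of small expert networks, so most of the layer's parameters stay idle for any given token. The PyTorch sketch below is a minimal, assumed illustration of top-k routing, not the actual routing implementation used in Mixtral or any production MoE.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    """Minimal top-k mixture-of-experts feed-forward layer.

    Each token is routed to `top_k` of `n_experts` small MLPs, so only a
    fraction of the layer's parameters are active per token.
    """
    def __init__(self, d_model: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). The router scores each token against every expert.
        gate_logits = self.router(x)
        weights, expert_ids = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                hit = expert_ids[:, slot] == e       # tokens routed to expert e in this slot
                if hit.any():
                    out[hit] += weights[hit, slot, None] * expert(x[hit])
        return out

tokens = torch.randn(16, 64)          # 16 token embeddings
print(TinyMoELayer()(tokens).shape)   # torch.Size([16, 64])
```

The design point to notice is that capacity (number of experts) scales mostly independently of per-token compute (top_k), which is what "scale with efficiency" refers to.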
-
A new paper by researchers from Stanford, Princeton, MIT, Georgetown, Berkeley, Meta, Hugging Face, GitHub among others, analyzes 'the benefits and risks of foundation models with widely available weights and presents a framework to assess their marginal risk compared to closed models or existing technology'.

🌿 Benefits of Open Foundation Models include distributing who defines acceptable model behavior, increasing innovation, accelerating science, enabling transparency and mitigating monoculture and market concentration.

🛡 A Framework for Analyzing the Marginal Risk of Open Foundation Models
1️⃣ Threat identification: Systematically identify and characterize the potential threats, such as naming the misuse vector (e.g., spear-phishing scams or influence operations) and detailing the manner in which the misuse would be executed.
2️⃣ Existing risk (in the absence of open foundation models): Understand the pre-existing level of risk (e.g., disinformation on social media, spear-phishing scams, cyberattacks on critical infrastructure) to contextualize and baseline any new risk introduced by open foundation models.
3️⃣ Existing defenses (absent open foundation models): Understand the current defensive landscape for these risks, such as technical interventions (e.g., spam filters to detect and remove spear-phishing emails) and regulatory interventions (e.g., laws punishing the distribution of child sexual abuse material).
4️⃣ Evidence of marginal risk of open FMs: Understand where open foundation models simply duplicate existing risk and where those concerns are already well addressed by existing measures. Conversely, identify where new risks are introduced (e.g., fine-tuning models to create non-consensual intimate imagery of specific people) and where existing defenses will be inadequate (e.g., AI-generated child sexual abuse material may overwhelm existing law enforcement resources).
5️⃣ Ease of defending against new risks: Delineate where new defenses can be implemented or existing defenses can be modified to address the increase in overall risk.
6️⃣ Uncertainty and assumptions: Acknowledge the assumptions related to the trajectory of technological development, the agility of threat actors in adapting to new technologies, and the potential effectiveness of novel defense strategies.

Source: https://lnkd.in/g2rxEY4N
Full paper: https://lnkd.in/gqysMnsE

#risks #foundationmodels #aiandbusiness #societalimpact
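One way a team might operationalize the six steps is as a structured record filled in per misuse vector, so assessments stay comparable across threats. The dataclass below is my own illustrative sketch of that idea; the field names and example content are assumptions, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class MarginalRiskAssessment:
    """Structured record for one misuse vector, loosely following the paper's six steps."""
    threat: str                                              # 1. threat identification
    existing_risk: str                                       # 2. risk absent open foundation models
    existing_defenses: list = field(default_factory=list)    # 3. current technical / regulatory defenses
    marginal_risk_evidence: str = ""                         # 4. what open weights actually add
    new_defenses: list = field(default_factory=list)         # 5. feasible new or modified defenses
    assumptions: list = field(default_factory=list)          # 6. uncertainty and assumptions

# Hypothetical example entry, for illustration only.
example = MarginalRiskAssessment(
    threat="Spear-phishing email generation",
    existing_risk="Phishing kits and human-written lures are already widespread",
    existing_defenses=["spam filters", "anti-fraud law"],
    marginal_risk_evidence="Lower cost per tailored message; evidence on scale effects still unclear",
    new_defenses=["provenance signals for bulk email"],
    assumptions=["threat actors adopt open weights quickly"],
)
print(example.threat, "->", example.marginal_risk_evidence)
```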
-
In January 2024, the National Institute of Standards and Technology (NIST) published its updated report on AI security, called "Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations," which now includes a focus on the security of generative AI, addressing attacks on both predictive and generative AI systems. This comprehensive work categorizes various adversarial attack methods, their objectives, and capabilities, along with strategies for their mitigation. It can help put NIST's AI Risk Management Framework into practice.

Attacks on predictive AI systems (see screenshot #1 below):
- The report breaks down the predictive AI taxonomy into classifications based on attack stages, goals, capabilities, knowledge, and data modality.
- Key areas of focus include evasion and poisoning attacks, each with specifics on white-box and black-box attacks, their transferability, and mitigation strategies.
- Privacy attacks are dissected into data reconstruction, membership inference, model extraction, and property inference, with proposed mitigations.

Attacks on generative AI systems (see screenshot #2 below):
- The section on Generative AI Taxonomy from the NIST report outlines attack classifications and specific vulnerabilities within generative AI systems such as Generative Adversarial Networks (GANs), Generative Pre-trained Transformers (GPTs), and Diffusion Models.
- It then delves into the evolution of generative AI stages of learning, highlighting the shift from traditional models to the pre-training of foundation models using unsupervised learning to capture patterns for downstream tasks. These foundation models are subsequently fine-tuned for specific applications, often by third parties, making them particularly vulnerable to poisoning attacks, even with minimal tampering of training datasets.
- The report further explores the deployment phase of generative AI, which exhibits unique vulnerabilities distinct from predictive AI. Notably, it identifies the potential for attackers to exploit data channels for injection attacks similar to SQL injection, the manipulation of model instructions to align LLM behaviors, enhancements through contextual few-shot learning, and the ingestion of runtime data from external sources for application-specific context.
- Additionally, it addresses novel security violations specific to generative AI and details various types of attacks, including AI supply chain attacks, direct and indirect prompt injection attacks, and their mitigations, as well as violations like availability, integrity, privacy compromises, and abuse.

For a deeper dive into these findings, including the taxonomy of attacks and their mitigations, visit the full report available at: https://lnkd.in/guR56reH

Co-authored by Apostol Vassilev (NIST), Alina Oprea (Northeastern University), Alie Fordyce, and Hyrum Anderson (both from Robust Intelligence)

#NIST #aisecurity
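Evasion attacks, one of the report's key categories for predictive AI, are easy to demonstrate in a few lines. The sketch below uses the fast gradient sign method (FGSM), a standard textbook evasion technique, against a toy classifier; it is an assumed illustration of the attack class the report describes, not an example taken from the report.

```python
import torch
import torch.nn as nn

def fgsm_example(model: nn.Module, x: torch.Tensor, label: torch.Tensor,
                 epsilon: float = 0.1) -> torch.Tensor:
    """Craft an evasion example with the fast gradient sign method (FGSM)."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # Nudge each input feature in the direction that most increases the loss.
    return (x + epsilon * x.grad.sign()).detach()

# Toy, untrained classifier standing in for a deployed predictive model.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
x = torch.randn(1, 20)
label = torch.tensor([1])

x_adv = fgsm_example(model, x, label)
print("clean prediction:      ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```

Against a real trained model, even a small gradient-sign step is often enough to flip the prediction, which is exactly the behaviour the report's evasion-attack mitigations are meant to detect or withstand.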
-
So what are Foundation Models? Imagine a multi-talented robot that's adept at numerous tasks, not just one. That's what Foundation Models are in the AI landscape. Unlike the old-school AI that excelled in just one specific job, these models are the versatile stars of the tech world.

Foundation Models have changed the game by learning from a massive pool of data, giving them an almost human-like understanding of the world. This versatility means they can adapt to a variety of tasks. Whether it's crafting a poem, analyzing complex data, or even whipping up a new recipe, these models are up for the challenge. They're like having an ever-evolving, super-smart assistant.

The impact of these models goes beyond just technical marvels. They're making significant strides in the real world, aiding in everything from medical research to tackling environmental challenges. Foundation Models are not just transforming technology; they're reshaping industries like education, healthcare, and business, making our work more efficient and our lives easier.

As more people get access to and start using these models, we're poised to witness groundbreaking advancements. We're talking about potential solutions to some of the world's most pressing problems. Foundation Models are not just a buzzword; they represent the future of AI, a future that's bright and full of possibilities. I'm excited to see how they'll continue to transform our world, including advances we are working on at GE HealthCare.
-
Anthropic is quietly doing super interesting work on LLM interpretability! Almost as impressive as the Golden Gate bridge!

>>> The hierarchy of our "understanding" of AI models sort of goes like this:
1. At the most superficial level, we can say the black box works most of the time, based on evaluation benchmarks
2. One level deeper, researchers find tools for interpretability, giving some glimpse into how a model arrives at its prediction
3. At the foundational level, there is a mathematical understanding of generalization, convergence properties, robustness etc.

>>> For "regular" deep learning, researchers are at level 2.
👉 LIME was one of the first big breakthroughs: train a simple, interpretable model on a narrow region of a complex model you want to understand
👉 Shapley values are the most popular: this uses game theory concepts to understand the marginal contribution of each model feature to a prediction

>>> BUT everyone in LLM land is too busy training bigger models, and we're really only halfway to even reaching level 1. Until Anthropic :)
👉 In essence, they figured out a way to map human-interpretable concepts to the billions of neural activations, midway through an LLM's calculations
👉 The results are really cool! Not only do the features show how an LLM "thinks" about concepts like "the Golden Gate bridge" or "Inner conflict"…
👉 The team also proved these features are "real": if you artificially increase the underlying weights, Claude starts to behave meaningfully differently

EXAMPLE: The Anthropic team for a while souped up the "Golden Gate" feature, with some pretty entertaining results.

>>> It's useful from a practical standpoint and not just memes - it creates another tool to better manage / safeguard model behavior!

Anthropic blog: https://lnkd.in/gUJ7Ugs7
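The "turn a feature up and watch the behavior change" idea can be illustrated with ordinary PyTorch forward hooks. Anthropic's actual features come from sparse-autoencoder dictionary learning on a production LLM; the sketch below just adds an arbitrary, assumed "feature direction" to a hidden layer of a toy model, purely to show the mechanism of steering activations mid-computation.

```python
import torch
import torch.nn as nn

# Tiny stand-in "model": two linear blocks with a hidden activation we can steer.
torch.manual_seed(0)
hidden_dim = 32
block_1 = nn.Linear(16, hidden_dim)
block_2 = nn.Linear(hidden_dim, 8)

# A hypothetical "feature direction" in the hidden space. (Anthropic learns such
# directions with sparse autoencoders; here it is just a fixed random unit vector.)
feature_direction = torch.randn(hidden_dim)
feature_direction /= feature_direction.norm()

def steering_hook(module, inputs, output, strength: float = 10.0):
    """Add the feature direction to the hidden activations as they flow past."""
    return output + strength * feature_direction

handle = block_1.register_forward_hook(steering_hook)
x = torch.randn(1, 16)
steered = block_2(torch.relu(block_1(x)))    # forward pass with the feature "souped up"
handle.remove()
baseline = block_2(torch.relu(block_1(x)))   # same input, no steering

print("shift in output caused by steering:", (steered - baseline).norm().item())
```

The mechanism is the useful part: once a human-interpretable direction is known, scaling it up or down gives a handle for managing or safeguarding behavior, which is the practical point the post makes.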
-
Just in time for Christmas! Sippo Rossi has had his paper on foundation models, generative AI, and research methods accepted at the International Journal of Information Management (https://lnkd.in/epNv9M-J). Many thanks to Sippo for leading the team, Raghava Rao Mukkamala, Matti Rossi and Professor Yogesh K Dwivedi for helping shape the ideas, and to all for keeping me along for the journey! This was a fun paper to write and a fun team to work with!

The citation: Rossi, S., Raghava, R. M., Rossi, M., Dwivedi, Y. K., & Thatcher, J. B. (forthcoming). Augmenting Research Methods with Foundation Models and Generative AI. International Journal of Information Management.

Abstract: Deep learning (DL) research has made remarkable progress in recent years. Natural language processing and image generation have made the leap from computer science journals to open-source communities and commercial services. Pre-trained DL models built on massive datasets, also known as foundation models, such as GPT-3 and BERT, have led the way in democratizing artificial intelligence (AI). However, their potential use as research tools has been overshadowed by fears of how this technology can be misused. Some have argued that AI threatens scholarship, suggesting these tools should not replace human collaborators. Others have argued that AI creates opportunities, suggesting that AI-human collaborations could speed up research. Taking a constructive stance, this editorial outlines ways to use foundation models to advance science. We argue that DL tools can be used to create realistic experiments and make specific types of quantitative studies feasible or safer with synthetic rather than real data. All in all, we posit that the use of generative AI and foundation models as a tool in information systems research is in very early stages. Still, if we proceed cautiously and develop clear guidelines for using foundation models and generative AI, their benefits for science and scholarship far outweigh their risks.

Keywords: Foundation model, Generative AI, Experiments, Synthetic data