Microsoft and Google shake hands on A2A: what that means for you.

AI_Distilled #94: What's New in AI This Week

Building GenAI infra sounds cool, until it's 3am and your LLM is down. This free guide helps you avoid the pitfalls. Learn the hidden costs, real-world tradeoffs, and decision framework to confidently answer: build or buy? Includes battle-tested tips from Checkr, Convirza & more.

GRAB IT NOW

Here's what's happening in the world of AI, which has been buzzing with groundbreaking developments! This week, we're tracking OpenAI's global partnerships for democratic AI, the transparency debate sparked by Anthropic's Claude 3.7 prompt leak, and Google's powerful Gemini 2.5 Pro debut alongside a fresh 'G' logo. We also explore the intersection of tech and Saudi investment, a surprising Microsoft-Google collaboration for AI agent interoperability, Anthropic's real-time web search integration into Claude, and OpenAI's practical guide for enterprise AI adoption. Ready to explore the cutting edge? Let's dive into the most captivating stories making headlines in the world of AI right now.

LLM Expert Insights, Packt

In today's issue:

Expert Deep Dive: Déborah Mesquita & Duygu Altinok explore how spaCy stays relevant in an LLM world: lightweight, fast, and surprisingly powerful.
OpenAI Goes Global: Launches "OpenAI for Countries" to support democratic AI infrastructure across nations.
Claude 3.7 Prompt Leak: Anthropic's 24K-token system prompt leak sparks concerns over AI transparency and model security.
Gemini 2.5 Pro Preview: Google unveils major upgrades, including interactive coding, UI focus, and top leaderboard rankings.
Google Logo Makeover: The iconic "G" gets a gradient glow-up, syncing with the sleek aesthetic of Gemini AI.
Tech Meets Oil: Musk, Altman & co. attend Saudi summit seeking AI funding, sparking debate over geopolitics and ethics.
Microsoft Adopts A2A: In a rare move, Microsoft joins Google's A2A protocol, enabling cross-agent communication in Azure.
Claude AI Gets Web Access: Anthropic arms Claude with real-time internet search, directly challenging traditional engines.
OpenAI's Enterprise Playbook: New guide reveals how companies like Klarna & Morgan Stanley are putting AI to work.

EXPERT INSIGHTS

Is spaCy still relevant in an era of LLMs?

With the dominance of LLMs, it may seem like we've acquired a magic wand capable of solving nearly any task, from checking the weather to writing code for the next enterprise solution. In this context, one might wonder: are our favorite Python libraries, which we've long relied on, still relevant? Today, we'll talk about one such library, spaCy.

Despite the rise of LLMs, spaCy remains highly relevant in the NLP landscape. However, its role has evolved. It now serves as a faster, more efficient, and lightweight alternative to large language models for many practical use cases.

Consider, for example, an HR screening system at a Fortune 500 company. spaCy can extract information such as names, skills, experience, and other relevant details from resumes, and even flag profiles that best match a particular job description. Now imagine the cost per resume if, instead of spaCy, an LLM handled these tasks.

spaCy excels at tokenization, part-of-speech (POS) tagging, named entity recognition (NER), dependency parsing, and even building custom components using rule-based or machine learning-based annotators.

In this issue, we'll briefly explore the spaCy NLP pipeline, as detailed in the Packt book, Mastering spaCy, Second Edition, by Déborah Mesquita and Duygu Altinok.

Here's a high-level overview of the spaCy processing pipeline, which includes a tokenizer, tagger, parser, and entity recognizer. Let's go through an overview of these components.

1. Tokenization: Tokenization refers to splitting a sentence into its individual tokens.
A token is the smallest meaningful unit of a piece of text: it could be a word, number, punctuation mark, currency symbol, or any other element that serves as a building block of a sentence. Tokenization can be complex, as it requires handling special characters, punctuation, whitespace, numbers, and more. spaCy's tokenizer uses language-specific rules to perform this task effectively. You can explore examples of language-specific data here.

Consider the following piece of code:

    import spacy

    # Load the medium English pipeline and process one sentence
    nlp = spacy.load("en_core_web_md")
    doc = nlp("I forwarded you an email.")
    print([token.text for token in doc])

The tokens are:

    ['I', 'forwarded', 'you', 'an', 'email', '.']

2. POS tagging: Part-of-speech (POS) tags help us identify verbs, nouns, and other grammatical categories in a sentence. They also contribute to tasks such as word sense disambiguation (WSD). Each word is assigned a POS tag based on its context, the surrounding words, and their respective POS tags. POS taggers are typically sequential statistical models, meaning the tag assigned to a word depends on its neighboring tokens, their tags, and the word itself.

To display the POS tags for the sentence in the previous example, you can iterate through each token as follows:

    for token in doc:
        print(token.text, "tag:", token.tag_)

The output for the example sentence is:

    I tag: PRP
    forwarded tag: VBD
    you tag: PRP
    an tag: DT
    email tag: NN
    . tag: .

3. Dependency parser: While POS tags provide insights into the grammatical roles of neighboring words, they do not reveal the relationships between words that are not directly adjacent in a sentence. Dependency parsing, on the other hand, analyzes the syntactic structure of a sentence by tagging the syntactic relations between tokens and linking those that are syntactically connected. A dependency (or dependency relation) is a directed link between two tokens.
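The idea of a directed link can be sketched in plain Python, without spaCy, as a table of head pointers. This is only an illustration: the labels are the ones this issue reports for "I forwarded you an email.", while the HEADS indices, the self-pointing root, and the helper names children, root_index, and arcs are assumptions made for this sketch, not spaCy output or spaCy API.

```python
# Hand-written dependency table for "I forwarded you an email."
# DEPS uses the labels reported in this issue; HEADS is an assumption based on
# the roles described in the text (subject, indirect object, direct object).
TOKENS = ["I", "forwarded", "you", "an", "email", "."]
DEPS = ["nsubj", "ROOT", "dative", "det", "dobj", "punct"]
# Index of each token's head. By the convention of this sketch,
# the root token points to itself.
HEADS = [1, 1, 1, 4, 1, 1]


def root_index():
    """Return the index of the token that is its own head (the root)."""
    return next(i for i, h in enumerate(HEADS) if h == i)


def children(head_index):
    """Return the tokens whose head is the token at head_index."""
    return [
        TOKENS[i]
        for i, h in enumerate(HEADS)
        if h == head_index and i != head_index
    ]


def arcs():
    """Return every directed link as a (head, label, child) triple."""
    return [
        (TOKENS[h], DEPS[i], TOKENS[i])
        for i, h in enumerate(HEADS)
        if i != h
    ]


if __name__ == "__main__":
    print("root:", TOKENS[root_index()])
    for head, label, child in arcs():
        print(f"{head} --{label}--> {child}")
```

In spaCy itself, the same information is available on each token through the token.head and token.children attributes, so a table like this never needs to be written by hand.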
Every word in a sentence plays a specific syntactic role, such as verb, subject, or object, which contributes to the overall sentence structure. This syntactic structure is heavily used in applications like chatbots, question answering, and machine translation.

In spaCy, each token is assigned a dependency label, just like other linguistic features such as the lemma or POS tag. A dependency label describes the type of syntactic relation between two tokens, where one token acts as the syntactic parent (called the head) and the other as its dependent (called the child).

For example, in the sentence "I forwarded you an email," spaCy will label "I" as the subject performing the action, "you" as the indirect object (the recipient), "email" as the direct object, and "forwarded" as the main verb (or root) of the dependency graph. A root word has no parent in the syntactic tree; it serves as the central verb that anchors the structure of the sentence.

Let's look at how dependency relationships appear in this sentence:

    for token in doc:
        print(token.text, "\tdep:", token.dep_)

Output will be:

    I          dep: nsubj
    forwarded  dep: ROOT
    you        dep: dative
    an         dep: det
    email      dep: dobj
    .          dep: punct

If the sentence were "You forwarded me an email," the subject and the indirect object would change while the direct object stays the same, allowing us to capture the underlying relationships and perform further processing based on them. Here are the dependency relationships for this sentence:

    You        dep: nsubj
    forwarded  dep: ROOT
    me         dep: dative
    an         dep: det
    email      dep: dobj
    .          dep: punct

4. Named Entity Recognition (NER): A named entity is any real-world object such as a person, a place (e.g., city, country, landmark, or famous building), an organization, a company, a product, a date, a time, a percentage, a monetary amount, a drug, or a disease name. Some examples include Alicia Keys, Paris, France, Brandenburg Gate, WHO, Google, Porsche Cayenne, and so on.
A named entity always refers to a specific object, and that object is distinguishable by its corresponding named entity tag. For instance, in the sentence "Paris is the capital of France," spaCy would tag "Paris" and "France" as named entities, but not "capital", because "capital" is a generic noun and does not refer to a specific, identifiable object.

Let's see how spaCy recognizes the entities in a sentence in the following code snippet:

    doc = nlp("I forwarded you an email from Microsoft.")
    print(doc.ents)

    # Token 6 is "Microsoft"; print its entity type and a short description
    token = doc[6]
    print(token.ent_type_, spacy.explain(token.ent_type_))

Since Microsoft is the only named entity in the sentence, spaCy correctly identifies it and specifies its type:

    (Microsoft,)
    ORG Companies, agencies, institutions, etc.

This was just a quick peek into spaCy pipelines, but there's much more to explore. For instance, the spacy-transformers extension integrates pretrained transformer models directly into your spaCy pipelines, enabling state-of-the-art performance. Additionally, the spacy-llm plugin allows you to incorporate LLMs like GPT, Cohere, etc. for inference and prompt-based NLP tasks.

Liked the Insights? Want to dive in deeper? The book Mastering spaCy, Second Edition by Déborah Mesquita and Duygu Altinok is your comprehensive guide to building end-to-end NLP pipelines with spaCy. Check it out!

Join Packt's Accelerated Agentic AI Bootcamp this June and learn to design, build, and deploy autonomous agents using LangChain, AutoGen, and CrewAI. Hands-on training, expert guidance, and a portfolio-worthy project, delivered live, fast, and with purpose.

This is it. 50% off this workshop ends on 18th May. If you're in, move now.
Code: EXCLUSIVE50
Book before midnight on 18th May.

RESERVE YOUR SEAT NOW!

LATEST DEVELOPMENT

OpenAI Launches Global AI Partnership Initiatives
OpenAI has launched "OpenAI for Countries," a global initiative aimed at assisting nations in developing AI infrastructure aligned with democratic values. It is partnering with the US government in these projects.
Through these infrastructure collaborations, the program seeks to promote AI development that upholds principles like individual freedom, market competition, and the prevention of authoritarian control. This effort is part of OpenAI's broader mission to ensure AI benefits are widely distributed and to provide a democratic alternative to authoritarian AI models.

Claude 3.7 System Prompt Leak Sparks Debate on AI Transparency and Security
A leak revealed the 24,000-token system prompt of Anthropic's Claude 3.7 Sonnet. System prompts are the foundational instructions that guide an AI's behaviour, tools, and filtering mechanisms, essentially its rulebook. While showcasing Anthropic's commitment to transparency and constitutional AI, the exposure raises security concerns about potential manipulation. The incident highlights tensions between openness and system integrity as AI models increasingly influence information access and decision-making across sectors.

Google Unveils Gemini 2.5 Pro with Major Upgrades
Google has unveiled an early-access preview of Gemini 2.5 Pro, its most advanced AI model, ahead of the upcoming Google I/O 2025 conference. The Gemini 2.5 Pro update introduces enhanced coding capabilities, particularly for building interactive web apps. It excels in UI-focused development, code transformation, and editing. This updated version leads on the WebDev Arena Leaderboard and demonstrates improved video understanding. Developers can access it via Google AI Studio and Vertex AI.

Google's Iconic "G" Logo Gets a Makeover After a Decade
The new logo features a gradient design, blending the brand's colors instead of using solid blocks. This change aims to modernize its look and align with the visual style of its AI products, like Gemini. The updated logo is currently visible on iOS and Pixel devices, with a wider rollout expected soon.

AI Ambitions and Oil Wealth: Tech Titans Join Trump in Saudi Investment Summit
Top U.S.
tech leaders, including Elon Musk, Sam Altman, and Jensen Huang, joined President Trump in Riyadh for a major investment summit with Saudi Crown Prince Mohammed bin Salman. The event highlighted deepening U.S.-Gulf ties as tech firms seek AI infrastructure funding and Saudi Arabia diversifies beyond oil. Critics question national security risks tied to this commercial diplomacy.

Microsoft Adopts Google's A2A Protocol to Boost AI Agent Interoperability
In a rare move, Microsoft has adopted Google's Agent2Agent (A2A) protocol, enabling AI agents from different platforms to communicate and collaborate. This move promotes open standards and enhances enterprise interoperability. Integrated into Azure and Copilot Studio, A2A allows cross-vendor AI coordination. It aligns with Microsoft's broader push toward open AI ecosystems, amid rising enterprise demand for agent-based automation solutions.

Anthropic's Claude AI Gets Real-Time Web Search, Challenges Traditional Search Engines
Anthropic has equipped Claude AI with a web search API, enabling real-time internet access and source-cited answers. The feature lets Claude fetch and summarize current data, challenging traditional search engines. Aimed at developers, it allows custom controls and enhances tools like customer support or news apps. This shift may reshape content attribution and search monetization.

If your company is interested in reaching an audience of developers, technical professionals, and decision makers, you may want to advertise with us.

If you have any comments or feedback, just reply to this email. Thanks for reading and have a great day!

That's a wrap for this week's edition of AI_Distilled.

We would love to know what you thought: your feedback helps us keep leveling up.

Drop your rating here

Thanks for reading,
The AI_Distilled Team
(Curated by humans. Powered by curiosity.)