Category: Research Highlights

  • Getting ready for the quantum era. INESC-ID joins European effort to strengthen cybersecurity and quantum resilience

    Quantum computing is a serious threat to today’s cryptographic systems, which gives the European Defence Fund (EDF) a strong reason to invest in the development of quantum-resistant solutions for the defence sector. One of these initiatives is project SEQURED: Strengthening Defense Networks for the Quantum Era, launched on May 1.

    With a duration of 36 months and a budget of nearly four million euros, SEQURED brings together nine partners across Europe, including INESC-ID, to develop next-generation encryption tools, digital signatures, and secure data-sharing mechanisms.

    As quantum technologies evolve, so do the risks posed to current cryptographic systems. SEQURED therefore aims to develop innovative solutions so that both private and public sector organisations can protect their data against the emerging threat of quantum-enabled cyberattacks.

    This international consortium of academic and industry partners is funded by the Horizon Europe programme and falls under the European Commission’s broader strategy for digital sovereignty and security. The project focuses on real-world applications of post-quantum cryptography (PQC), ensuring that today’s encrypted communications remain secure tomorrow, even in a post-quantum world. It also integrates cutting-edge cybersecurity practices with privacy-by-design principles and compliance with evolving EU regulations.

    At INESC-ID, funded with €359k, researchers contribute their expertise in cryptographic algorithms, architectures, and secure system design, leading the project’s work package in these areas. Leonel Sousa, the project’s PI, is taking part in the kick-off meeting, which takes place on May 19 and 20 in Greece.

  • American media mogul uses copyright violations detector developed at INESC-ID

    A novel method developed by INESC-ID researchers is at the heart of a headline-making investigation into AI training practices. The technique, DE-COP: Detecting Copyrighted Content in Language Models Training Data, has been used in a recent study by the AI Disclosures Project, co-founded by media figure Tim O’Reilly and economist Ilan Strauss, to examine whether OpenAI’s GPT-4o model was trained on copyrighted, paywalled content without permission.

    At the core of the controversy is the possibility that OpenAI, a leading player in generative AI, used proprietary books from O’Reilly Media in the training of its most advanced model to date, GPT-4o. The AI Disclosures Project’s paper points to DE-COP as a critical tool in establishing this likelihood.

    Developed in 2024 by INESC-ID researchers André Duarte and Arlindo Oliveira, together with colleagues from the University of California and Carnegie Mellon University, DE-COP tackles one of the most relevant and difficult questions in the field of AI ethics and transparency: how can we detect whether copyrighted content was used in a model’s training data when that data is not publicly disclosed?

    DE-COP works by probing large language models (LLMs) with multiple-choice questions whose options mix an exact quote from the suspected training content with paraphrased versions of the same passage. If a model consistently selects the verbatim excerpt over the paraphrased ones, it suggests prior exposure, which is the signal sought by what is known in the field as a “membership inference attack.”
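
    The probing idea described above can be illustrated with a minimal Python sketch. It is not the DE-COP implementation: the function name, the `choose` callback (a stand-in for a call to the model under test), the invented example passage, and the choice to try every ordering of the options are all assumptions made for illustration. The sketch only measures how often the verbatim option is picked, which can then be compared with the chance level.

```python
from itertools import permutations
from typing import Callable, Sequence


def verbatim_selection_rate(
    verbatim: str,
    paraphrases: Sequence[str],
    choose: Callable[[Sequence[str]], int],
) -> float:
    """Share of option orderings in which `choose` picks the verbatim passage.

    `choose` is a hypothetical stand-in for the model under test: it receives
    the options in display order and returns the index of its answer.
    """
    options = [verbatim, *paraphrases]
    orderings = list(permutations(range(len(options))))
    hits = 0
    for ordering in orderings:
        shown = [options[i] for i in ordering]
        picked = choose(shown)
        if ordering[picked] == 0:  # index 0 in `options` is the exact quote
            hits += 1
    return hits / len(orderings)


# Invented example passage and paraphrases (not taken from any real book).
QUOTE = "The harbour lights went out one by one."
PARAPHRASES = [
    "One after another, the lights of the harbour were extinguished.",
    "The lamps in the harbour gradually went dark.",
    "Each light in the harbour switched off in turn.",
]


def recalls_quote(shown: Sequence[str]) -> int:
    """Stub for a model that has memorised the passage."""
    return shown.index(QUOTE)


def always_first(shown: Sequence[str]) -> int:
    """Stub for a model that ignores content and always answers option 1."""
    return 0


print(verbatim_selection_rate(QUOTE, PARAPHRASES, recalls_quote))  # 1.0
print(verbatim_selection_rate(QUOTE, PARAPHRASES, always_first))   # 0.25
```

    With four options, a model with no prior exposure should land near the 25% chance level, as the position-biased stub does; a rate well above that is the kind of evidence the method looks for.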

    To validate the approach, the researchers behind DE-COP constructed BookTection, a benchmark dataset featuring excerpts and paraphrases from 165 books, both pre- and post-dating the training cutoffs of popular LLMs. The method outperformed previous techniques by a significant margin: a 9.6% improvement in detection performance on models with available logits, and 72% accuracy on fully black-box models, where prior methods hovered around 4%.

    The AI Disclosures Project applied DE-COP to a set of 34 books from O’Reilly Media, analysing nearly 14,000 paragraph excerpts. The study found that GPT-4o showed a much higher “recognition” of paywalled content from O’Reilly books compared to OpenAI’s previous model, GPT-3.5 Turbo, as noted in an article published by TechCrunch.

    “GPT-4o [likely] recognizes, and so has prior knowledge of, many non-public O’Reilly books published prior to its training cutoff date,” the authors noted. They also found that GPT-3.5 Turbo, in contrast, demonstrated greater recognition of publicly available O’Reilly materials — suggesting a significant shift in training data sources between model generations.

  • AMALIA, giving voice to Portuguese identity through Artificial Intelligence

    Few expressed pain and longing with the intensity of Amália Rodrigues, the iconic fado singer who became a symbol of Portuguese cultural identity. Her voice, her language, and her emotion are all part of a legacy that continues to shape Portugal’s artistic and emotional landscape. Drawing inspiration from that deep cultural well, AMALIA (Automatic Multimodal Language Assistant with Artificial Intelligence) is the name chosen for the first Portuguese Large Language Model (LLM) designed from scratch to reflect and preserve the richness of the Portuguese language and identity – with INESC-ID playing a crucial role, particularly in the area of speech processing.

    Derived from the Latin word for “fate,” fado conveys a broad spectrum of emotions, from heartbreak and nostalgia to joy and resilience. Similarly, AMALIA is being designed to understand, process, and generate content in European Portuguese, capturing nuances in both language and culture. “This tool will serve a wide range of applications across essential sectors such as education, media, science, cultural heritage, and public administration”, anticipates INESC-ID researcher and Professor at Técnico, Alberto Abad, from Human Language Technologies.

    A strategic national investment

    Supported under Portugal’s Recovery and Resilience Plan (PRR) and coordinated by the Foundation for Science and Technology (FCT), AMALIA is being developed by a national consortium of top academic and research institutions. This includes Universidade de Lisboa, via Instituto Superior Técnico, Universidade NOVA, the Universidade do Porto, Universidade de Coimbra, Universidade do Minho, and the national laboratories NOVA LINCS, IT, INESC TEC, CISUC/LASI, and ALGORITMI/LASI. Experts from the University of Beira Interior and the University of Évora are also contributing.

    Under the coordination of Alberto Abad, INESC-ID’s contribution focuses on multimodal language processing, particularly the integration of spoken language. This means AMALIA will not only be able to interpret text but also receive and process speech and images – giving it “ears” and “eyes,” with the “brain” generating accurate and contextually aware text responses.

    Unlike commercial AI models primarily optimized for global markets, AMALIA is trained from scratch using resources such as Arquivo.pt and is specifically tailored for European Portuguese. It will be open source and designed to operate in closed and secure environments, ensuring data protection and reinforcing national technological sovereignty.

    AMALIA will serve as a strategic asset for Portugal – not just as a language model, but as a digital guardian of linguistic and cultural heritage. In an age when companies tend to prioritize broader language variants like Brazilian Portuguese, AMALIA’s focus on the European variant is both a cultural imperative and a technical challenge.

    Filling a niche

    By September 2025, the consortium aims to release a public version of the model. A first internal version was successfully launched on March 31, 2025, already capable of engaging in contextual conversations and demonstrating knowledge of Portuguese culture and language.

    “AMALIA will not replace general-purpose models like ChatGPT”, Alberto Abad underlines. “Instead, it fills a vital niche: delivering specialized, context-sensitive responses in domains where language, culture, and data privacy matter.” Its potential spans education, public service, cultural preservation, and more.

    As Fernando Pessoa once said, “My homeland is the Portuguese language (A minha pátria é a língua portuguesa).” With AMALIA, that homeland now has a voice in the digital future, one that speaks, understands, and respects its unique identity.

  • INESC-ID Stands out in ACM CHI Conference

    Our institute made waves last week during the ACM (Association for Computing Machinery) CHI 2025 Conference on Human Factors in Computing Systems, with several contributions presented.

    Patrícia Piedade, a PhD student in Artificial Intelligence for People and Society (AIPS), and Rui Prada, an AIPS researcher, co-authored a paper that received an honorable mention, placing it in the top 5% of submissions. In addition, three other papers co-written by INESC-ID researchers were presented at the conference. It is worth noting that one of these articles was authored solely by women, including AIPS PhD student Regina Duarte and AIPS researchers Ana Paiva and Joana Campos, an inspiring example of the growing presence of women in computer science. Isabel Neto, a former INESC-ID researcher, authored two of the papers mentioned.

    ACM CHI is the world’s most important conference on human-computer interaction, where researchers and practitioners present cutting-edge work on how people interact with digital technologies. It is held annually and in 2025 it took place from 26 April to 1 May in Yokohama, Japan.

  • INESC-ID researchers receive ACM SIGSOFT Distinguished Paper Award at ICSE 2025

    João F. Ferreira, INESC-ID researcher and professor at the Department of Computer Science and Engineering (DEI) at Instituto Superior Técnico, and Nuno Saavedra, PhD student and INESC-ID researcher, have received the prestigious ACM SIGSOFT Distinguished Paper Award at the International Conference on Software Engineering (ICSE) 2025—the premier global event in software engineering.

    The award-winning paper, titled “Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification,” presents a novel approach to proof synthesis using machine learning and large language models (LLMs), enhanced by retrieval augmentation techniques. Rango introduces a dynamic, adaptive system that identifies and integrates relevant proofs and premises at each stage of the software verification process. This allows the tool to tailor its reasoning to both the specific project and the evolving state of the proof itself.

    The effectiveness of Rango was demonstrated using a newly curated dataset, CoqStoq, which contains more than 2,200 open-source Coq projects. The tool successfully synthesized proofs for 32% of theorems, 29% more theorems than previous state-of-the-art systems, an impressive leap forward that could help make formal software verification more accessible and practical for developers.

    This significant achievement is the result of a strong international collaboration that included researchers from the University of California San Diego and the University of Massachusetts, as well as Pedro Carrott, a former MSc student at Técnico who was supervised by João F. Ferreira and is presently at Imperial College London.

    The ACM SIGSOFT Distinguished Paper Award is reserved for papers of exceptional quality presented at ICSE, and this recognition reflects both the scientific impact and collaborative excellence behind the work.

  • INESC-ID contributes to Europe’s digital autonomy in High-Performance Computing and AI through the DARE Project

    INESC-ID is one of 38 partners involved in a major European effort to build a sovereign computing infrastructure through the new project DARE SGA1 – Digital Autonomy with RISC-V in Europe.

    Our contribution, in collaboration with INESC TEC, will be to the project’s software ecosystem: optimizing performance for RISC-V architectures, integrating HPC and AI applications, and enabling co-design approaches between hardware and software teams. RISC-V is an open-source, modular instruction set architecture that enables anyone to design custom processors without licensing fees, promoting technological independence and innovation.

    Backed by €240 million in funding from the EuroHPC Joint Undertaking, the project marks a strategic step toward reducing Europe’s dependence on non-EU hardware and software in the fields of High-Performance Computing (HPC) and Artificial Intelligence (AI).

    Coordinated by the Barcelona Supercomputing Center (BSC-CNS), DARE SGA1 will design and develop next-generation processors and a full software ecosystem based on RISC-V, an open standard instruction set architecture. The initiative’s goal is to create a fully European HPC technology stack to support scientific research, industrial innovation, and public-sector digital infrastructure.

    “INESC-ID’s long-standing expertise in computer architecture and HPC positions us well to support this ambitious European initiative,” said Leonel Sousa, INESC-ID researcher, responsible for the Portuguese participation in DARE, and professor at Instituto Superior Técnico.

    The first three years of DARE SGA1 will focus on building three RISC-V-based chiplets: a general-purpose processor (led by Codasip), a high-precision vector accelerator (led by Openchip), and an AI inference engine (led by Axelera AI). These components will form the backbone of Europe’s future supercomputing systems, offering greater energy efficiency and scalability than traditional monolithic chips.

    “There’s no AI without HPC”, notes Leonel Sousa. “At the core of the project lies Europe’s ambition to become self-reliant in semiconductor and chip design. It’s crucial to reduce our dependency on foreign chip supply”, underlines the researcher.

    DARE SGA1 is the first phase of a six-year roadmap to secure Europe’s digital autonomy in HPC and AI infrastructure. The project is expected to lay the groundwork for the EU’s first fully sovereign supercomputing system by the end of its initial phase.

  • INESC-ID researchers win awards at EuroSys 2025

    Two papers co-authored by INESC-ID researchers, professors, and students from the Department of Computer Science and Engineering (DEI) at Instituto Superior Técnico were distinguished at EuroSys 2025, a leading European conference in computer systems, held in Rotterdam from March 30 to April 3.

    The paper HawkSet: Automatic, Application-Agnostic, and Efficient Concurrent PM Bug Detection, by João Oliveira and João Gonçalves, both PhD students in the Doctoral Program in Computer Engineering (PDEIC), and Miguel Matos, INESC-ID researcher, DEI Professor and Big Era Chair team member, received the prestigious EuroSys Gilles Muller Best Artifact Award.

    The work presents HawkSet, an innovative tool for detecting concurrent bugs in Persistent Memory (PM) systems. PM enables the development of fast, persistent applications without relying on expensive HDD/SSD-based I/O operations. However, due to the volatile nature of caches and CPU memory reordering for performance optimization, developers must use low-level instructions to ensure data consistency in case of crashes—especially in concurrent environments, where new classes of bugs can emerge.

    HawkSet stands out for being automatic, application-agnostic, and highly efficient. It employs lockset analysis and automatic binary instrumentation to detect all bugs found by state-of-the-art tools, as well as seven previously unknown bugs. It achieves this without requiring application-specific knowledge, debugging artifacts, or guided executions. HawkSet also delivers significant performance improvements—up to 159x faster detection—and consistently uncovers hard-to-reach bugs that depend on rare interleavings.
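
    For readers unfamiliar with lockset analysis, the following Python sketch shows the classic idea in its simplest form. It is a generic, textbook-style illustration and not HawkSet’s implementation: it ignores persistent memory and binary instrumentation entirely, and the event format, function name, and example trace are assumptions made for illustration. A shared address is flagged when more than one thread accesses it and no single lock was held during every access.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical trace format: (thread, operation, target), where operation is
# "acquire" or "release" (target is a lock) or "access" (target is an address).
Event = Tuple[str, str, str]


def lockset_races(trace: Iterable[Event]) -> Set[str]:
    """Return addresses accessed by several threads with no common lock held."""
    held: Dict[str, Set[str]] = {}        # thread -> locks it currently holds
    candidates: Dict[str, Set[str]] = {}  # address -> locks held on every access
    accessors: Dict[str, Set[str]] = {}   # address -> threads that touched it
    racy: Set[str] = set()

    for thread, operation, target in trace:
        locks = held.setdefault(thread, set())
        if operation == "acquire":
            locks.add(target)
        elif operation == "release":
            locks.discard(target)
        elif operation == "access":
            if target in candidates:
                candidates[target] &= locks      # keep only locks held every time
            else:
                candidates[target] = set(locks)  # first access: copy current locks
            accessors.setdefault(target, set()).add(thread)
            if not candidates[target] and len(accessors[target]) > 1:
                racy.add(target)
    return racy


TRACE = [
    ("T1", "acquire", "L"),
    ("T1", "access", "x"),
    ("T1", "access", "y"),
    ("T1", "release", "L"),
    ("T2", "access", "x"),   # x touched without holding L
    ("T2", "acquire", "L"),
    ("T2", "access", "y"),   # y is always protected by L
    ("T2", "release", "L"),
]

print(lockset_races(TRACE))  # {'x'}
```

    Because the check depends only on which locks were held, not on the order in which the threads happened to run, this style of analysis can flag problems without having to observe the rare interleaving that would actually trigger them.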

    Luís Pedrosa (DEI/INESC-ID) was also honored, alongside his co-authors, with the EuroSys Test of Time Award for the paper “Large-scale cluster management at Google with Borg.”

    The paper describes Borg, Google’s cluster management system that runs hundreds of thousands of jobs from thousands of applications across clusters with tens of thousands of machines.

    Borg achieves high resource utilization through a combination of admission control, efficient task-packing, over-commitment, and process-level performance isolation. It supports high-availability applications with runtime features that reduce fault-recovery time and scheduling policies that lower the risk of correlated failures. For users, Borg offers a declarative job specification language, integration with name services, real-time monitoring, and tools for system analysis and simulation.

    EuroSys, the European Conference on Computer Systems, is one of the most prestigious conferences in the field of systems research—particularly relevant to the Distributed, Parallel and Secure Systems (DPSS) research area at INESC-ID.

  • Finding his corner of the beach: Ricardo Rei’s path to an award-winning Ph.D.

    When Ricardo Rei started his bachelor’s degree in Computer Science and Engineering, he already knew he wanted to work in AI. But it was the people who led the former INESC-ID researcher to pursue his master’s in dialogue systems and chatbots. Under the supervision of João Graça, Unbabel’s CTO at the time — a company Ricardo discovered during Semana Empresarial e Tecnológica — he came into close contact with automatic translation algorithms. It was then he realised how hard it was to evaluate the quality of chatbots or, as we may put it, to separate the wheat from the chaff.

    It soon became clear that this would be the topic of his Ph.D. thesis, titled “Robust, Interpretable and Efficient MT Evaluation with Fine-tuned Metrics.” A thesis he completed in less than the usual four years — and one that has earned strong recognition from both industry and academia through citations, conference presentations, and awards. The latest of these is the Anthony C. Clarke Best Thesis Award by the European Association for Machine Translation (EAMT) — the first time a scientist working in Portugal has received this distinction.

    “I am not surprised at all!” says INESC-ID researcher and proud supervisor Luísa Coheur — who co-supervised the work with Alon Lavie from Carnegie Mellon University. “It’s a very robust thesis, with many publications and no fragility,” she adds.

    The Cross-lingual Optimized Metric for Evaluation of Translation, or COMET, the most visible outcome of Ricardo’s work, has been widely adopted for evaluating translation engine outputs. It is now part of the portfolio of Unbabel, a company specialising in AI-driven translation with human assistance, where Ricardo has worked ever since completing his master’s. He currently holds the position of Senior Research Scientist.

    “Our models are public,” he notes, “but if they are being used for commercial purposes, they must be paid for.” COMET is used to determine whether a machine translation needs human review and correction. “It ended up being adopted as the main evaluation metric,” Ricardo explains.

    Luísa recalls the same enthusiasm and dedication in Ricardo when he was just a first-year student, crediting that strong connection and drive for his remarkable achievements. As for Ricardo, it’s no surprise he uses a beach metaphor to describe how he completed his Ph.D. so quickly and smoothly: “When I started my Ph.D., I already knew my corner of the beach.” After all, he’s a former surf champion.

    The award is named after a former member of EAMT, a person of “exceptional human qualities”, notes INESC-ID researcher and current President of EAMT, Helena Moniz. It will be presented at a ceremony in June.

  • Overcoming the Adamastor: INESC-ID PhD Student wins third edition of the award “Vencer o Adamastor”

    In an effort to reduce the time spent correcting coding errors, Pedro Orvalho, who recently concluded his PhD at INESC-ID and Instituto Superior Técnico and is now at the University of Oxford, has developed the artificial intelligence tool MENTOR, which has earned him the third edition of the award “Vencer o Adamastor” (“Overcoming the Adamastor”). On winning this recognition, the researcher shared that “it is an honour for me to receive this award, as it recognises the impact, both at a scientific and societal level, of the research work I developed during my PhD in collaboration with my supervisors, Vasco Manquinho here at INESC-ID, and Mikoláš Janota at CIIRC, at the Czech Technical University in Prague.”

    The MENTOR system helps to automatically identify errors in computer programs, offering instant, personalised feedback to students while encouraging them to solve the problem themselves, since the system does not provide solutions. This reduces the number of simple questions put to teachers, freeing their time for more complex or conceptual student issues and improving pedagogical support.

    Tests have been carried out in Computer Engineering courses at Instituto Superior Técnico, with positive feedback. However, its use will not be exclusive to university level. According to Pedro, “looking to the future, with the increasing digitalisation of society, programming will soon become a common subject at all levels of education, from basic to university. The MENTOR system thus appears as a learning tool that can help in the construction of this path, where each student can learn to program more autonomously.”

    The award ceremony will take place tomorrow, April 11, at 17h00, at Técnico Innovation Center, attended by Rogério Colaço, president of Instituto Superior Técnico, Arlindo Oliveira, president of INESC, Luís Ferreira, Rector of the University of Lisbon, and Fernando Alexandre, Minister of Education, Science and Innovation.

    The Prize “Vencer o Adamastor” (“Overcoming the Adamastor”), established by INESC and the newspaper “Público”, aims to reward “innovative works by young scientists, developed in Portugal, in the fields of electrical engineering, computing and the like, which reveal not only scientific excellence, but also potential for developments that benefit society”.

  • Faster, Higher, Stronger: INESC-ID Joins Project ACHILLES to Redefine AI

    In Greek legend, Achilles was a hero with a single vulnerability—his heel. Similarly, the “Achilles’ heels” of modern AI systems are trust and efficiency. The recently launched ACHILLES Horizon Europe Project (“Human-Centred Machine Learning: Lighter, Clearer, Safer”) aims to address these critical weaknesses.

    Bringing together 16 organizations from 10 countries, ACHILLES has significant Portuguese involvement. Fraunhofer Portugal Research (FhAICOS) leads the project as coordinating partner, while INESC-ID plays a key role, with six researchers leading the Work Package (WP) on AI sustainability and contributing to the WP on Privacy-Preserving Machine Learning and Model Monitoring. Paolo Romano, from Distributed Parallel and Secure Systems, coordinates INESC-ID’s participation, which has a budget of nearly one million euros.

    The ACHILLES team seeks to drive responsible AI innovation in line with European values and regulations. Moving away from the traditional “Faster, Higher, Stronger” approach—borrowed from the Olympics, another iconic Greek reference—ACHILLES champions a new framework: “Lighter, Clearer, Safer,” reflecting the evolving demands of modern AI.

    “At the heart of ACHILLES is an iterative development cycle inspired by clinical trials,” explained André Carreiro, Senior Scientist and ACHILLES Project Coordinator.

    A standout innovation within the project is the ACHILLES Integrated Development Environment (IDE), a machine-learning-driven platform empowering developers to build AI solutions that are not only more effective and efficient but also responsible and ethically compliant. The project will validate its approaches through real-world applications in healthcare, identity verification, content creation, and pharmaceuticals, demonstrating its transformative potential across diverse sectors.

    Funded with over €8 million under the Horizon Europe Framework Programme, within the Digital, Industry and Space cluster, ACHILLES is set to redefine the way we approach AI, ensuring it aligns with the values and expectations of modern society.