This article was automatically translated from Polish to English using Claude Opus 4.7 and may contain translation errors.

Poles in AI Safety

Jakub Kryś | February 22, 2025 | 15 min

Although Silicon Valley and London dominate the AI Safety landscape, the global balance of power is becoming increasingly diverse, with new hubs emerging rapidly in many corners of the world. Poland is no exception; in fact, it can boast a remarkable number of experts in this field. In this article, we introduce Poles, and people connected with Poland, who are actively working to reduce the risks posed by advanced artificial intelligence.

Don't see yourself on this list? Join our community and let us know! ;)

Jan Betley & Anna Sztyber-Betley

Jan and Anna are a married couple and among the most recognised researchers in AI Safety thanks to their work on surprising behaviours of LLMs. They focus on phenomena related to out-of-context reasoning, as well as on how LLMs pick up hidden information from their data during training or fine-tuning. Two of their publications, on Emergent Misalignment and Subliminal Learning, were recently published in Nature, one of the most prestigious scientific journals in the world.

Anna holds a PhD in Automation and Robotics and is an assistant professor at the Faculty of Mechatronics at Warsaw University of Technology. Jan worked as a software developer for over a decade before pivoting to AI Safety through programmes such as ARENA and Astra. Together they collaborate with Owain Evans' TruthfulAI group.

Tomek Korbak

Tomek works at OpenAI, where he focuses on monitoring LLMs for misalignment and on the control of AI systems. He has authored over 50 scientific papers on AI and is one of the leading experts on using LLMs' chain of thought to detect signs of misalignment (chain-of-thought monitoring). Previously he worked at the UK AI Security Institute in London and at Anthropic, studying topics such as training-data filtering and sycophancy. Tomek studied cognitive science, philosophy and physics at the University of Warsaw, then completed his PhD at the University of Sussex in the UK on Reinforcement Learning from Human Feedback (RLHF).

Krzysztof Bar

Krzysztof is the CEO of the London Initiative for Safe AI (LISA), an AI Safety hub in London that hosts over 100 people daily and has over 1,000 members, making it the largest such centre in Europe and second worldwide only to Berkeley, California. LISA is also home to organisations such as Apollo Research and to training programmes like ARENA, LASR and Pivotal.

Krzysztof's scientific journey, however, began with… quantum computers! After studying mathematics and computer science at the University of Oxford, he earned his PhD there, specialising in theoretical aspects of quantum computing and category theory. During the same period he also chaired the Federation of Polish Student Societies in the UK.

After his PhD, Krzysztof moved to the private sector, spending nearly a decade as a consultant at Oliver Wyman. There he focused on public-policy and technology advisory work, including the rollout of the UK Online Safety Act. He pivoted his career in early 2025, completing BlueDot Impact's AGI Strategy course and then, as part of the GovAI Fellowship, working with Toby Ord on the consequences of inference scaling in reasoning models.

Jakub Growiec

Jakub is a professor and head of the Department of Quantitative Economics at the Warsaw School of Economics. His research interests include the economic transformations driven by advanced AI, as well as the theory of long-run economic growth and technological change. On exactly this subject he wrote the book "Accelerating Economic Growth", published by the prestigious publisher Springer.

Sebastian Cygert

Sebastian heads the AI Safety and Transparency Department at the Polish NASK institute, where he conducts research on explainability and interpretability of AI models. His work also includes auditing AI systems and developing specialised software to support their analysis.

Sebastian is the author of over 50 scientific papers in the field of AI. He completed his PhD at Gdańsk University of Technology, specialising in computer vision.

Kamil Deja

Kamil leads the Generative AI team at the Warsaw-based IDEAS Research Institute and is an assistant professor at Warsaw University of Technology. He focuses on understanding the mechanisms inside neural networks through interpretability. His research covers not only LLMs but also diffusion models and computer vision. He is the author of over 40 scientific papers on AI.

Kamil completed his studies and PhD in computer science at Warsaw University of Technology, and then collaborated with institutions such as La Sapienza in Rome and CERN in Geneva.

Bartosz Cywiński

Bartosz is a PhD student in machine learning at Warsaw University of Technology and a close collaborator of Kamil Deja. Through the MATS programme he also collaborates with Neel Nanda and Arthur Conmy from Google DeepMind. His research focuses on mechanistic interpretability and on methods for eliciting latent knowledge from LLMs that they do not reveal directly.

Patryk Wielopolski

Patryk completed his PhD at Wrocław University of Science and Technology, specialising in probabilistic uncertainty modelling with normalising flows. He then spent six years at DataWalk, where he managed a team of engineers while gaining business and product-management experience.

That experience soon proved very valuable. Patryk took a career break of several months, during which he began devoting significant time to AI Safety. After completing BlueDot Impact courses, the ML4Good bootcamp and a few personal projects, he joined MATS as a Research Manager. MATS is a prestigious programme that, every six months, enables over 100 talented researchers to pursue AI Safety research under expert guidance. Day to day, Patryk not only safeguards the scientific quality of their work but also manages the operational and structural growth of the organisation.

Mateusz Dziemian

Mateusz works at Gray Swan, a company specialising in AI Security and red-teaming services. Besides developing the company's engineering infrastructure, he analyses attacks on LLMs. The organisation collects data from tens of thousands of users through public challenges on its dedicated Gray Swan Arena platform. This lets the research team analyse real-world cases and tap into the community's creativity, covering a much broader threat landscape than traditional AI evaluation methods.

Mateusz also took part in the SPAR programme, where he studied the phenomenon of collusion between AI agents supervised by other language models.

Julia Bazińska

Julia works at Lakera, where she designs protection mechanisms for companies deploying AI-agent-based solutions. She is a co-author of the popular Gandalf platform, which, like Gray Swan Arena, lets users test language models through simulated attacks, providing data for increasingly effective safeguards.

She began her academic path with a Bachelor's degree in Computer Science at the University of Warsaw, where she chaired the Machine Learning Student Society, and earned her Master's at ETH Zurich. Her experience includes research internships at IBM, Google and Google DeepMind.

Filip Sondej

Filip's research focuses on unlearning: removing from LLMs unwanted information that could be used, for example, to carry out biological attacks. In the past he also worked on cooperation and conflict prevention between AI agents, as part of MATS and in collaboration with the London-based Center on Long-Term Risk.

Filip completed a degree in computer science at AGH in Kraków, followed by cognitive science at Jagiellonian University.

Konrad Kozaczek

Konrad works in a rather niche (but fascinating!) area of AI Safety – studying the consequences of the potential future existence of so-called "digital minds", i.e. conscious entities based on AI. His work analyses the philosophical and moral aspects of their existence and considers how current legal and political systems could be extended to include their moral status. Konrad is also interested in s-risks – risks related to extreme suffering.

Through the Future Impact Group programme, Konrad collaborates with Professor Jeff Sebo at New York University and Dr Bradford Saad at the University of Oxford. He also mentors in the Impact First programme. Konrad completed his Master's in Philosophy and AI at Northeastern University London, with a thesis on replicating digital minds.

Michał Kubiak

Michał is a researcher focused on AI governance – specialising in European regulation and AI risk management. His tech-policy experience includes AI-policy specialist roles at the European DIGITAL SME Alliance and at Observatorio de Riesgos Catastróficos Globales. He is also a co-creator of the AI Risk Explorer platform, which continuously monitors risks and incidents related to advanced AI. His research analyses the role of "middle powers" – exploring what actions countries outside the US–China duopoly can take to maintain technological competitiveness and genuinely shape global AI safety standards.

Michał is also active in education: he facilitates groups for AI Safety courses run by BlueDot Impact, ML4Good and Electric Sheep.

Bartosz Kubiak

Bartosz specialises in AI policy at the European level – focusing on implementation of the EU AI Act, regulation of AI infrastructure (data centres, sovereign compute, the so-called Gigafactories) and governance of advanced AI systems. He took part in the Cambridge-based ERA programme, where he worked on legal-liability frameworks for autonomous AI agents. He teaches the AGI Strategy, Frontier AI Governance and Technical AI Safety courses at BlueDot Impact.

Previously, Bartosz was AI Policy Officer at the Brussels-based Centre for Future Generations, where he worked on the €25-billion public–private AI Gigafactories partnership and advised the European Commission on sovereign compute and AI Act implementation. He also advised Fortune 100 clients on EU regulation (DSA, DMA, AI Act, GDPR).

Bartosz earned a Master of Public Policy from the Blavatnik School of Government at the University of Oxford.

Maciej Chrabąszcz

Maciej is a researcher at the NASK institute and a PhD student at Warsaw University of Technology. His research focuses on using models' internal representations to detect and prevent dangerous behaviours. He is the author of over 10 scientific papers in AI.

Maciej studied mathematics and data analysis at Warsaw University of Technology, where he also chaired the Golem Artificial Intelligence Student Society.

Jan Dubiński

Like Maciej, Jan works on AI Safety at NASK and is a PhD student at Warsaw University of Technology, where he previously completed a degree in computer science. He is interested in adversarial attacks on AI systems, output watermarking, membership-inference attacks and model-weight-stealing attacks. He currently collaborates with Owain Evans' TruthfulAI group through the Astra Fellowship and is also a member of the ATLAS Collaboration at CERN.

Mateusz Piotrowski

Mateusz is interested in mechanistic interpretability – understanding how neural networks work at the level of neurons and their connections. He is a co-author of an open-source library for generating attribution graphs based on the circuit-tracing method. These tools let researchers visualise decision processes in LLMs and follow how information flows through the neural network. This line of research was initiated by Anthropic – one of the sector's leaders – with whom Mateusz worked as part of the Anthropic AI Safety Fellowship.

Mateusz has also worked on computational mechanics in the context of information theory and stochastic processes. This relatively niche area of technical AI Safety studies how transformers learn certain geometric structures in their internal representations. In his work Mateusz showed that intermediate representations in transformers have fractal geometry, whose structure can be predicted by analysing the data-generating process on which the transformer was trained.

Reworr

Reworr works at the non-profit research organisation Palisade Research, analysing the dangerous capabilities of AI agents in the cybersecurity domain. He has extensive experience in AI and IT Security, acquired through, among other things, red-teaming, penetration testing and web-application security audits.

Zuzanna Matuszewska

Zuzanna works as a researcher at Measuring AI Progress, a non-profit organisation designing evaluations of LLMs' agentic capabilities relevant to biorisks. She took on this role immediately after completing the first edition of the ERA AIxBio Fellowship in early 2026. Previously, she earned her medical degree at the Medical University of Warsaw while concurrently studying mathematics at the University of Warsaw. She also volunteers as a researcher at the Alliance to Feed the Earth in Disasters (ALLFED) and is interested in the welfare of animals and of AI models.

Taras Kutsyk

Taras is a PhD student working on mechanistic interpretability at GMUM (Machine Learning Group) at Jagiellonian University in Kraków. He was previously a MATS fellow in Neel Nanda's group at Google DeepMind, took part in the AI Safety Camp and completed the AI Safety Fundamentals course at BlueDot Impact. His research focuses on applying interpretability techniques to AI-safety problems, including persona generalisation in large language models. Taras also collaborates with Jan Betley on studying self-awareness in LLMs.

Taras completed applied mathematics and computer science at Lviv Polytechnic and then engineering mathematics at the University of L'Aquila in Italy.

Jakub Kryś

To close, a profile of this article's author. Jakub completed a PhD in theoretical physics at Durham University in the UK, then pivoted to AI Safety through courses such as BlueDot Impact's Technical AI Safety. He has worked on adversarial attacks and jailbreaks, focusing on Vision Language Models (VLMs). He also has research experience in compute governance (using computing power as a lever for AI Safety) and in developing technical solutions supporting the AI Verification research programme. He currently works at the non-profit SaferAI, where he models cyberattacks carried out with LLMs.

Jakub is interested in s-risks, cooperation and conflict in multi-agent AI systems and the use of state-of-the-art LLMs in mathematical and physical research. He is also a mentor in the SPAR programme, where he supervises research projects on risk modelling (in the cyber and "Loss of Control" domains) and on LLM forecasting.

If you have made it all the way to the end of this article, you are probably someone who would enjoy joining our community :) Within AI Safety Polska we organise regular webinars, a reading group and in-person meetups, and we offer space for in-depth discussions on many aspects of AI Safety.

Follow our events on our Luma calendar and join our Slack!


Have a question?

contact@aisafety.org.pl
