
Hi, I am Lukas

Lukas Schäfer

PhD Student at University of Edinburgh

I am a 24-year-old Data Science and Artificial Intelligence PhD student from Germany working on Multi-Agent Reinforcement Learning at the Autonomous Agents Research Group.

My research focuses on the challenges of generalisation and sample efficiency: how can multiple agents learn effective behaviour from less data, and how can they acquire robust, reusable skills that transfer to new environments?

Deep Learning


Aug 29, 2021

🤖 Just published a big redesign of my webpage based on GoHugo!

Jul 30, 2021

📃 Our work, Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks, has been accepted at the Datasets and Benchmarks track of the Neural Information Processing Systems Conference (NeurIPS) 2021!

Jul 20, 2021

📃 Our work, Decoupling Exploration and Exploitation in Reinforcement Learning, has been accepted at the Unsupervised RL (URL) workshop in the International Conference on Machine Learning (ICML) 2021!

Mar 19, 2021

📝 I wrote a blog post providing an overview of a range of multi-agent learning environments.

Sep 27, 2020

📃 Our work, Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning, has been accepted at the Neural Information Processing Systems Conference (NeurIPS) 2020!

Jun 25, 2020

📝 I wrote a blog post about the UK Multi-Agent Systems Symposium at the Alan Turing Institute in London.


Decoupling Exploration and Exploitation in Reinforcement Learning
Lukas Schäfer, Filippos Christianos, Josiah Hanna, Stefano V. Albrecht (2021)
Unsupervised Reinforcement Learning (URL) Workshop at the International Conference on Machine Learning (ICML), 2021

Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks
Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht (2021)
Conference on Neural Information Processing Systems (NeurIPS), 2021 - Datasets and Benchmarks track

Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht (2020)
Conference on Neural Information Processing Systems (NeurIPS), 2020


Research Intern

Nov 2020 - Mar 2021, Remote

Dematic is a global player focused on the design and implementation of automated system solutions for warehouses, distribution centres and production facilities.

  • Applying state-of-the-art AI technology to build a prototype for the automation of large-scale robotic warehouse logistics.


Sep 2018 - Aug 2020, Edinburgh

HYPED is a team of students at the University of Edinburgh dedicated to developing the Hyperloop concept and inspiring future generations about engineering. HYPED has received awards from SpaceX, Virgin Hyperloop One and the Institution of Civil Engineers.

Navigation Advisor

Sep 2019 - Aug 2020

  • Advising the navigation team on the adaptation and implementation of improved sensor and filtering techniques

Navigation Engineer

Sep 2018 - Aug 2019

  • Developing the navigation system of the “The Flying Podsman” Hyperloop prototype, using sensor filtering, processing and control techniques to estimate the location, orientation and speed of the pod
  • Finalist in the SpaceX Hyperloop competition in California in summer 2019


University of Edinburgh
Ph.D. in Data Science and Artificial Intelligence
Project: Sample Efficiency and Generalisation in Multi-Agent Reinforcement Learning
Supervisors: Stefano V. Albrecht (primary) and Amos Storkey (secondary)
Funding: Principal's Career Development Scholarship from the University of Edinburgh
Key Areas: Reinforcement Learning, Multi-Agent Systems, Generalisation, Exploration, Intrinsic Rewards
University of Edinburgh
M.Sc. in Informatics
CGPA: 77.28%
Funding: DAAD (German Academic Exchange Service) graduate scholarship & Stevenson Exchange Scholarship
Taken Courses
Course Name | Credits (Mark)
Reinforcement Learning | 10 (82%)
Algorithmic Game Theory and its Applications | 10 (98%)
Machine Learning and Pattern Recognition | 20 (64%)
Probabilistic Modelling and Reasoning | 20 (75%)
Decision Making in Robots and Autonomous Agents | 10 (86%)
Robotics: Science and Systems | 20 (87%)
Natural Computing | 10 (84%)
Informatics Project Proposal | 10 (73%)
Informatics Research Review | 10 (72%)
Extracurricular Activities
  • Active position as navigation engineer for HYPED.
  • Participation in GEAS roleplaying society.
  • Participation in EUKC - Edinburgh University Kendo Club.
Saarland University
B.Sc. in Informatics
GPA: 3.7
Taken Courses
Course Name | Credits (Grade)
Automated Planning | 9 (4.0)
Admissible Search Enhancements | 7 (4.0)
Information Retrieval and Data Mining | 9 (3.3)
Neural Networks: Implementation and Application | 6 (2.0)
Artificial Intelligence | 9 (3.3)
Software Engineering | 9 (3.7)
Modern Imperative Programming Languages | 5 (3.7)
Concurrent Programming | 6 (2.3)
Fundamentals of Data Structures and Algorithms | 6 (3.3)
Information Systems | 6 (3.7)
Introduction to Theoretical Computer Science | 9 (4.0)
System Architecture | 9 (4.0)
Mathematics for Computer Scientists I | 9 (4.0)
Mathematics for Computer Scientists II | 9 (2.7)
Mathematics for Computer Scientists III | 9 (3.3)
Programming I | 9 (4.0)
Programming II | 9 (4.0)
Japanese Foundations - Shokyu I | 6 (3.7)
Japanese Foundations - Shokyu II | 6 (3.0)
Japanese Applied Geography | 5 (4.0)
Japanese History II | 5 (4.0)
Extracurricular Activities
  • Japanese language and cultural studies as minor subject.
Higher Secondary School Certificate
GPA: 4.0
  • School year's best student award
  • Computer Science award of Saarland University
  • Mathematics award of Saarland University
  • History award of Historic Society for the Saar-Region


Curiosity in Multi-Agent Reinforcement Learning
MSc Thesis May 2019 - Aug 2019

In my MSc thesis project I researched the application of curiosity-inspired intrinsic exploration bonuses for multi-agent reinforcement learning. Count- and prediction-based curiosities were evaluated in combination with value-based and policy-gradient MARL methods, all implemented in PyTorch.

Domain-Dependent Policy Learning in Classical Planning
BSc Thesis May 2018 - Aug 2018

In my BSc thesis project I modified and implemented the neural network architecture of Action Schema Networks for application in classical, deterministic planning and extensively evaluated the network’s suitability for this type of automated planning.

Reinforcement Learning For Football Playing
Individual Project January 2019 - March 2019

Software project as part of the Reinforcement Learning lecture, developing several classical and deep RL algorithms. Implemented algorithms include the dynamic programming methods value iteration and policy iteration, and the tabular RL methods SARSA, Q-learning and Monte Carlo control. Finally, asynchronous DQN and tabular multi-agent RL algorithms were implemented and evaluated in the Half Field Offense (HFO) 2D football environment.

Autonomous Robot Localisation
Group Project Sep 2018 - Dec 2018

Group project as part of the Robotics: Science and Systems lecture, covering the design and construction of a four-wheel differential-steering mobile robot and the development of an autonomous localisation system based on particle filtering using sonar and IR sensors. The robot was built from a LEGO framework, a Raspberry Pi computer, sensors and actuators. Its task was to navigate through a pre-defined environment without touching obstacles, act on the detection of variable points of interest using light sensors, and then return to its deployment location.

Automated Planning System
Group Project Oct 2017 - Feb 2018

Group project as part of the Automated Planning lecture, implementing several heuristics, search algorithms and pruning techniques in the Fast-Downward planning system.

Plagiarism Detection Tool
Group Project Mar 2017 - Jun 2017

Group project as part of the Software Engineering lecture, going through all stages of software engineering: from requirements gathering, through planning and designing the architecture, to implementing and thoroughly testing our prototype.

Galaxy-Based Search Algorithm
Group Project Sep 2018 - Dec 2018

Group project as part of the Natural Computing lecture, developing and critiquing the Galaxy-based Search Algorithm (GbSA) in comparison to Particle Swarm Optimisation (PSO) for PCA approximation.

Group Project Mar 2017 - Jun 2017

Group project as part of the Modern Imperative Programming Languages lecture, implementing a Conflict-Driven Clause Learning (CDCL) SAT solver in Rust.

Turn-Based Game
Group Project Sep 2016 - Oct 2016

In this extensive summer group project, we implemented a complete fictional turn-based strategy game in which various computer- and player-controlled characters can move on a hexagonal map and attack other characters. We implemented the entire game logic, the game-server connection and a GUI for players, going through all stages of software engineering: designing the architecture, creating multiple prototype visualisations and diagrams, and implementing and testing our software.


For most projects, I am unable to provide access to code repositories. However, I would gladly discuss more details regarding the projects wherever possible. If you would like further information or have any general questions, please do not hesitate to get in touch!


Conference on Neural Information Processing Systems (NeurIPS), 2021
Conference on Neural Information Processing Systems (NeurIPS), 2020

Teaching Experience

Teaching Assistant

Oct 2019 - Present, School of Informatics, University of Edinburgh

Teaching assistant, demonstrator and marker for the Reinforcement Learning lecture at the University of Edinburgh under Dr. Stefano V. Albrecht

  • Holding lectures on implementation of RL systems and Deep RL
  • Designing the RL project covering a wide range of topics including dynamic programming, single- and multi-agent RL as well as deep RL
  • Marking the project and exam for the reinforcement learning course
  • Advising students on various challenges regarding lecture material and content

Lecturer and Coach

Sep 2017 - Oct 2017, Mathematics Preparation Course, Saarland University

Voluntary lecturer and coach for the mathematics preparation course preparing upcoming computer science undergraduate students for their studies

  • Assisted in the organisation of the mathematics preparation course for upcoming computer science students, which aims to introduce them to foundational mathematical concepts, the university and student life as a whole
  • Introduced ∼250 participants to the importance of mathematics for computer science, formal languages and predicate logic in daily lectures of the first week
  • Supervised two groups to provide feedback and further assistance in daily coaching-sessions
  • The course received the BESTE-award for special student commitment 2017 at Saarland University

Teaching Assistant

Oct 2016 - Mar 2017, Dependable Systems and Software Chair, Saarland University

Tutor for the Programming 1 lecture about functional programming at the Dependable Systems and Software Group chair of Saarland University under Prof. Dr. Holger Hermanns

  • Taught first-year students fundamental concepts of functional programming, basic complexity theory and inductive correctness proofs in weekly tutorials and office hours
  • Marked weekly tests as well as midterm and end-of-term exams
  • Collectively created learning materials and discussed student progress as part of the whole teaching team

