Advanced Fellowship
Evaluate. Steer. Control. Align future models.
Interrogate the leading techniques to keep machine intelligence safe.
Mitigating risks from advanced AI is possibly the most pressing problem of our time.
The 10-week Advanced Fellowship explores the most promising technical approaches to understanding highly capable models, which we consider broadly useful for addressing many different problems, such as misalignment or misuse.
We focus on two important topics:
Evaluation: How can we test what capabilities AI models have and how they will act in certain situations?
Interpretability: Can we learn more about AI models by looking at their internal workings?
In the first session, we'll have a broad recap and a discussion about possible sources of risk.
Then we'll have four sessions on each of the two technical topics:
General Introduction
Paper discussion
Hands-on experiments
Q&A with a researcher
Among other things, we'll discuss how we know that refusal is mediated by a single direction and what can be said about how the length of tasks AI can do changes over time.
Finally, we’ll have a session on career progression, with these problems in mind.
If this sounds interesting to you, don’t hesitate to apply!
-
April 30th: Application Deadline
May 3rd: Application Decisions
May 8th: Fellowship kick-off
-
Basic understanding of deep learning and Python
Interest in technical work
Applications from non-CS backgrounds and underrepresented groups such as women are highly encouraged!
-
We expect fellows to commit around 2 hours per week by attending our in-person session each Friday from 10am to 12pm.
-
If you have any questions regarding the fellowship, feel free to contact us at: