Commit
1 parent
07d7e62
commit 760f074
Showing
1 changed file
with
22 additions
and
0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
--- | ||
title: "Turing award for RL, Solving SICP" | ||
date: 2025-03-05 | ||
tags: links computer-science reinforcement-learning textbook | ||
--- | ||
|
||
## Reinforcement learning pioneers receive Turing award | ||
|
||
Andrew Barto and Richard Sutton[^1] [receive the 2024 Turing award](https://awards.acm.org/about/2024-turing) for their groundbreaking work on [reinforcement learning](https://en.wikipedia.org/wiki/Reinforcement_learning). | ||
Even with today's focus on [LLM](https://en.wikipedia.org/wiki/Large_language_model)s and 'generative AI', reinforcement learning plays an important role, e.g. as the 'RL' part of [RLHF](https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback). | ||
I still remember how much I enjoyed their classic textbook on the subject (_Reinforcement Learning: An Introduction_) as a PhD student; it's [freely available](http://incompleteideas.net/book/the-book-2nd.html)[^2] and one of my all-time favorite computer science books. | ||
If you would rather _watch_ something on the topic, the [AlphaGo documentary](https://www.youtube.com/watch?v=WXuK6gekU1Y) is a nice showcase of how these ideas (combined with neural networks and _lots_ of compute power) can be successfully applied to problems like [Go](https://en.wikipedia.org/wiki/Go_(game)), a wonderful board game that was previously thought 'unsolvable'. | ||
|
||
## SICP "100% speedrun" | ||
|
||
On a related note concerning famous computer science textbooks, here is [a nice breakdown from 2021](https://lockywolf.wordpress.com/2021/02/08/solving-sicp/) (via [HN](https://news.ycombinator.com/item?id=43257963)) of the time it takes to _fully_ work through [_Structure and Interpretation of Computer Programs_](https://en.wikipedia.org/wiki/Structure_and_Interpretation_of_Computer_Programs) (by Abelson, Sussman, and Sussman), colloquially known as SICP: roughly 729 hours, or a bit over 18 weeks of full-time work at 40 hours per week! 🤯 | ||
|
||
--- | ||
|
||
[^1]: Sutton is also of ["The Bitter Lesson"](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) fame. | ||
|
||
[^2]: This is the 'new' second edition that's already seven years old (the first edition is from 1998). Wow, time flies. |