
Context windows

We’ve already done a lot to lengthen context windows, and yet we still have to do caching because of the economics of attention. We may spend more time on this branch of the tech tree because it is baked into the inference stack, but when we let go, we will be free.
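To put rough numbers on the "economics of attention", here is a minimal sketch. It assumes the caching in question is key-value (KV) caching during autoregressive decoding, and it counts only query-key dot products for a single head in a single layer with causal masking. The function names are hypothetical, made up for illustration.

```python
def attention_pairs_without_cache(n_tokens: int) -> int:
    """Query-key dot products if each decoding step recomputes attention
    for the whole prefix (assumption: causal masking, one head, one layer).
    At step t, each of the t positions attends to itself and earlier ones,
    which is t * (t + 1) / 2 pairs."""
    return sum(t * (t + 1) // 2 for t in range(1, n_tokens + 1))


def attention_pairs_with_cache(n_tokens: int) -> int:
    """Query-key dot products if keys and values are cached, so each step
    only scores the newest query against the t keys seen so far."""
    return sum(t for t in range(1, n_tokens + 1))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        without = attention_pairs_without_cache(n)
        cached = attention_pairs_with_cache(n)
        print(f"n={n:>7}: without cache={without:.3e}, "
              f"with cache={cached:.3e}, ratio={without / cached:.0f}x")
```

Under these assumptions the uncached cost grows cubically in sequence length and the cached cost quadratically, so the gap widens as windows get longer. That is one way to read why longer context windows have not removed the need for caching.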
