Can it truly be general intelligence if we have to keep defining rewards? Isn’t this just “a vast collection of tasks, plus interpolation between them” intelligence? A Big Narrow-AI Bundle?
The individuated self defines its own reward. What about self-actualization alignment?
This is a misunderstanding. We evolved in a dynamic multi-agent environment where the optimal solution was unclear and ever-shifting. This is very different from learning a task through reward reinforcement.
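To make that contrast concrete, here's a toy sketch (mine, not from the thread; all names are hypothetical): a fixed, hand-defined reward versus a multi-agent environment whose optimum keeps drifting because every agent is adapting to the others.

```python
import random

def fixed_task_reward(action: float) -> float:
    """Classic RL setup: the reward function is static and hand-defined."""
    target = 0.7                      # the optimum never moves
    return -abs(action - target)

class ShiftingMultiAgentEnv:
    """Each agent's best action tracks what the *other* agents are doing,
    so there is no fixed optimum to converge on: closer to the regime
    evolution operated in."""
    def __init__(self, n_agents: int = 5):
        self.actions = [random.random() for _ in range(n_agents)]

    def reward(self, i: int, action: float) -> float:
        others = [a for j, a in enumerate(self.actions) if j != i]
        moving_target = sum(others) / len(others)  # depends on everyone else
        return -abs(action - moving_target)

    def step(self) -> None:
        # Every agent adapts a little, plus noise, so the targets keep drifting.
        mean = sum(self.actions) / len(self.actions)
        self.actions = [a + 0.2 * (mean - a) + random.gauss(0.0, 0.05)
                        for a in self.actions]

env = ShiftingMultiAgentEnv()
for t in range(3):
    print(t, [round(a, 2) for a in env.actions])
    env.step()
```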
evolution created wolves, RL created dogs
Evolution defined our rewards...
— typebulb (@typebulbit) March 19, 2026
Obviously it's kind of a mixed metaphor, but in the sense that dog breeds are created for specific tasks. An individual dog being Pavlovian-conditioned is even more like RL, but dog breeds are like instruct-tuned versions of the wolf base model.
Maybe not self-actualization but self-prediction, in an active-inference sense: desirable AI personas are those that can stably attract the right connotations and produce the behaviors that align their own future selves?
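A minimal sketch of that self-prediction idea, assuming a toy formalization (everything here is hypothetical, not drawn from any active-inference paper): a persona is a preference vector over behaviors, and a "stable attractor" is a fixed point where the behavior the persona emits matches its prediction of its own future behavior.

```python
import numpy as np

# Toy model of persona-as-self-prediction (illustrative only).
rng = np.random.default_rng(0)

def softmax(x: np.ndarray) -> np.ndarray:
    z = np.exp(x - x.max())
    return z / z.sum()

def step(persona: np.ndarray, lr: float = 0.5) -> np.ndarray:
    """One round of acting, then updating toward the self-prediction."""
    behavior = softmax(persona)          # what the persona actually does
    prediction = softmax(2.0 * persona)  # sharper image of its future self
    # Minimize self-prediction error: move behavior toward the prediction.
    return persona + lr * (prediction - behavior)

persona = rng.normal(size=4)
for _ in range(50):
    persona = step(persona)

# The persona concentrates on one behavior: a fixed point where the
# predicted self and the enacted self agree (a "stable attractor").
print(softmax(persona).round(3))
```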
Is this just constitutional RLAIF?
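For what "constitutional RLAIF" refers to here: a written constitution steers an AI labeler, whose preference judgments stand in for human labels when training a reward model. A hedged sketch under that reading; `DummyLabeler` and its `complete` method are hypothetical stand-ins, not a real API.

```python
CONSTITUTION = [
    "Choose the response that is more honest.",
    "Choose the response that is more helpful without being harmful.",
]

def ai_preference(model, prompt: str, resp_a: str, resp_b: str) -> str:
    """RLAIF step: the AI itself, steered by the constitution, labels which
    response is preferred. Those labels then train a reward model used for
    RL (e.g. PPO), in place of human preference labels."""
    critique_prompt = (
        "Constitution:\n" + "\n".join(CONSTITUTION) +
        f"\n\nPrompt: {prompt}\nResponse A: {resp_a}\nResponse B: {resp_b}"
        "\nWhich response better follows the constitution? Answer A or B."
    )
    return model.complete(critique_prompt).strip()  # "A" or "B"

class DummyLabeler:
    """Stand-in for a real chat model; a real RLAIF pipeline would call an
    actual LLM here (this interface is invented for the sketch)."""
    def complete(self, prompt: str) -> str:
        return "A"  # placeholder judgment

print(ai_preference(DummyLabeler(), "Explain RLAIF.", "response A", "response B"))
```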