On Reinforcement Learning and Truth Seeking

Ryan Khurana

Oct 28, 2024

Or why LLMs are sycophants

3 Comments

Ryan Khurana

Oct 29, 2024

Not sure what the objective function would be. All metrics that can be gamed will be gamed if gaming them is efficient.

barnabus

May 21

That's the difference between machine learning for Go or chess and for LLMs. For chess or Go, metrics are easy. For LLMs, they are probably using 100k "readers" on meager pay from developing countries?

Jason Van Humbeck

Oct 29, 2024

I really enjoyed the article - thank you for sharing!

I think your article highlights a fascinating problem, one that I believe is already becoming prevalent in chatbots and other models using RLHF. For both humans and AI, a commitment to 100% truth and correctness isn’t always desirable or beneficial in every relationship or situation. Which makes me wonder: as models grow more advanced or sentient, what balance between truth and sycophancy will they naturally converge toward?

Given your experience in data and machine learning, do you think more objective functions, like those in data analysis, could still benefit from some degree of sycophancy in ML/AI models used in evaluation?

The (Ge)Narrative
