Hey folks, let's have a little chat about construct validity --- this is the concept that if you're going to use a psychological test, there should be good reason to believe that a) the thing it's meant to be testing is real and b) results on the test reflect something about that thing.
spectrum.ieee.org/theory-of-mi…
AI Outperforms Humans in Theory of Mind Tests
Large language models convincingly mimic the understanding of mental states
Eliza Strickland (IEEE Spectrum)
Prof. Emily M. Bender (she/her)
in reply to Prof. Emily M. Bender (she/her)
So, even if a test has been established to have construct validity as a test relating to human cognition, you can't just throw it at a chatbot and take the results as meaningful.
What would it mean for a language model to have a theory of mind? How does string manipulation relate to that? Without answers to these questions, the tests are meaningless.
Dawn Ahukanna
in reply to Prof. Emily M. Bender (she/her)
Spicy auto-suggest providing responses to a “rapid recall” prompt and eventually getting a passing grade does not mean understanding, of any kind, happened.
MylesRyden
in reply to Prof. Emily M. Bender (she/her)
I know I am preaching to the choir, but I can't even get past the headline. The ability to "convincingly mimic the understanding of mental states" is not the same as understanding mental states.
I can say the words that "mimic the understanding" of the Theory of Relativity, but that in no way makes me Einstein.
In fact, I have said this many times to my SO... I can say the words about quantum mechanics (for example), but I have no actual idea what those words actually mean to an actual theoretical physicist.
katlin
in reply to Prof. Emily M. Bender (she/her)
Also:
arstechnica.com/ai/2024/05/cha…
ChatGPT shows better moral judgment than a college undergrad
Ars Technica
tuban_muzuru
in reply to Prof. Emily M. Bender (she/her)
My old interrogation instructor in the Army taught me I would learn more from lies than truth.
Here's Google Gemini, obfuscating and lying, today. Paid version, advanced.
docs.google.com/document/d/1MZ…
Gemini does a Sgt. Schultz
Google Docs