Explanations
mentions of intelligence or intelligent actions in the context of various scenarios
mentions of intelligence or the concept of being intelligent
Negative Logits
gren        -0.76
yan         -0.75
bley        -0.72
fold        -0.67
Nationals   -0.67
Bund        -0.65
pal         -0.65
git         -0.64
hs          -0.64
washing     -0.64
Positive Logits
intelligent   3.48
elligent      2.64
Intelligent   2.59
smarter       1.84
intellig      1.77
smart         1.62
smartest      1.54
sentient      1.53
thoughtful    1.48
educated      1.33
Activations Density 0.018%
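
As a rough illustration of how a density figure like the one above can be read, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all assumptions made for illustration; none of them come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: density = (tokens with activation > threshold) / (all tokens).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 18 active tokens out of 100,000 sampled tokens.
sample = [1.5] * 18 + [0.0] * 99_982
density = activation_density(sample)

# Format the fraction as a percentage with three decimal places.
print(f"{density:.3%}")
```

Under this reading, a density of 0.018% would mean the feature fires on roughly 18 of every 100,000 tokens, which is consistent with a feature that responds only to a narrow family of words.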