INDEX
Explanations
adverbs that express a sense of peculiarity or oddity
descriptors conveying unusual, unexpected, or ambiguous qualities
Negative Logits
uese          -0.73
tan           -0.70
essors        -0.69
Dems          -0.66
rities        -0.65
orem          -0.65
Methods       -0.65
publication   -0.64
Slaughter     -0.63
Guys          -0.63
Positive Logits
suited        0.85
tuned         0.84
disple        0.75
impressed     0.74
alarmed       0.73
aroused       0.72
fascinated    0.72
conceived     0.70
situated      0.70
confused      0.70
Activations Density: 0.017%