INDEX
Explanations
expressions and discussions regarding opinions
Negative Logits
Token    Logit
lsi      -0.19
orian    -0.17
gars     -0.16
gow      -0.16
chner    -0.16
lier     -0.15
abet     -0.15
cq       -0.15
uras     -0.15
_FILL    -0.15
Positive Logits
Token        Logit
aires        0.22
ated         0.20
naire        0.20
/op          0.19
polls        0.17
expressed    0.17
holders      0.17
ATED         0.17
ster         0.17
opinion      0.17
Activations Density 0.031%