INDEX
Explanations
words and phrases expressing desire or intent in the speaker
Negative Logits
oredCriteria    -1.10
itſelf          -1.03
Efq             -1.02
                -0.96
myſelf          -0.93
Monfieur        -0.93
featureID       -0.91
himſelf         -0.88
GEBURTSDATUM    -0.85
themſelves      -0.84
Positive Logits
for             0.57
a               0.56
an              0.53
neither         0.52
M               0.50
maioria         0.49
mostly          0.49
HORE            0.49
both            0.49
Tag             0.49
Activations Density 1.956%