Explanations
occurrences of personal pronouns
Negative Logits
UGE        -0.16
êt         -0.16
IRM        -0.14
uiten      -0.14
eniable    -0.14
ÄĮer       -0.14
ANGER      -0.14
firm       -0.14
firm       -0.14
coma       -0.14
Positive Logits
demand     0.23
hereby     0.20
mean       0.19
haz        0.19
bet        0.18
better     0.18
promise    0.17
Demand     0.17
demands    0.17
kid        0.16
Activations Density: 0.329%