INDEX
Explanations
words indicative of human experiences or personal narratives
NEGATIVE LOGITS
several         -0.71
several         -0.69
Several         -0.67
many            -0.65
Several         -0.63
BeginInit       -0.63
many            -0.60
externi         -0.59
Usual           -0.58
banyak          -0.58
POSITIVE LOGITS
who             0.63
privées         0.60
kto             0.59
whom            0.56
antaranya       0.56
RegressionTest  0.55
anonyme         0.54
pubblici        0.54
dévelo          0.53
veramente       0.51
Activations Density 0.254%