Explanations
the presence of the verb "to be" in various forms
Preceding the word "are"
being part of a description
Negative Logits
Token            Logit
a                -0.59
lèvres           -0.58
seseorang        -0.57
someone          -0.55
iemand           -0.51
Someone          -0.51
one              -0.50
ers              -0.49
domés            -0.49
las              -0.49
Positive Logits
Token            Logit
wolves           1.12
Theſe            1.02
ligiloj          0.99
those            0.92
yourselves       0.91
Personendaten    0.90
ſelves           0.87
)";              0.86
assholes         0.86
themſelves       0.84
Activations Density: 0.501%
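
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens where the feature fires.
    The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: the feature fires on 5 of 1000 tokens.
acts = [0.0] * 995 + [1.3, 0.2, 4.1, 0.7, 2.5]
print(f"{activation_density(acts):.3%}")  # prints 0.500%
```

Under this reading, a density near 0.5% would mean the feature is active on roughly one token in two hundred.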