Explanations
phrases that refer to individuals or groups of individuals
Negative Logits
enberg    -0.15
oksen     -0.15
hc        -0.14
Ïģιά      -0.14
itez      -0.14
ston      -0.14
zzo       -0.13
estone    -0.13
gba       -0.13
_locals   -0.13
Positive Logits
794       0.15
reau      0.15
419       0.14
_DL       0.14
are       0.14
Least     0.14
least     0.14
hap       0.14
Ads       0.14
innen     0.13
Activations Density 0.119%