INDEX
Explanations
proper nouns, specifically names of places and settlements
Negative Logits
dio          -0.17
Ĥ¹           -0.15
gue          -0.15
521          -0.15
umper        -0.14
ISIBLE       -0.14
ardi         -0.14
iyel         -0.14
Iteration    -0.14
uforia       -0.14
Positive Logits
qv              0.15
Syn             0.15
coat            0.15
representative  0.14
an              0.14
Char            0.14
Mean            0.14
ега             0.14
mean            0.14
East            0.14
Activations Density 0.024%