INDEX
Explanations
names of places or geographical locations
Negative Logits
Token      Logit
arers      -0.85
apple      -0.84
glers      -0.84
ebook      -0.77
isite      -0.76
holding    -0.75
iners      -0.75
ocument    -0.74
lance      -0.73
onies      -0.72
Positive Logits
Token      Logit
Kab        0.92
Benz       0.77
Kaz        0.75
Kats       0.74
Zen        0.74
henko      0.74
ovych      0.74
Dim        0.73
Dortmund   0.73
Kon        0.73
Activations Density 0.024%
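
To give a sense of scale for the density figure, the sketch below converts it into an expected count of activating tokens. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation, which this page does not state explicitly. The function name `expected_activations` is hypothetical and used only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumption (not confirmed by the page): density is the percentage of
    tokens with a nonzero activation for this feature.
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Density value taken from the dashboard line above.
    print(expected_activations(0.024, 1_000_000))
```

Under that assumption, a density of 0.024% corresponds to roughly 240 activating tokens per million, so the feature would be quite sparse.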