Explanations
phrases related to specific locations or entities spelled in a particular way
terms related to administrative divisions or geographical entities
Negative Logits
ppo         -0.71
Forge       -0.70
bians       -0.70
ulkan       -0.69
tes         -0.69
ppa         -0.68
hene        -0.67
kies        -0.66
zo          -0.66
izations    -0.66
Positive Logits
irect       1.08
ead         1.02
aniel       1.00
ocument     1.00
irection    0.98
ragon       0.95
ynamic      0.95
ouble       0.94
isd         0.94
itional     0.88
Activations Density: 0.030%