Explanations
terms related to geopolitical regions
references to a specific geographical region
Negative Logits
ãĥīãĥ©          -0.82
gement          -0.72
selage          -0.70
urations        -0.66
thood           -0.65
glers           -0.65
FIN             -0.64
veyard          -0.64
CRIPTION        -0.63
VID             -0.63
Positive Logits
ally            0.83
region          0.82
wide            0.78
region          0.76
geographically  0.75
ional           0.74
wikipedia       0.70
side            0.70
economically    0.70
area            0.70
Activations Density 0.015%