INDEX
Explanations
mentions of neighboring or nearby entities
references to neighboring countries or regions
Negative Logits
endi    -0.88
inen    -0.84
icer    -0.81
udeb    -0.80
odor    -0.78
erer    -0.76
hene    -0.74
apego   -0.73
ifter   -0.73
anche   -0.73
Positive Logits
neighboring    1.06
neighbors      1.01
neighbouring   0.99
neighbor       0.92
Neigh          0.87
neighbours     0.85
neighbour      0.83
Borders        0.79
territories    0.78
regions        0.77
Activations Density: 0.008%