Explanations
references to borders
mentions of geographical or political borders
Negative Logits
ynasty      -0.73
orah        -0.71
ibur        -0.70
ointed      -0.69
partName    -0.67
ctive       -0.65
verbs       -0.65
%]          -0.64
odor        -0.64
thora       -0.64
Positive Logits
borders     0.94
Borders     0.89
ansas       0.80
lines       0.78
crossings   0.77
afety       0.75
radius      0.74
delim       0.73
rants       0.73
border      0.72
Activations Density 0.017%