Explanations
phrases related to occupation or territories
Negative Logits
Token      Logit
uber       -0.80
raft       -0.75
nir        -0.74
peer       -0.73
ripp       -0.71
lass       -0.70
issues     -0.70
ect        -0.69
brother    -0.69
à¥         -0.69
Positive Logits
Token          Logit
Territories    0.95
occupied       0.91
Territory      0.86
territories    0.84
occupying      0.81
occupation     0.78
territory      0.74
Borders        0.72
Seventh        0.72
ational        0.69
Activations Density: 0.029%