INDEX
Explanations
words related to geopolitics and international relations
keywords related to countries, organizations, and positions of power
Negative Logits
prem          -0.83
ounded        -0.78
Supplemental  -0.69
hai           -0.69
iceps         -0.66
utenberg      -0.64
ounding       -0.64
bah           -0.61
ounds         -0.61
irens         -0.60
Positive Logits
imaginable    1.30
whatsoever    1.14
conceivable   1.01
except        0.90
besides       0.75
describ       0.74
anywhere      0.74
Else          0.70
alike         0.68
soever        0.68
Activations Density: 0.233%