Explanations
references to geopolitical power dynamics and relationships
Negative Logits
usan    -0.18
ilen    -0.17
lero    -0.16
境      -0.16
rava    -0.14
affer   -0.13
vala    -0.13
_tol    -0.13
ocrat   -0.13
utch    -0.13
Positive Logits
powers  0.45
Powers  0.39
powers  0.37
power   0.34
power   0.29
-power  0.29
super   0.28
Power   0.27
Power   0.27
POWER   0.25
Activations Density 0.099%