Explanations
references to nuclear power and weapons
Negative Logits
Token         Logit
LoadIdentity  -0.16
CTX           -0.16
gem           -0.15
ableView      -0.15
chaft         -0.14
ekil          -0.14
коном         -0.14
åIJ¾          -0.14
ÑĤин          -0.14
aurus         -0.14
Positive Logits
Token         Logit
-powered      0.23
weapons       0.20
power         0.20
weapon        0.19
medicine      0.18
powered       0.18
waste         0.18
powered       0.17
-power        0.16
Weapons       0.16
Activations Density 0.012%
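
The density figure above is commonly read as the fraction of token positions on which the feature is active. The page itself does not define it, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active token positions) / (total positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 3 of 25,000 token positions.
sample = [0.0] * 24_997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.012%
```

Under this reading, a density of 0.012% would mean the feature fires on roughly 1 in every 8,300 tokens, which fits a narrow topic such as nuclear power and weapons.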