INDEX
Explanations
references to nuclear weapons and related geopolitical issues
Negative Logits
Token     Logit
osa       -0.15
etu       -0.15
<!--[     -0.14
ispiel    -0.14
Raster    -0.13
andas     -0.13
ietet     -0.13
onda      -0.13
_PATCH    -0.13
اتر       -0.13
Positive Logits
Token        Logit
fiss         0.20
nuclear      0.20
Nuclear      0.17
uranium      0.17
plu          0.17
enrichment   0.16
Capability   0.16
enrich       0.16
plut         0.16
enriched     0.16
Activations Density: 0.024%