INDEX
Explanations
references to military and conflict-related terms and events
Negative Logits
_contin  -0.16
untu     -0.15
enda     -0.15
oze      -0.15
illez    -0.15
grace    -0.15
enze     -0.15
endas    -0.14
вен      -0.14
rael     -0.14
Positive Logits
_NV    0.15
usz    0.15
erno   0.14
/ss    0.14
Blast  0.14
umble  0.13
Duy    0.13
ξά     0.13
zig    0.13
izzle  0.13
Activations Density 0.014%