Explanations
descriptions of explosive events and their aftermath
Negative Logits
349        -0.16
mlin       -0.16
éĢł        -0.16
ayet       -0.15
kraje      -0.15
bombed     -0.14
гоÑĤ       -0.14
umpy       -0.14
handjob    -0.14
Attacks    -0.14
Positive Logits
Conc          0.21
cause         0.20
knock         0.20
Knock         0.19
dent          0.18
panc          0.18
vapor         0.18
gou           0.18
critically    0.17
conc          0.17
Activations Density 0.134%