INDEX
Explanations
mentions of explosive devices or situations
mentions of explosives or explosive-related terms
Negative Logits
cedented    -0.87
aird        -0.83
SEA         -0.79
pai         -0.79
alian       -0.77
heit        -0.76
alth        -0.76
esan        -0.76
atche       -0.76
beard       -0.73
Positive Logits
explosive   0.98
eru         0.96
decomp      0.86
incendiary  0.84
deton       0.75
disposal    0.74
flares      0.74
darts       0.74
propulsion  0.72
endiary     0.72
Activations Density: 0.017%