INDEX
Explanations
phrases related to explosives
references to explosive devices or situations
Negative Logits
cedented    -0.87
SEA         -0.83
wright      -0.81
pai         -0.81
ploma       -0.81
atche       -0.81
haps        -0.80
aird        -0.79
alian       -0.79
igan        -0.75
Positive Logits
eru         1.01
explosive   0.94
incendiary  0.87
decomp      0.84
iating      0.80
deton       0.79
onite       0.76
bursts      0.75
endiary     0.75
aneously    0.73
Activations Density: 0.020%