INDEX
Explanations
words related to fires and arson
references to fire and arson-related incidents
Negative Logits
allery        -0.73
Surgery       -0.70
atem          -0.67
ajor          -0.66
Vag           -0.65
Senators      -0.65
apore         -0.65
glomer        -0.64
omo           -0.61
afort         -0.60
Positive Logits
blaze          1.31
flame          0.98
flies          0.97
blazing        0.94
fires          0.93
flames         0.93
extingu        0.90
exting         0.89
extinguished   0.88
lda            0.85
Activations Density 0.017%
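
The density figure can be read as the share of tokens on which the feature fires at all. The sketch below is a minimal illustration of that reading, not the dashboard's actual computation: the function name `activation_density`, the zero threshold, and the sample data (17 active tokens out of 100,000) are all assumptions chosen only to reproduce a value of the same size.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a non-zero
    (above-threshold) activation. The source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 17 firing tokens among 100,000 (made-up data).
sample = [1.0] * 17 + [0.0] * (100_000 - 17)
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.017% means the feature is active on roughly 17 of every 100,000 tokens, which fits a narrow topic such as fire and arson.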