INDEX
Explanations
references to fire incidents
occurrences of the word "fire" and related phrases
Negative Logits
xus          -0.99
Vide         -0.72
Lans         -0.71
DonaldTrump  -0.68
Virtue       -0.68
edy          -0.66
Birth        -0.66
atem         -0.65
Phar         -0.64
Vec          -0.64
Positive Logits
exting       1.20
storm        1.13
fighting     1.10
flies        1.07
storms       1.06
proof        1.06
places       1.02
brand        1.01
fight        0.99
balls        0.98
Activations Density: 0.039%