INDEX
Explanations
mentions of fire-related events, such as flames, smoke, and blood
references to fire, smoke, and injuries in distressing contexts
Negative Logits
Circ         -0.73
Birth        -0.70
Canad        -0.70
Idol         -0.69
Childhood    -0.69
virtues      -0.68
IGN          -0.67
Chance       -0.66
Campus       -0.65
Gene         -0.65
Positive Logits
haze         0.90
engulf       0.89
engulfed     0.88
flooded      0.87
overwhelmed  0.86
clinging     0.84
flakes       0.83
fog          0.82
overtake     0.81
crawled      0.81
Activations Density: 0.174%