Explanations
references to fire or flames
references to fire and flames
Negative Logits
alian      -0.88
alth       -0.83
nai        -0.82
ettings    -0.80
riad       -0.75
ournal     -0.73
guyen      -0.72
States     -0.70
chant      -0.70
Roberts    -0.69
Positive Logits
flame          1.05
flies          1.02
retard         1.02
extinguished   0.91
flames         0.89
orescence      0.88
torches        0.83
hotter         0.83
furnace        0.82
torch          0.82
Activations Density 0.020%