INDEX
Explanations
words related to burning or fire
references to burning or destruction
Negative Logits
Token     Logit
Slam      -0.70
maid      -0.66
ortment   -0.65
ele       -0.64
schild    -0.64
gravity   -0.63
ror       -0.60
padding   -0.59
BRE       -0.59
BUT       -0.59
Positive Logits
Token     Logit
ished     1.14
hotter    1.09
brightly  1.06
ishing    1.04
ishes     0.95
bridges   0.88
ashes     0.87
burning   0.83
ishable   0.83
alive     0.83
Activations Density: 0.088%