INDEX
Explanations
mentions of burning or fire-related activities
references to the concept of burning or being burned
Negative Logits
Token        Logit
berman       -0.71
Chomsky      -0.68
thus         -0.66
Architects   -0.65
UFC          -0.64
atem         -0.64
plom         -0.64
Republic     -0.63
egal         -0.63
legal        -0.63
Positive Logits
Token        Logit
burn         1.21
burned       1.19
burns        1.14
burning      1.06
burning      1.04
burnt        1.03
ished        0.96
burner       0.92
burn         0.87
nesday       0.83
Activations Density 0.008%