INDEX
Explanations
mentions of fires, firefighters, and related events
Neuron Alignment
Index   Value   % of L₁
1013    +0.11   0.3%
576     +0.10   0.3%
939     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
946     +0.11      0.04
736     +0.10      0.05
166     +0.10      0.03
Negative Logits
:)))          -0.81
interro       -0.81
applau        -0.77
sonor         -0.71
ciment        -0.71
fondamental   -0.70
Realt         -0.70
hairc         -0.70
pép           -0.70
astu          -0.69
Positive Logits
fire           0.89
firefighters   0.86
fires          0.83
burning        0.83
flames         0.81
firefighting   0.77
arson          0.76
fire           0.73
burn           0.70
extinguished   0.70
Activations Density 0.304%