INDEX
Explanations
references to fighting or struggle in various contexts
Neuron Alignment
Index   Value   % of L₁
156     +0.16   0.9%
481     +0.14   0.8%
304     +0.13   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
481     +0.16      0.04
56      +0.14      0.00
304     +0.13      0.04
Negative Logits
vez          -1.74
blogger      -1.69
TRODUCTION   -1.57
habl         -1.55
valuable     -1.51
ureus        -1.50
handy        -1.48
aid          -1.44
privileged   -1.41
purchased    -1.38
Positive Logits
ground       2.22
hammer       2.13
grounds      2.11
oon          2.03
against      2.01
lord         1.98
oons         1.78
lords        1.77
front        1.76
scenes       1.74
Activations Density: 0.383%