Explanations
references to biting or bites in various contexts
Neuron Alignment
Index   Value   % of L₁
164     +0.15   0.9%
422     +0.13   0.8%
115     +0.13   0.8%
Correlated Neurons
Index   P. Corr.   Cos Sim.
422     +0.15      0.01
505     +0.13      0.01
164     +0.13      0.01
Negative Logits
Token   Logit
Ĥ       -2.87
Ħ       -2.77
µ       -2.75
ı       -2.69
Ń       -2.66
ħ       -2.58
Į       -2.58
³       -2.56
Ĺ       -2.56
ĵ       -2.52
Positive Logits
Token    Logit
gart     1.88
strom    1.81
fest     1.78
wagen    1.74
indo     1.68
ios      1.66
hard     1.63
garten   1.62
fors     1.58
slots    1.57
Activations Density 0.019%