INDEX
Explanations
references to intelligence or being clever
Neuron Alignment
Index    Value    % of L₁
376      +0.14    0.8%
144      +0.14    0.8%
148      +0.14    0.8%
Correlated Neurons
Index    P. Corr.    Cos Sim.
339      +0.14       0.03
341      +0.14       0.03
144      +0.14       0.02
Negative Logits
wegian      -1.73
Suite       -1.49
Marathon    -1.41
ktop        -1.35
jas         -1.33
muse        -1.33
clusive     -1.33
ede         -1.31
darker      -1.30
href        -1.30
Positive Logits
ĻĤ           2.38
contracts    1.96
ģ            1.90
Ħ            1.81
Ģ            1.76
ités         1.60
Īĺ           1.59
ĥ½           1.58
İ            1.56
ĸ            1.56
Activations Density 0.105%