INDEX
Explanations
references to tables or data structures in programming or database contexts
Neuron Alignment
Index   Value   % of L₁
23      +0.20   1.1%
420     +0.11   0.6%
353     +0.10   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
23      +0.20      0.02
56      +0.11      0.03
164     +0.10      0.02
Negative Logits
ĥ½        -1.95
ļ         -1.88
ĵ         -1.77
İ         -1.74
Ĵ         -1.69
Ń         -1.68
ı         -1.68
MOESM     -1.62
ķ         -1.62
£         -1.61
Positive Logits
tack          1.64
threshold     1.54
reaction      1.50
oxin          1.43
sight         1.41
ÂĹ            1.41
foreseeable   1.40
overhead      1.40
estine        1.39
ude           1.39
Activations Density 0.228%