INDEX
Explanations
the concept of variability or differences in various contexts
Neuron Alignment
Index  Value  % of L₁
376    +0.13  0.7%
417    +0.12  0.7%
125    +0.11  0.6%
Correlated Neurons
Index  P. Corr.  Cos Sim.
417    +0.13     0.01
15     +0.12     0.01
468    +0.11     0.01
Negative Logits
ORS          -1.80
UTERS        -1.62
orte         -1.53
---|---|---  -1.49
slightest    -1.45
ollo         -1.44
AC           -1.42
glass        -1.42
OR           -1.39
coli         -1.39
Positive Logits
ĻĤ      1.84
Ļ       1.77
ually   1.74
Īĺ      1.70
sized   1.63
heta    1.60
ently   1.57
ħ       1.47
iously  1.46
ibus    1.43
Activations Density 0.356%