INDEX
Explanations
terms related to technology and terminology
Neuron Alignment
Index    Value    % of L₁
752      +0.12    0.4%
50       +0.11    0.3%
678      +0.10    0.3%
Correlated Neurons
Index    P. Corr.    Cos Sim.
16       +0.12       0.05
1226     +0.11       0.04
752      +0.10       0.04
Negative Logits
]")]         -0.57
Để           -0.56
Những        -0.55
PREFERRED    -0.55
HasOne       -0.54
will         -0.54
even         -0.53
just         -0.53
couldn       -0.53
Muchas       -0.53
Positive Logits
„,           1.70
aen          1.58
mef          1.55
effe         1.49
stockholm    1.46
ftu          1.43
fta          1.43
milano       1.42
wien         1.40
lyon         1.39
Activations Density 0.405%