Explanations
instances where something is led to or connected with another thing
Neuron Alignment
Index   Value   % of L₁
1464    +0.13   0.4%
1381    +0.10   0.3%
161     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1464    +0.13      0.04
1381    +0.10      0.04
225     +0.10      0.03
Negative Logits
Token             Logit
<bos>             -0.89
ejecut            -0.55
andaag            -0.54
mittag            -0.47
audiovisuel       -0.47
FileInputStream   -0.46
reempla           -0.45
konzert           -0.45
besondere         -0.43
ECONDS            -0.43
Positive Logits
Token       Logit
aen         0.87
maer        0.85
lyon        0.85
fte         0.84
greate      0.84
squa        0.82
ecru        0.81
fortn       0.81
roth        0.80
intenance   0.80
Activations Density: 0.193%