Explanations
mentions of the word "both"
Neuron Alignment
Index   Value   % of L₁
1065    +0.11   0.4%
405     +0.11   0.4%
1331    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1065    +0.11      0.05
405     +0.11      0.04
47      +0.11      0.04
Negative Logits
Token      Logit
Я          -0.49
adomo      -0.48
Aan        -0.46
cap        -0.46
Мар        -0.45
etc        -0.45
xba        -0.45
footer     -0.45
databind   -0.45
uma        -0.45
Positive Logits
Token       Logit
fta         1.29
sii         1.29
bandung     1.26
territo     1.25
hcm         1.24
thut        1.22
aen         1.21
fortn       1.21
jaya        1.20
stockholm   1.19
Activations Density 0.092%