INDEX
Explanations
names of places and related concepts
Neuron Alignment
Index   Value   % of L₁
2034    +0.14   0.4%
872     +0.11   0.3%
1013    +0.09   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
284     +0.14      0.08
2044    +0.11      0.09
1013    +0.09      0.08
Negative Logits
sappi     -1.31
dises     -1.14
roberto   -1.13
sergio    -1.13
vogli     -1.11
gius      -1.11
?...      -1.11
desir     -1.10
soggior   -1.09
ridu      -1.09
Positive Logits
but       1.19
but       0.96
But       0.90
But       0.88
nhưng     0.85
whereas   0.81
;         0.81
BUT       0.80
pero      0.79
,         0.77
Activations Density 0.896%