INDEX
Explanations
phrases related to unity and collaboration
Neuron Alignment
Index    Value    % of L₁
990      +0.16    0.6%
1671     +0.14    0.5%
1451     +0.12    0.4%
Correlated Neurons
Index    P. Corr.    Cos Sim.
990      +0.16       0.04
1451     +0.14       0.03
1671     +0.12       0.04
Negative Logits
Transcrip    -0.75
Juventud     -0.60
Anm          -0.58
Nuc          -0.56
notori       -0.56
Avez         -0.55
Histology    -0.54
decla        -0.54
Cár          -0.53
ideolog      -0.52
Positive Logits
together     0.96
together     0.88
Together     0.88
Together     0.86
gether       0.82
TOGETHER     0.81
bleus        0.66
juntos       0.59
leone        0.58
Zusammen     0.52
Activations Density 0.070%