Explanations
specific numerical information, including counts and measurements
Neuron Alignment
Index   Value   % of L₁
382     +0.15   0.4%
1456    +0.08   0.2%
381     +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.15      0.05
1456    +0.08      0.03
736     +0.08      0.04
Negative Logits
solidar      -0.84
település    -0.81
noten        -0.77
smtplib      -0.76
Geplaatst    -0.75
Tembelea     -0.72
Logement     -0.70
pymysql      -0.70
spion        -0.69
vernac       -0.69
Positive Logits
intersper    1.01
two          0.93
four         0.89
three        0.88
unspeak      0.88
five         0.87
apprehen     0.86
ineffec      0.85
impra        0.84
six          0.84
Activations Density: 0.280%