INDEX
Explanations
structured data entries like document IDs, titles, descriptions, and user information
Neuron Alignment
Index   Value   % of L₁
876     +0.10   0.3%
1597    +0.10   0.3%
1699    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1597    +0.10      0.02
876     +0.10      -0.00
1147    +0.10      0.03
Negative Logits
kristal     -0.73
lü          -0.68
Demok       -0.68
konserv     -0.66
Kalifor     -0.65
Demokrat    -0.65
bakteri     -0.65
kompakt     -0.64
kooper      -0.63
maksi       -0.63
Positive Logits
michelin      1.00
peppa         0.96
affor         0.96
parma         0.93
scrat         0.93
impractica    0.92
intermitt     0.91
tupperware    0.90
inconce       0.90
unden         0.89
Activations Density: 0.221%