INDEX
Explanations
phrases related to central concepts or themes
Neuron Alignment
Index    Value    % of L₁
1870     +0.12    0.5%
406      +0.12    0.5%
1937     +0.12    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
406      +0.12       0.02
1052     +0.12       0.02
1026     +0.12       0.02
Negative Logits
paesaggio     -0.59
villaggio     -0.56
medesimo      -0.54
tramonto      -0.52
Isid          -0.52
sieur         -0.51
prolon        -0.50
sabato        -0.49
Campionato    -0.48
cammino       -0.48
Positive Logits
core     1.34
Core     1.26
core     1.24
Core     1.21
CORE     1.10
CORE     1.09
cores    1.08
Cores    0.85
Cores    0.82
cores    0.76
Activations Density: 0.065%