INDEX
Explanations
dates in a specific format, specifically month/day
Neuron Alignment
Index   Value   % of L₁
1978    +0.13   0.4%
1896    +0.11   0.3%
755     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1062    +0.13      0.03
1081    +0.11      0.03
420     +0.10      0.03
Negative Logits
impra       -1.36
snoopy      -1.33
indescri    -1.33
unspeak     -1.32
shenan      -1.24
hairc       -1.24
indestru    -1.19
intersper   -1.13
horrend     -1.12
hentai      -1.10
Positive Logits
meras       0.90
alkoh       0.88
portu       0.84
solidar     0.84
utop        0.83
asfal       0.82
ideolog     0.81
keramik     0.79
prostitu    0.78
ostante     0.77
Activations Density 0.053%