INDEX
Explanations
descriptions related to philosophical or existential questions
Neuron Alignment
Index   Value   % of L₁
1446    +0.12   0.4%
872     +0.12   0.3%
2015    +0.11   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1446    +0.12      0.05
134     +0.12      0.04
1368    +0.11      0.03
Negative Logits
praktik     -0.90
akade       -0.79
akut        -0.78
optik       -0.77
radikal     -0.77
Strukt      -0.75
kosme       -0.75
Demokrat    -0.73
kalender    -0.72
kompati     -0.72
Positive Logits
SneakyThrows    0.57
trésor          0.57
nimic           0.57
commandement    0.56
créateur        0.54
époux           0.53
donnera         0.53
totul           0.53
"$@"            0.52
vœ              0.51
Activations Density 0.469%