Explanations
information related to historical events and scientific research
Neuron Alignment
Index   Value   % of L₁
690     +0.13   0.4%
382     +0.13   0.4%
1445    +0.13   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1445    +0.13      0.06
499     +0.13      0.04
382     +0.13      0.04
Negative Logits
distanciation   -0.80
gyhoeddwyd      -0.69
Geplaatst       -0.67
Himo            -0.66
stiller         -0.63
garan           -0.63
anse            -0.63
solidar         -0.62
graus           -0.62
dè              -0.62
Positive Logits
unspeak         1.15
apprehen        1.07
reluct          1.01
impractica      1.01
disagre         0.99
impra           0.99
increa          0.96
Whence          0.96
unlaw           0.96
tolerably       0.94
Activations Density: 0.262%