Explanations
descriptive details of events or locations
Neuron Alignment
Index    Value    % of L₁
1978     +0.15    0.5%
776      +0.14    0.4%
1967     +0.12    0.4%
Correlated Neurons
Index    Pearson Corr.    Cosine Sim.
528      +0.15            0.05
776      +0.14            0.06
411      +0.12            0.05
Negative Logits
nadzie          -0.84
redhead         -0.80
a               -0.79
as              -0.76
in              -0.76
dachshund       -0.75
motherfucker    -0.74
ruddy           -0.74
languid         -0.74
at              -0.74
Positive Logits
alkoh        2.09
utop         2.00
kosme        1.98
silikon      1.97
solidar      1.91
kön          1.87
minimalis    1.86
keramik      1.85
antik        1.81
karton       1.80
Activations Density: 0.193%