INDEX
Explanations
references to specific events or performances
Neuron Alignment
Index   Value   % of L₁
50      +0.18   0.9%
1343    +0.12   0.5%
406     +0.11   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1343    +0.18      0.04
1978    +0.12      0.04
1233    +0.11      0.04
Negative Logits
<bos>          -2.80
<?             -1.12
ⓧ              -1.00
               -0.99
/**            -0.88
intersper      -0.83
/*             -0.78
quitted        -0.75
rehabilitate   -0.74
banish         -0.73
Positive Logits
karton     1.49
silikon    1.42
kafe       1.40
keramik    1.38
alkoh      1.29
seksi      1.27
kosme      1.26
uhr        1.25
optik      1.16
krim       1.16
Activations Density 0.068%