Explanations
numerical values and statistical data related to experiments
Negative Logits
orsch      -0.17
bras       -0.15
asc        -0.15
haps       -0.14
arkin      -0.14
erif       -0.14
Reality    -0.14
akers      -0.14
Reality    -0.13
ORK        -0.13
Positive Logits
±          0.52
±          0.48
+/-        0.41
+-         0.35
pm         0.30
±n         0.29
pm         0.29
+-         0.27
SD         0.27
(SE        0.26
Activations Density: 0.029%