INDEX
Explanations
instances of numerical data and annotations related to timing or identifiers
Negative Logits
/datatables   -0.15
aal           -0.15
utzer         -0.14
Neville       -0.14
.Magenta      -0.14
turnstile     -0.14
phia          -0.14
icari         -0.13
ynom          -0.13
Modal         -0.13
Positive Logits
orient        0.30
controls      0.26
Controls      0.25
control       0.24
Controls      0.22
Ori           0.22
control       0.21
_controls     0.21
.control      0.21
Control       0.21
Activations Density: 0.001%