INDEX
Explanations
instances of the word "let" used in various contexts
Negative Logits
letting    -0.22
Lets       -0.21
lets       -0.21
Allow      -0.19
lets       -0.19
_allow     -0.17
shan       -0.17
Allow      -0.17
unce       -0.16
Let        -0.16
Positive Logits
alone      0.30
alone      0.23
least      0.23
less       0.21
Alone      0.20
Least      0.19
lone       0.18
Least      0.18
-alone     0.18
less       0.17
Activations Density 0.005%
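
As a rough illustration of the density figure, here is a minimal sketch. It assumes that "activations density" means the fraction of sampled token positions on which the feature fires at all; the page itself does not define the term. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: density = (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 20,000 sampled tokens.
sample = [0.0] * 19_999 + [3.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.005% would correspond to the feature being active on roughly one token in every 20,000.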