Explanations
comparative phrases indicating relative size or frequency
Negative Logits
Token    Logit
essian   -0.17
lest     -0.16
laces    -0.15
ulu      -0.14
Halk     -0.14
uitka    -0.14
thalm    -0.14
šel      -0.14
quine    -0.14
ewise    -0.14
Positive Logits
Token    Logit
times    0.27
fold     0.25
fold     0.25
normal   0.22
Fold     0.20
Fold     0.20
(!       0.20
-fold    0.20
times    0.19
greater  0.19
Activations Density 0.035%