INDEX
Explanations
comparative language and expressions of magnitude or size
Negative Logits
.mx       -0.17
ála       -0.15
Nguyen    -0.15
_allowed  -0.14
ly        -0.13
ìĿ´ìŀIJ    -0.13
UTO       -0.13
theid     -0.13
uet       -0.13
ête       -0.13
Positive Logits
than      0.28
-than     0.24
than      0.23
_than     0.19
än        0.18
Than      0.17
THAN      0.17
než       0.17
Than      0.16
Bowen     0.15
Activations Density 0.299%