INDEX
Explanations
comparative phrases indicating a degree of superiority or excess
Negative Logits
irim     -0.15
outu     -0.15
ic       -0.15
ESS      -0.15
omer     -0.15
uty      -0.15
inalg    -0.14
rou      -0.14
gun      -0.14
pot      -0.14
Positive Logits
usual    0.14
dozen    0.14
ever     0.14
rát      0.14
ahn      0.13
793      0.13
azor     0.13
á»ķ      0.13
ToFront  0.13
sinking  0.13
Activations Density 0.041%