Explanations
comparative and superlative adjectives or expressions
Negative Logits
699       -0.14
rei       -0.14
cups      -0.14
amera     -0.14
52        -0.13
red       -0.13
55        -0.13
_ping     -0.13
reg       -0.13
guilty    -0.13
Positive Logits
instead   0.20
instead   0.19
ihan      0.17
Instead   0.16
вмест     0.16
PELL      0.16
士        0.15
Instead   0.15
!***      0.15
awe       0.15
Activations Density: 0.187%