Explanations
intensifiers or modifiers that emphasize the degree of something, particularly the word "very"
Negative Logits
Token    Logit
irk      -0.17
olt      -0.16
ital     -0.15
ense     -0.14
ery      -0.14
ict      -0.14
igor     -0.14
aroo     -0.13
IQ       -0.13
vig      -0.13
Positive Logits
Token    Logit
yyyy     0.20
yyy      0.17
ocoder   0.17
yy       0.16
/ext     0.15
much     0.15
eur      0.15
uger     0.14
angel    0.14
yl       0.14
Activations Density: 0.062%