Explanations
adverbs and adjectives that indicate improvement or quality
Negative Logits
Token      Logit
eria       -0.15
elf        -0.15
achs       -0.15
cision     -0.14
ompiler    -0.14
iв         -0.14
">//       -0.14
Literal    -0.13
-lfs       -0.13
iors       -0.13
Positive Logits
Token      Logit
IDb        0.15
ly         0.15
irl        0.14
ãģ¾ãĤĭ     0.14
ãģĸ        0.14
Hast       0.13
imus       0.13
ikk        0.13
åķĨ        0.12
sola       0.12
Activations Density 0.136%
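
Some of the tokens listed above (for example ãģ¾ãĤĭ, ãģĸ, and åķĨ) look like the printable stand-ins that a GPT-2-style byte-level BPE tokenizer uses for raw bytes. The sketch below assumes that tokenizer family, which this page does not confirm, and maps such strings back to readable UTF-8 text. The helper names `bytes_to_unicode` and `decode_token` are illustrative and are not part of the dashboard.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2 byte-level BPE."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: printable stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Return readable text for a byte-level token string.

    Assumption: if any character falls outside the byte table (as in "iв"),
    the token was already decoded by the page, so it is returned unchanged.
    """
    if any(ch not in UNICODE_TO_BYTE for ch in token):
        return token
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Partial multi-byte sequences are shown with the replacement character.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãģ¾ãĤĭ", "ãģĸ", "åķĨ", "ly", "iв"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption the three garbled-looking positive-logit tokens decode to the Japanese and Chinese text まる, ざ, and 商, while plain ASCII tokens such as ly pass through unchanged.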