INDEX
Explanations
adjectives related to degree or level, such as "enough" and "insufficient"
terms related to authenticity and increasing intensity, particularly in negative contexts
Negative Logits
anders     -0.75
ploma      -0.73
aden       -0.71
ADS        -0.68
illon      -0.68
astical    -0.67
annis      -0.67
andering   -0.67
antics     -0.65
ategory    -0.64
Positive Logits
ly         2.76
LY         1.75
liness     1.35
lys        1.31
lies       1.21
ELY        1.19
fully      1.15
edly       1.14
ity        1.09
liest      1.04
Activations Density: 0.124%