Explanations
adjectives expressing certainty or inevitability
terms that convey certainty or strong affirmation
Negative Logits
Token       Logit
trim        -0.69
lower       -0.69
ramid       -0.67
oa          -0.66
lder        -0.65
yers        -0.65
specific    -0.64
ctors       -0.64
fancy       -0.64
ria         -0.63
Positive Logits
Token           Logit
undeniable      2.21
unstoppable     2.13
irresistible    2.04
unmist          1.92
irre            1.85
irreversible    1.81
irresist        1.80
inex            1.80
undeniably      1.74
unparalleled    1.70
Activations Density: 0.060%