INDEX
Explanations
comparative statements indicating superiority or excellence
terms related to surpassing or exceeding benchmarks or limits
Negative Logits
loc            -0.84
llo            -0.72
ll             -0.72
arty           -0.71
Fein           -0.67
mun            -0.63
hair           -0.63
random         -0.63
xious          -0.61
orientation    -0.60
Positive Logits
surpassed      3.25
surpass        3.18
eclips         2.45
exceeded       2.03
exceed         1.76
exceeds        1.71
eclipse        1.70
overtake       1.59
outper         1.58
exceeding      1.53
Activations Density 0.031%