INDEX
Explanations
instances where adjective-adverb pairs are unexpectedly combined
adjectives that describe the quality of something, particularly descriptions built on "poorly" or "well"
Negative Logits
omore          -0.74
anos           -0.73
impossibility  -0.72
oleon          -0.72
atorium        -0.71
Pione          -0.67
anta           -0.67
itivity        -0.67
ACY            -0.66
VIDEOS         -0.66
Positive Logits
formatted      1.38
calibrated     1.32
behaved        1.32
constructed    1.31
crafted        1.31
configured     1.30
structured     1.25
designed       1.25
tuned          1.22
organized      1.22
Activations Density 0.124%
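
The density figure above can be read as the fraction of token positions on which the feature fires. The following is a minimal sketch under that assumption, not the dashboard's actual computation; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source page does not define it.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 124 active positions out of 100,000 tokens.
sample = [1.0] * 124 + [0.0] * (100_000 - 124)
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.124%
```

A feature this sparse fires on roughly one token in 800, which fits a feature tied to a narrow lexical pattern.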