INDEX
Explanations
superlatives and extremes, particularly focusing on size or importance
phrases emphasizing the significance or magnitude of specific concepts or issues
Negative Logits
Token      Logit
renheit    -0.77
ipel       -0.75
FIL        -0.70
PET        -0.70
][         -0.68
mid        -0.67
each       -0.67
rub        -0.67
ulum       -0.65
Phase      -0.64
Positive Logits
Token             Logit
reason            1.22
downside          1.19
distinguishing    1.18
caveat            1.16
thing             1.16
drawback          1.15
takeaway          1.14
question          1.10
difference        1.08
notable           1.06
Activations Density 0.170%