Explanations
short phrases followed by intensifiers or modifiers like 'bit', 'pretty', 'good', 'whole', etc.
expressions of quantity and assessment
Negative Logits
Token          Logit
hours          -0.79
effects        -0.77
anqu           -0.75
bots           -0.70
alks           -0.69
Eng            -0.69
asketball      -0.68
Autom          -0.68
today          -0.67
dule           -0.67
Positive Logits
Token             Logit
typo              1.09
lot               1.09
recipe            1.06
misconception     1.06
mistake           1.04
understatement    1.03
pretty            1.02
contradiction     1.02
prerequisite      1.01
rarity            1.01
Activations Density: 0.157%