INDEX
Explanations
Comparisons of quantity, often focusing on the concept of "fewer."
Negative Logits
shire      -0.60
Mut        -0.58
Draft      -0.57
wine       -0.56
NB         -0.56
agency     -0.56
raised     -0.54
Immun      -0.53
MO         -0.53
Literary   -0.52
Positive Logits
than       1.05
Than       0.75
than       0.71
bies       0.68
calories   0.67
een        0.66
digits     0.62
pesky      0.62
bells      0.61
copies     0.61
Activations Density 7.579%