INDEX
Explanations
phrases indicating quantity or count
Negative Logits
Millions    -0.18
inski       -0.18
millions    -0.16
Two         -0.16
two         -0.15
billions    -0.15
xCD         -0.14
sted        -0.14
Huge        -0.14
Much        -0.14
Positive Logits
dozen       0.63
dozens      0.47
scores      0.45
score       0.44
scores      0.37
doz         0.37
Scores      0.37
Score       0.36
handful     0.35
-score      0.35
Activations Density 0.111%