INDEX
Explanations
instances of the word "dozen"
references to quantities around a dozen
Negative Logits
hra         -0.75
uria        -0.67
î           -0.67
ppers       -0.65
estine      -0.63
mathemat    -0.63
ICO         -0.62
otto        -0.62
Reviewer    -0.62
claimed     -0.61
Positive Logits
uously         1.01
thousand       1.00
Thousand       0.96
dozen          0.96
dozen          0.92
uous           0.88
consecutive    0.84
acity          0.81
iterations     0.80
aciously       0.78
Activations Density: 0.015%