INDEX
Explanations
References to large quantities or numerical values, particularly those involving the word "thousand."
Negative Logits
iger     -0.16
inz      -0.15
lee      -0.15
umin     -0.15
ou       -0.14
ault     -0.14
che      -0.14
icon     -0.14
avia     -0.13
ud       -0.13

Positive Logits
ths       0.25
aires     0.19
naire     0.18
naires    0.18
th        0.17
lerce     0.17
ibase     0.17
ToOne     0.17
fold      0.17
PLUS      0.16
Activations Density: 0.058%
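
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero, might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 6 of which are active.
sample = [0.0] * 9994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")  # formats the fraction as a percentage
```

Under this reading, a density of 0.058% would mean the feature fires on roughly 6 of every 10,000 tokens.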