INDEX
Explanations
references to numerical values, particularly the words "four" and "five."
Negative Logits
Token     Logit
UGE       -0.74
Hub       -0.73
geist     -0.72
LER       -0.72
ARP       -0.70
ULE       -0.69
TM        -0.68
HY        -0.68
Rica      -0.67
asta      -0.66
Positive Logits
Token     Logit
teenth    1.74
teen      1.65
hundred   1.34
fold      1.30
eenth     1.23
een       1.22
aciously  1.16
thousand  1.10
acious    1.08
uously    1.01
Activations Density 0.122%
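
To make the density figure concrete, here is a minimal Python sketch. It assumes "activations density" means the fraction of sampled tokens on which the feature has a nonzero activation; the page itself does not define the term. The function name `activation_density` and the toy data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations):
    """Return the fraction of tokens whose activation is greater than zero.

    Assumption: density = (tokens where the feature fires) / (tokens sampled).
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > 0)
    return firing / len(activations)


# Hypothetical toy sample: 10,000 tokens, 12 of which activate the feature.
toy_activations = [0.0] * 9988 + [1.5] * 12
density = activation_density(toy_activations)

# Print the density as a percentage with three decimal places.
print(f"{density:.3%}")
```

Under this reading, a density of 0.122% corresponds to the feature firing on roughly one token in 820 (1 / 0.00122).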