INDEX
Explanations
dates written in numeral form
numerical values and dates
Negative Logits
abwe            -0.67
ably            -0.61
ikawa           -0.61
imaru           -0.61
ulsion          -0.58
withstanding    -0.58
theless         -0.57
Paste           -0.57
acco            -0.56
ensibly         -0.56
Positive Logits
th              1.56
teenth          1.01
venth           0.98
rd              0.97
hest            0.94
TH              0.93
ieth            0.92
nd              0.85
th              0.85
acre            0.83
Activations Density 0.099%