Explanations
adverbs ending in 'ly'
words ending in "ly."
Negative Logits
ilater         -0.83
senal          -0.75
hemor          -0.71
respectively   -0.70
afety          -0.70
comprom        -0.65
ERA            -0.65
itivity        -0.65
corrective     -0.64
Annotations    -0.64
Positive Logits
rics           0.98
ffe            0.88
zed            0.88
puff           0.88
sis            0.82
upe            0.81
tics           0.81
pha            0.79
waters         0.77
clad           0.76
Activations Density: 0.029%