INDEX
Explanations
words ending in "-ly"
adverbs, particularly those that end in "ly."
Negative Logits
ilater    -0.99
bucks     -0.79
pie       -0.77
eers      -0.77
irtual    -0.66
arella    -0.64
ulu       -0.64
afety     -0.64
Weiner    -0.64
irlf      -0.63
Positive Logits
speaking  0.83
Sabha     0.77
Speaking  0.77
RELE      0.70
entimes   0.70
tics      0.69
theless   0.67
UTERS     0.67
sourced   0.67
\\\\      0.65
Activations Density: 0.037%