INDEX
Explanations
adverbs that end in -ly
adverbs, particularly those describing manner
Negative Logits
ilater    -0.87
GOODMAN   -0.80
afety     -0.80
Extrem    -0.76
Julio     -0.75
eport     -0.69
icans     -0.68
anian     -0.67
ilion     -0.67
esson     -0.63
Positive Logits
rics      0.87
ffe       0.83
adv       0.83
tics      0.80
present   0.80
clad      0.78
zed       0.77
puff      0.74
dispose   0.74
gged      0.73
Activations Density: 0.039%