Explanations
prepositions and verbs indicating a connection or action with strong emphasis
prepositions and phrases that indicate relationships or connections
Negative Logits
terness    -0.58
grain      -0.54
Champ      -0.53
anners     -0.52
spec       -0.51
ians       -0.51
olicy      -0.51
gged       -0.50
arty       -0.50
chanted    -0.50
Positive Logits
anyways       0.89
besides       0.85
nowadays      0.83
lately        0.81
.             0.81
during        0.81
anyway        0.80
beforehand    0.79
.:            0.79
.,            0.77
Activations Density 0.176%