INDEX
Explanations
prepositions followed by words indicating relationships or emotions
expressions related to love, relationships, and moral actions
Negative Logits
interstitial    -0.93
endiary         -0.69
Donation        -0.68
uran            -0.66
ulations        -0.63
arbon           -0.63
millenn         -0.62
buster          -0.62
vertisement     -0.60
prepar          -0.60
Positive Logits
him             1.51
me              1.48
us              1.46
Him             1.33
others          1.33
themselves      1.30
him             1.29
oneself         1.28
yourself        1.25
herself         1.24
Activations Density: 0.356%