INDEX
Explanations
pronouns or possessive adjectives followed by action verbs describing mental or physical states or actions
possessive pronouns and their associated contexts
Negative Logits
avers       -0.76
enced       -0.69
parted      -0.66
aiden       -0.65
pps         -0.64
ĸļ          -0.63
Plex        -0.63
wered       -0.62
shorth      -0.62
ochet       -0.62
Positive Logits
ikuman      0.74
stairs      0.73
conflic     0.71
ILCS        0.66
seams       0.66
calories    0.65
Ballistic   0.64
enegger     0.64
ISSION      0.63
gam         0.63
Activations Density 0.340%