INDEX
Explanations
postures or positions involving assertion or dominance
expressions of strong emotions or desires
Negative Logits
strikingly      -0.73
respectively    -0.71
understandably  -0.64
ullah           -0.62
omin            -0.59
evidently       -0.56
Typically       -0.56
Published       -0.56
idges           -0.56
Historically    -0.56
Positive Logits
myself          1.85
my              1.36
thee            1.10
MY              0.96
yours           0.93
ya              0.92
mine            0.87
somet           0.84
them            0.83
THEM            0.80
Activations Density 0.489%