INDEX
Explanations
phrases related to personal interactions or actions
phrases indicating admiration or respect for individuals
NEGATIVE LOGITS
Token      Logit
ゴン       -0.67
orts       -0.64
amaru      -0.56
ocy        -0.56
obos       -0.55
RESULTS    -0.53
igators    -0.53
ラン       -0.53
yg         -0.53
amas       -0.53
POSITIVE LOGITS
Token        Logit
someone      1.32
somebody     1.18
someone      1.17
something    1.09
something    1.09
Someone      1.08
Something    1.03
preferably   1.03
an           1.01
a            1.00
Activations Density 0.427%