INDEX
Explanations
verbs associated with personal actions or decisions
phrases indicating emotional or relational connections
Negative Logits
Token      Logit
ELS        -0.56
Eastern    -0.53
hawks      -0.52
los        -0.52
Adin       -0.52
Loaded     -0.50
Bridge     -0.50
Il         -0.50
Els        -0.49
thodox     -0.49
Positive Logits
Token       Logit
yourself    1.22
yourselves  1.11
Yourself    0.88
your        0.78
YOUR        0.71
your        0.66
Your        0.65
Your        0.61
poke        0.60
browsing    0.55
Activations Density 0.715%
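
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition, so that is an assumption. The function name `activation_density`, the `threshold` parameter, and the toy activations are hypothetical and are not data from this feature.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy, made-up activations for 8 token positions (not taken from this page).
toy = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 25.000%
```

Under this reading, a density of 0.715% would mean the feature is active on roughly 7 of every 1,000 sampled tokens.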