Explanations
verbs followed by pronouns indicating actions or consequences that may happen as a result of a specific behavior
references to the pronoun "you" in various contexts
Negative Logits
ゴン            -0.79
surprisingly    -0.71
イト            -0.65
Estimates       -0.64
(no token text) -0.62
albeit          -0.62
ULTS            -0.62
Anchorage       -0.62
Assembly        -0.62
ⓘ               -0.61
Positive Logits
wanna           1.26
're             1.23
ain             1.08
criticize       0.99
lose            0.98
offend          0.97
injure          0.94
succeed         0.94
want            0.94
piss            0.94
Activations Density 0.170%
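
The density figure above is not defined in this listing. A minimal sketch, assuming it means the fraction of token positions on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to illustrate the arithmetic.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 17 of which are active.
sample = [0.0] * 9983 + [1.5] * 17
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.170%"
```

Under this reading, a density of 0.170% would correspond to the feature firing on roughly 17 of every 10,000 tokens.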