Explanations
Intense expressions of dislike or hatred
Phrases related to strong negative emotions
Expressions of strong dislike or aversion, particularly focusing on the word "hate" and similar negative sentiments
Expressing dislike or aversion
Like or hate
Negative Logits
DockStyle        -0.82
WriteAttribute   -0.75
帖最后由          -0.71
GIVEREF          -0.66
клопе            -0.63
Personendaten    -0.61
{@               -0.60
متعلقه           -0.59
новништво        -0.59
)_/¯             -0.59
Positive Logits
being            0.84
being            0.72
Being            0.71
Being            0.63
BEING            0.60
losing           0.59
living           0.54
被人              0.54
incon            0.54
admitting        0.53
Activations Density 0.192%