Explanations
expressions of strong emotional responses and attitudes towards personal beliefs
Informal online communication
positive expressions and personal identity
Negative Logits
Token            Logit
autorytatywna    -0.74
=                -0.73
WaitGroup        -0.70
]-->             -0.68
AsUp             -0.64
;-)              -0.61
                 -0.60
;-)              -0.59
CHtml            -0.59
                 -0.57
Positive Logits
Token            Logit
ptid             0.68
🥺               0.67
idk              0.65
queer            0.65
abt              0.63
bc               0.61
uw               0.60
Idk              0.59
呜呜             0.57
lesbian          0.57
Activations Density 0.072%