Explanations
references to personal rights and freedoms in the context of privacy and expression
frustration or negativity
profanity and strong emotions
Negative Logits
])));        -0.62
>=",         -0.60
CopyWith     -0.57
محفوظة       -0.57
styleable    -0.56
"]);         -0.55
]            -0.55
ientôt       -0.53
/\.(         -0.53
towany       -0.53
Positive Logits
fucking      0.97
fuckin       0.86
goddamn      0.82
fucking      0.80
damn         0.80
FUCKING      0.77
dammit       0.75
shitty       0.75
Fucking      0.73
freakin      0.71
Activations Density 0.501%