Explanations
profanity and strong emotional expressions
NEGATIVE LOGITS
Token         Logit
كومونز        -0.69
তথ্যসূত্র        -0.68
ExtendWith    -0.66
lewati        -0.66
ductory       -0.65
Gale          -0.64
>");          -0.60
intptr        -0.60
ství          -0.60
,–            -0.59
POSITIVE LOGITS
Token         Logit
fucking       1.31
fuck          1.26
fucking       1.24
FUCK          1.19
goddamn       1.19
Fuck          1.18
Fucking       1.16
Fuck          1.16
bullshit      1.16
fuckin        1.15
Activations Density 0.201%
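
The density figure above can be read as the share of tokens on which the feature is active. The following is a minimal sketch of that arithmetic, not the dashboard's own computation: it assumes density means "fraction of sampled tokens with activation above zero", and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires; the page does not state the exact definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 1,000 tokens, with the feature firing on 2 of them.
sample = [0.0] * 998 + [3.2, 1.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.201% corresponds to the feature firing on roughly 2 of every 1,000 tokens, which fits a feature tied to a narrow token class such as profanity.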