Explanations
strings of strong profanity, especially the f-word.
angry quotes
Negative Logits
Token            Logit
afone            -0.71
fjspx            -0.71
تضيفلها          -0.70
enterOuterAlt    -0.67
kaarangay        -0.65
mergeFrom        -0.65
SharedCtor       -0.64
sizeCache        -0.62
kasarigan        -0.61
nities           -0.60
Positive Logits
Token            Logit
venons           0.62
voulons          0.49
GPL              0.48
industriel       0.46
necesitan        0.46
defiant          0.45
blijven          0.45
耳               0.44
帖最后由         0.44
ferons           0.44
Activations Density: 1.644%
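
The page does not define the density figure. One common reading is the fraction of sampled tokens on which the feature fires at all. The sketch below assumes that reading. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from the source.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation is strictly above `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. The source page does not state its exact definition.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 1 token out of 8.
toy = [0.0, 0.0, 0.0, 2.3, 0.0, 0.0, 0.0, 0.0]
print(f"{activation_density(toy):.3%}")  # prints 12.500%
```

Under this reading, a density of 1.644% would mean the feature is active on roughly 16 of every 1,000 sampled tokens.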