INDEX
Explanations
explicit and potentially offensive language
expressions of strong profanity or expletives
Negative Logits
NetMessage            -0.80
BIL                   -0.73
ItemThumbnailImage    -0.66
PsyNetMessage         -0.65
conclud               -0.65
membr                 -0.64
HCR                   -0.64
renewal               -0.63
rece                  -0.63
rouse                 -0.62
Positive Logits
holes                 1.15
hole                  1.10
lord                  0.95
bags                  0.92
fuck                  0.91
buster                0.89
nuts                  0.85
wit                   0.85
glers                 0.85
dump                  0.84
Activations Density: 0.014%