Explanations
profane language
references to profanity or vulgar language
Negative Logits
fman             -0.72
HCR              -0.67
PsyNetMessage    -0.66
Parables         -0.65
arb              -0.63
cffff            -0.61
EVA              -0.61
Expend           -0.60
tnc              -0.59
obser            -0.59
Positive Logits
loads            1.22
bags             1.19
storm            1.16
heads            1.10
lords            1.06
faced            1.06
bag              1.03
post             1.03
lord             0.99
detector         0.98
Activations Density: 0.045%