Explanations
negative opinions or comments in text
Negative Logits
ructure             -0.74
PTS                 -0.72
ilitation           -0.71
ETF                 -0.68
Applications        -0.68
ourses              -0.67
isSpecialOrderable  -0.67
consolidation       -0.67
onductor            -0.66
izons               -0.65
Positive Logits
boobs      1.13
hilar      1.12
poop       1.11
dudes      1.09
dick       1.08
gigg       1.07
goof       1.06
jokes      1.05
hilarious  1.04
piss       1.01
Activations Density 1.670%