INDEX
Explanations
email-related text, including subscription prompts and requests for verification
Negative Logits
mson        -0.72
Curve       -0.70
wagon       -0.69
anu         -0.69
andom       -0.66
Registered  -0.64
omon        -0.64
pex         -0.64
oret        -0.64
_>          -0.60
Positive Logits
journalism      0.69
embed           0.65
transcription   0.62
promotions      0.58
promotional     0.58
giveaways       0.57
Privacy         0.56
redistribution  0.55
domains         0.55
graphics        0.55
Activations Density 13.204%