INDEX
Explanations
references or prompts to engage with the comments section of an online platform
references to user comments and engagement prompts
Negative Logits
ISM           -0.88
relative      -0.81
agall         -0.73
Relative      -0.69
Ct            -0.67
cci           -0.66
Illegal       -0.65
NetMessage    -0.64
isen          -0.64
Definition    -0.64
Positive Logits
comments      0.80
section       0.77
below         0.76
pring         0.75
Gle           0.69
sections      0.69
below         0.69
favourites    0.67
antha         0.66
box           0.66
Activations Density 0.057%