INDEX
Explanations
phrases related to terms and conditions, such as opt-outs and agreements
sentences related to user consent and options for opting out
Negative Logits
Shades        -0.59
omorph        -0.57
Lauder        -0.57
igans         -0.55
zbollah       -0.54
Sons          -0.54
akov          -0.53
omers         -0.52
masc          -0.51
Cosponsors    -0.51
Positive Logits
rame          0.66
Subscribe     0.63
estine        0.61
asp           0.61
Submit        0.57
osed          0.57
You           0.57
Want          0.56
xp            0.55
interrupted   0.55
Activations Density: 0.015%