Explanations
phrases related to receiving updates or special offers
references to news and updates
Negative Logits
Token      Logit
eur        -0.79
avery      -0.75
verning    -0.74
rifice     -0.71
asus       -0.69
arton      -0.68
amina      -0.68
agall      -0.68
odor       -0.67
iac        -0.66
Positive Logits
Token           Logit
updates         0.86
ilver           0.85
notifications   0.84
Updates         0.79
update          0.77
mith            0.74
peak            0.72
notification    0.71
peed            0.71
olitan          0.71
Activations Density: 0.015%