Explanations
affiliate links within text
references to affiliate marketing or affiliate links
Negative Logits
STD      -0.81
pp       -0.80
ppo      -0.75
ests     -0.73
sen      -0.73
sa       -0.71
ft       -0.69
Quiet    -0.68
borgh    -0.68
boats    -0.66
Positive Logits
affiliate      1.39
iliate         1.29
affiliates     1.00
affiliated     0.80
affili         0.78
agre           0.77
iliated        0.76
affiliation    0.75
network        0.74
ende           0.74
Activations Density: 0.007%