Negative Logits
regulating      0.46
reg             0.46
regs            0.45
regulatory      0.42
pul             0.40
ario            0.40
ereg            0.39
tis             0.38
uttore          0.38
ACCESS          0.38
Positive Logits
advertisements  0.68
advertisement   0.65
ads             0.62
adverts         0.59
advertis        0.58
Ads             0.57
Advertisement   0.56
广告            0.55
Advert          0.53
advertise       0.53
Activations Density 0.000%