INDEX
Explanations
website features like social media buttons and loading indicators
references to content and engagement features on social media platforms
Negative Logits
istries     -0.73
ornia       -0.70
idity       -0.66
sic         -0.64
formance    -0.63
cannabin    -0.63
exempt      -0.62
rity        -0.61
berth       -0.61
incial      -0.60
Positive Logits
catentry    0.65
hetti       0.62
ãĤ¶         0.60
gest        0.60
ozy         0.60
iframe      0.59
spoiled     0.59
onz         0.59
ragon       0.59
devices     0.59
Activations Density 0.363%