Explanations
phrases related to advertisements or promotional content
references to "AD" followed by numerical values, which likely relate to specific identifiers or categories
Negative Logits
htt                 -0.72
assetsadobe         -0.69
Kubrick             -0.68
Stronghold          -0.64
ĸļ                  -0.61
externalToEVAOnly   -0.61
Trog                -0.60
Kus                 -0.58
tons                -0.58
Yiannopoulos        -0.57
Positive Logits
VERTIS              1.27
VERTISEMENT         1.23
MIN                 1.16
vantage             1.08
venture             1.02
DIS                 0.99
visor               0.99
DR                  0.98
iamond              0.96
ILY                 0.94
Activations Density: 0.026%