INDEX
    Explanations

    references to tobacco-related products and names

    Negative Logits
    isd           -0.17
    anth          -0.16
    ิเศษ          -0.15
    兼            -0.15
    jed           -0.15
    .Toolkit      -0.15
    ести          -0.14
    apr           -0.14
    urs           -0.14
    pty           -0.14
    Positive Logits
    ább           0.20
    acco          0.18
    hiba          0.18
    .LENGTH       0.16
    istrovství    0.16
    amak          0.16
     прит         0.15
    yo            0.15
    xic           0.15
    gay           0.14
    Activation Density: 0.021%

    No Known Activations