    Explanations

    ads or advertisement-related phrases

    Negative Logits
    Token            Logit
    "fruit"          -0.75
    "ĵĺ"             -0.74
    "terday"         -0.70
    " Ago"           -0.67
    " pity"          -0.66
    "¬¼"             -0.62
    " Sunshine"      -0.61
    " Stras"         -0.60
    "chnology"       -0.60
    " Constantin"    -0.60
    Positive Logits
    Token            Logit
    "rill"           1.31
    "ouble"          1.28
    "irect"          1.26
    "icts"           1.25
    "ragon"          1.25
    "itions"         1.24
    "der"            1.20
    "iamond"         1.19
    "aily"           1.18
    "ifferent"       1.17
    Activation Density: 2.018%

    No Known Activations