INDEX
    Explanations

    phrases related to articles and advertisements

    Head Attr Weights
    0:0.02
    1:0.01
    2:0.32
    3:0.07
    4:0.11
    5:0.03
    6:0.13
    7:0.06
    8:0.04
    9:0.05
    10:0.06
    11:0.05
    Negative Logits
    "owship"      -1.57
    "quartered"   -1.54
    "arent"       -1.52
    "oided"       -1.44
    "ooting"      -1.42
    "reet"        -1.38
    "pless"       -1.36
    "esville"     -1.34
    "ezvous"      -1.33
    "asted"       -1.32
    Positive Logits
    " Whats"       1.46
    " 裏�"         1.37
    "Accessory"    1.36
    "cmp"          1.35
    "�醒"          1.33
    "968"          1.32
    " 802"         1.28
    " WhatsApp"    1.27
    " cannabin"    1.27
    "ilon"         1.25
    Act Density 0.003%

    No Known Activations