INDEX
    Explanations
    Negative Logits
    Token          Logit
    " Debian"      -0.10
    " irrig"       -0.09
    " Ottawa"      -0.08
    " 정책"         -0.08
    ".tiles"       -0.08
    " biomass"     -0.08
    " деревян"     -0.08
    " Manitoba"    -0.08
    " Moodle"      -0.08
    " ischem"      -0.08
    Positive Logits
    Token            Logit
    " celebrity"     0.24
    " celebrities"   0.22
    " Celebrity"     0.21
    "Celebrity"      0.21
    " glamour"       0.18
    " glamorous"     0.17
    " papar"         0.16
    "明星"            0.16
    " célé"          0.15
    " Cele"          0.15
    Activation Density: 0.258%

    No Known Activations