INDEX
    Explanations
    NEGATIVE LOGITS
    " Belfast"       -0.07
    " paranormal"    -0.07
    " Ferdinand"     -0.06
    "cassert"        -0.06
    " sincerity"     -0.06
    "ılım"           -0.06
    "女人"           -0.06
    "ün"             -0.06
    " червня"        -0.06
    " Blonde"        -0.06
    POSITIVE LOGITS
    " to"    0.26
    " TO"    0.22
    " To"    0.21
    "To"     0.18
    "-to"    0.17
    "to"     0.17
    "_to"    0.16
    "—to"    0.14
    "-To"    0.13
    "TO"     0.12
    Activation Density: 1.719%

    No Known Activations