    Negative Logits
    Token               Logit
    "_intersection"     -0.07
    "psilon"            -0.06
    " Zhu"              -0.06
    " technological"    -0.06
    ".EqualTo"          -0.06
    "ilight"            -0.06
    " obliv"            -0.06
    " enamel"           -0.06
    " Just"             -0.06
    "item"              -0.06
    Positive Logits
    Token               Logit
    " Alfred"           0.07
                        0.06
    "-Петерб"           0.06
    " Hob"              0.06
    " Freud"            0.06
                        0.06
    " wed"              0.06
    "BH"                0.06
                        0.06
    " Assume"           0.06
    Activation Density: 0.001%

    No Known Activations