    Negative Logits
    Reports
    -0.07
    born
    -0.07
    -abortion
    -0.07
     unacceptable
    -0.06
    نامه
    -0.06
     ως
    -0.06
    -0.06
     نامه
    -0.06
    _sp
    -0.06
     якої
    -0.06
    Positive Logits
    0.07
    0.07
     psyched
    0.06
     CascadeType
    0.06
     víde
    0.06
    UID
    0.06
    시아
    0.06
    ain
    0.06
    OWER
    0.06
     Antoine
    0.06
    Act Density 0.014%

    No Known Activations