INDEX
    Explanations
    Negative Logits
    Token         Logit
    "vero"        -0.16
    "fram"        -0.14
    " Fleming"    -0.14
    "otts"        -0.14
    "alach"       -0.13
    " Amar"       -0.13
    " Rus"        -0.13
    " Justice"    -0.13
    "atis"        -0.13
    " Origins"    -0.13
    Positive Logits
    Token         Logit
    "725"         0.15
    "ynet"        0.15
    "body"        0.15
    "lád"         0.15
    "548"         0.14
    "opup"        0.14
    "aket"        0.14
    "ětí"         0.14
    "ATAB"        0.14
    "div"         0.14
    Activation Density: 0.036%

    No Known Activations