INDEX
    Explanations
    Negative Logits
    Token            Logit
    " tua"           -0.07
    "interpreter"    -0.07
    ".picture"       -0.06
    "storybook"      -0.06
    " робот"         -0.06
    " eag"           -0.06
    "tera"           -0.06
    "رى"             -0.06
    " всех"          -0.06
    ".sig"           -0.06
    Positive Logits
    Token            Logit
    " instit"        0.08
    " Scandin"       0.08
    " Phill"         0.07
    "Brun"           0.07
    " había"         0.06
    (blank)          0.06
    " Punch"         0.06
    " acl"           0.06
    "cstdlib"        0.06
    " Partition"     0.06
    Activation Density: 0.045%

    No Known Activations