INDEX
    NEGATIVE LOGITS
    Token       Logit
    "achi"      -0.80
    "rises"     -0.75
    "pered"     -0.74
    "ulus"      -0.74
    "ramid"     -0.73
    "wn"        -0.71
    "alsa"      -0.69
    "ptoms"     -0.67
    "si"        -0.67
    "ffff"      -0.66
    POSITIVE LOGITS
    Token       Logit
    "editor"    1.14
    " editor"   1.09
    " Editor"   0.87
    "ials"      0.83
    "ially"     0.81
    " Editors"  0.76
    " editing"  0.75
    " editors"  0.74
    " drawer"   0.74
    " pane"     0.73
    Activation Density: 0.011%

    No Known Activations