INDEX
    Negative Logits
    "indexes"       -0.08
    "aisy"          -0.08
    "-un"           -0.08
    " activates"    -0.08
    " 선언"          -0.07
    " capacit"      -0.07
    "flower"        -0.07
    "_Reset"        -0.07
    " aktif"        -0.07
    " Capac"        -0.07
    POSITIVE LOGITS
    0.08
     Fern
    0.08
     Nc
    0.08
    0.08
     Fargo
    0.08
    ummar
    0.08
     cort
    0.08
     mods
    0.08
     Newcastle
    0.07
     munsi
    0.07
    Act Density 0.002%

    No Known Activations