    Negative Logits
    Token      Logit
    GROUND     -0.73
    wark       -0.69
    cipline    -0.67
    RECT       -0.63
    FORE       -0.62
    selves     -0.62
    arers      -0.62
    CHO        -0.61
    keeper     -0.60
    cemic      -0.60
    Positive Logits
    Token      Logit
    imus       1.53
    imil       1.35
    imize      1.31
    imal       1.13
    ima        1.07
    imo        0.99
    imates     0.95
    ime        0.94
    imen       0.93
    ims        0.92
    Act Density 0.015%

    No Known Activations