INDEX
    Negative Logits
    Token         Logit
     Zd           -0.07
     homem        -0.07
    aku           -0.07
    -cl           -0.06
     cylinder     -0.06
                  -0.06
     strokes      -0.06
     indign       -0.06
     Ble          -0.06
     Unary        -0.06
    Positive Logits
    Token         Logit
    bote          0.07
    (range        0.07
    coder         0.07
    ".↵↵↵↵        0.06
    .Chat         0.06
    ");↵↵↵        0.06
    aligned       0.06
                  0.06
    _altern       0.06
    ↵			↵         0.06
    Act Density 0.049%

    No Known Activations