INDEX
    Explanations
    NEGATIVE LOGITS
    Token            Logit
    " vault"         -0.06
    " IS"            -0.06
    "-Out"           -0.06
    "_repository"    -0.06
    " ARE"           -0.06
    "624"            -0.06
    "آم"             -0.06
    " bowed"         -0.06
    "атар"           -0.06
    "ALK"
    POSITIVE LOGITS
    Token            Logit
    " de"            0.09
    "dm"             0.07
    ".De"            0.07
    ".Di"            0.07
    " d"             0.07
    ".design"        0.07
    " von"           0.07
    " диагности"     0.07
    " del"           0.07
    " from"          0.06
    Activation Density: 0.063%

    No Known Activations