INDEX
    Explanations
    No Explanations Found
    Negative Logits
        Token             Value
        "COMMAND"         0.29
        "lidir"           0.29
        " μου"            0.28
        (token missing)   0.28
        "nobyl"           0.28
        "Kamu"            0.28
        "soever"          0.27
        "mim"             0.27
        "τοι"             0.27
        " بھی"            0.27
    Positive Logits
        Token             Value
        " чтобы"          0.32
        "жа"              0.32
        "э"               0.32
        " what"           0.31
        " a"              0.29
        " the"            0.29
        " лишь"           0.29
        " that"           0.29
        " an"             0.29
        " devoid"         0.29
    Activation Density: 0.003%

    No Known Activations