    NEGATIVE LOGITS
    Token         Value
    "m"           0.91
    "م"           0.84
    " "           0.82
    "h"           0.71
    "hst"         0.70
    "м"           0.68
    "hm"          0.67
    "N"           0.67
    "_"           0.66
    "DING"        0.65
    POSITIVE LOGITS
    Token         Value
    "ifício"      0.93
    " болезнь"    0.89
    "ргә"         0.89
    " ninguém"    0.89
    " Эр"         0.89
    "ных"         0.86
    " энерги"     0.85
    "ným"         0.85
    " йөк"        0.85
    " принять"    0.85
    Activation Density: 0.000%

    No Known Activations