    Negative Logits
     polícia  0.66
     услуг  0.63
     ως  0.63
     въ  0.61
     tử  0.60
    isions  0.59
     його  0.58
     khá  0.58
     బయ  0.58
    ä  0.57
    Positive Logits
    י  1.09
    ه  0.99
    م  0.98
    ي  0.95
    P  0.94
    ب  0.94
    0.93
    H  0.91
     Foundation  0.89
    ק  0.89
    Activation Density: 0.001%

    No Known Activations