    Negative Logits
    " (("          0.46
    " envisions"   0.46
    " namun"       0.45
    " نگه"         0.45
    " $:"          0.44
    "ist"          0.44
    ":"            0.43
    " ingin"       0.42
    " Warum"       0.42
    " implique"    0.42
    Positive Logits
    ""             0.39
    " እንዲሁም"       0.38
    " ασ"          0.37
    ""             0.37
    "тун"          0.36
    "父母"         0.36
    " suốt"        0.36
    "ру"           0.36
    "ρο"           0.36
    "SHOP"         0.36
    Activation Density 0.412%

    No Known Activations