    Explanations

    system/component descriptions

    Negative Logits
     کی۔    0.43
        0.38
     في    0.36
     Δη    0.36
     فی    0.34
    في    0.34
    َرْ    0.34
     گئی۔    0.34
    пи    0.34
    သည်။    0.33
    Positive Logits
     имеют    0.43
     worden    0.43
    都被    0.41
    Are    0.41
        0.41
     interag    0.39
     are    0.39
     interact    0.39
    Interact    0.39
     interacts    0.38
    Act Density 0.049%

    No Known Activations