INDEX
    Explanations

    role and involvement in

    Negative Logits
    ວ່າ  0.39
    sufficient  0.38
    شاعرانه  0.35
     చేసుకు  0.35
     বাগের  0.34
    Ligações  0.34
     kwamba  0.33
    pug  0.33
    ίδα  0.33
     basada  0.33
    Positive Logits
    ในการ  1.52
    ក្នុងការ  0.98
     katika  0.89
     فى  0.86
     în  0.86
     във  0.82
     dalam  0.80
     trong  0.79
    ใน  0.79
     in  0.73
    Act Density 0.021%

    No Known Activations