INDEX
    Explanations

    explaining, covering, or deciding

    Negative Logits
     nadal  0.16
    0.14  (token missing in source)
     جوړونکو  0.14
     kesalahan  0.14
     يساعد  0.13
     wych  0.13
     ด่า  0.13
     لگاتا  0.13
    برى  0.13
     varieties  0.13
    Positive Logits
     will  0.21
     decided  0.21
     apologize  0.21
     opted  0.19
     realize  0.18
     noticed  0.18
     chose  0.18
     consulted  0.18
     realise  0.18
     need  0.18
    Activation Density: 0.039%

    No Known Activations