    Explanations

    questions and exclamations

    Negative Logits
    " fournir"  0.36
    "OR"  0.35
    " deberán"  0.33
    " 효율"  0.33
    " слот"  0.33
    " निर्देशों"  0.32
    "需要在"  0.31
    " বিশ্লেষণের"  0.31
    " मूल्यों"  0.31
    " संसाधनों"  0.31
    Positive Logits
    " kenapa"  0.43
    " 무슨"  0.41
    " unwell"  0.41
    " why"  0.38
    "؟"  0.38
    " dared"  0.38
    " something"  0.38
    " acaso"  0.37
    "为什么"  0.37
    "why"  0.36
    Activation Density: 0.116%

    No Known Activations