INDEX
    NEGATIVE LOGITS
    Token        Value
    "하거나"     0.33
    " 专业"      0.32
    " பற்றிய"    0.31
    " 파일을"    0.29
    " hoặc"      0.28
    " ಅಥವಾ"     0.28
    " alebo"     0.27
    " зміню"     0.27
    " किंवा"     0.27
    " খাবার"     0.26
    POSITIVE LOGITS
    Token        Value
    "认为"       0.45
    " believes"  0.42
    "認為"       0.42
    " says"      0.40
    " believe"   0.39
    " argue"     0.38
    " argues"    0.38
    " stwier"    0.37
    " suggests"  0.37
    "believe"    0.37
    Activation Density: 0.103%

    No Known Activations