INDEX
    Explanations
    Negative Logits
     sympathy       -0.08
     hend           -0.08
     कभी            -0.08
     sympathetic    -0.08
     disinfect      -0.08
     triste         -0.08
     القرن          -0.07
     குட            -0.07
    िनी             -0.07
     misery         -0.07
    Positive Logits
     assumed        0.13
     Assuming       0.12
     предполага     0.11
     presumed       0.11
     vermutlich     0.11
     assum          0.10
     Assume         0.10
     presumably     0.10
     assume         0.10
     предп          0.10
    Act Density 0.053%

    No Known Activations