INDEX
    Explanations
    Negative Logits
    ર્�
    -0.09
    ાવ
    -0.08
    ાવતા
    -0.08
    ાવે
    -0.08
    जार
    -0.08
     बताते
    -0.07
     રોક
    -0.07
    ીસ
    -0.07
     મેળ
    -0.07
    -0.07
    Positive Logits
     biting
    0.08
    india
    0.08
     gefällt
    0.08
    (ii
    0.08
     ii
    0.08
     Depends
    0.07
    자인
    0.07
     Congress
    0.07
     guessed
    0.07
     było
    0.07
    Act Density 0.035%

    No Known Activations