INDEX
    Explanations

    discrimination

    NEGATIVE LOGITS
    Token                 Logit
    ">Add"                -0.06
    (token missing)       -0.06
    "जब"                  -0.06
    " Lear"               -0.06
    "іла"                 -0.06
    " دع"                 -0.06
    "led"                 -0.06
    " Intent"             -0.06
    " entertaining"       -0.06
    "est"                 -0.06

    POSITIVE LOGITS
    Token                 Logit
    " strive"             0.08
    "-form"               0.06
    "football"            0.06
    "Axes"                0.06
    " unwitting"          0.06
    " Faul"               0.06
    "ній"                 0.06
    " germany"            0.06
    "expired"             0.06
    "'value"              0.06
    Act Density 0.058%

    No Known Activations