    Negative Logits
    Token          Logit
    " admire"      -0.08
    "earn"         -0.08
    " convert"     -0.07
    " tame"        -0.07
                   -0.07
    " Tap"         -0.07
    "ுதல்"          -0.07
    "hots"         -0.07
    "oting"        -0.07
    "peg"          -0.06
    Positive Logits
    Token          Logit
    " 사항"         0.11
    "사항"          0.10
    " Thornton"    0.08
    " comercial"   0.08
    " macam"       0.08
    " resolver"    0.08
    " Andrade"     0.08
    " fiscal"      0.08
    " angu"        0.07
    " gover"       0.07
    Activation Density: 0.004%

    No Known Activations