INDEX
    Explanations
    Negative Logits
    Token          Logit
    " clearly"     -0.08
    " reportedly"  -0.07
    "()("          -0.07
    " tends"       -0.07
    " accom"       -0.07
    " আর"          -0.07
    " امریکا"      -0.07
    "Handlers"     -0.07
    " vistas"      -0.07
    "Library"      -0.07
    Positive Logits
    Token          Logit
    " देंगे"         0.09
    " करेंगे"        0.09
    " gloss"       0.08
    " Gloss"       0.08
    "્યાન"          0.08
    " Claim"       0.08
    " molest"      0.08
    " demikian"    0.08
    " puzzled"     0.08
    " देगा"         0.08
    Activation Density: 0.002%

    No Known Activations