INDEX
    Explanations
    Negative Logits
     adequate     -0.08
     distal       -0.07
     extrusion    -0.07
     empre        -0.07
     SD           -0.07
     liberdade    -0.07
     escapes      -0.07
    peer          -0.07
    percent       -0.07
     pred         -0.07
    Positive Logits
    ायण           0.09
     никогда      0.08
     ποτέ         0.08
    /respond      0.08
     μαζί         0.08
    clos          0.08
    /on           0.08
    ्टर           0.08
    -toggler      0.08
    ейс           0.07
    Act Density 0.005%

    No Known Activations