    Negative Logits
     CLASS       0.31
    "            0.29
     derivados   0.28
     Raising     0.28
     جوړونکي     0.28
    OO           0.27
    途径         0.27
     MAY         0.26
     NOTE        0.26
    ريب          0.26
    Positive Logits
     despite      0.66
    despite       0.66
     trotz        0.64
     Despite      0.61
    Despite       0.61
     несмотря     0.59
    pesar         0.55
     being        0.55
     nonostante   0.53
     apesar       0.53
    Activation Density: 0.002%

    No Known Activations