    Negative Logits
    Token          Logit
    " sopr"        -0.77
    " „,"          -0.73
    " milano"      -0.72
    " batte"       -0.72
    " tramont"     -0.71
    " isabel"      -0.70
    " dci"         -0.70
    " prado"       -0.69
    " prega"       -0.69
    " secon"       -0.69
    Positive Logits
    Token          Logit
    " but"         1.13
    "but"          1.00
    " But"         0.89
    "But"          0.87
    " nhưng"       0.87
    " BUT"         0.82
    " pero"        0.79
    "BUT"          0.75
    " lakini"      0.74
    " แต่"         0.74
    Activation Density: 0.767%

    No Known Activations