    NEGATIVE LOGITS
    Logit   Token
    -3.55   "</b>"
    -2.50   " The"
    -2.44   "↵↵"
    -2.44   "h"
    -2.41   " to"
    -2.31   " I"
    -2.30   " is"
    -2.27   " you"
    -2.19   " de"
    -2.14   " herhangi"
    POSITIVE LOGITS
    Logit   Token
     2.88   (not shown)
     2.75   (not shown)
     2.72   (not shown)
     2.67   (not shown)
     2.59   " 油画"
     2.56   " 花朵"
     2.56   (not shown)
     2.55   " cláss"
     2.50   "نوع"
     2.50   "١"
    Activation Density: 0.009%

    No Known Activations