    NEGATIVE LOGITS
    Token        Logit
    " Ihres"     0.97
    " моих"      0.84
    "!"          0.82
    "ของคุณ"     0.82
    " Gonna"     0.80
    "M"          0.79
    " моего"     0.78
    " моей"      0.78
    "H"          0.72
    "*."         0.72
    POSITIVE LOGITS
    Token         Logit
    " although"   2.26
    " despite"    1.96
    "虽然"        1.91
    " while"      1.89
    "although"    1.79
    " iako"       1.76
    " apesar"     1.72
    "雖然"        1.72
    " eftersom"   1.71
    " poiché"     1.71
    Activation Density: 0.047%

    No Known Activations