INDEX
    Explanations

    contrasting conjunctions

    Negative Logits
    "但是我"  0.30
    "ون"  0.29
    " mutta"  0.27
    " grosseur"  0.27
    "،"  0.27
    " Nhưng"  0.27
    "けど"  0.26
    " nhưng"  0.26
    " veoma"  0.26
    "的代码"  0.25
    Positive Logits
    " it"  0.43
    " there"  0.33
    (no token shown)  0.28
    "There"  0.27
    "there"  0.27
    " 이는"  0.25
    " ისინი"  0.25
    "随着"  0.24
    " itp"  0.24
    "它可以"  0.24
    Activation Density: 0.453%

    No Known Activations