INDEX
    Explanations

    conjunctions and phrases indicating contrasts or comparisons

    Negative Logits
    Yet
    -0.18
     Yet
    -0.18
     chua
    -0.17
    thers
    -0.17
     HOWEVER
    -0.15
     yet
    -0.15
     elsewhere
    -0.14
    vice
    -0.14
     meanwhile
    -0.14
     GE
    -0.14
    Positive Logits
     بلکه
    0.27
     sino
    0.25
     sondern
    0.24
    ampo
    0.17
     but
    0.17
    мор
    0.16
     että
    0.15
    lijah
    0.15
    odelist
    0.14
    .TestTools
    0.14
    Act Density 0.021%

    No Known Activations