    Explanations

    phrases indicating contrasting or conditional relationships

    Negative Logits
     itself
    -0.16
     nữa
    -0.14
     notamment
    -0.14
    iyim
    -0.14
     себя
    -0.14
     então
    -0.14
     quindi
    -0.14
     yani
    -0.14
    izzling
    -0.13
     otherwise
    -0.13
    Positive Logits
     although
    0.45
     while
    0.42
     whereas
    0.37
     despite
    0.37
     unlike
    0.36
    although
    0.34
     since
    0.33
    while
    0.33
     when
    0.30
     unless
    0.30
    Act Density 0.930%

    No Known Activations