INDEX
    Explanations

    comparative phrases and contrasts

    NEGATIVE LOGITS
    <?>>          -0.14
    該            -0.14
    obot          -0.14
    .opend        -0.13
    zin           -0.13
    orris         -0.12
    WN            -0.12
    WidgetItem    -0.12
     rundown      -0.12
     oltre        -0.12
    POSITIVE LOGITS
     вот          0.23
     Conversely   0.21
     whereas      0.21
     convers      0.21
     counterpart  0.20
     Whereas      0.20
     naopak       0.20
     же           0.19
     أما          0.18
    对于          0.18
    Act Density 0.296%

    No Known Activations