INDEX
    Explanations

    instances of the word "isn't" and its variations, indicating negation or contradiction

    Negative Logits
    زاÙĨ
    -0.14
     spared
    -0.14
    lys
    -0.14
    wid
    -0.14
    ownik
    -0.13
    znam
    -0.13
    è͵
    -0.13
     Wid
    -0.13
     âľĶ
    -0.13
     ãĢIJ
    -0.13
    Positive Logits
    oad
    0.16
     DISCLAIM
    0.16
    Labels
    0.16
    aaaaaaaa
    0.16
    keh
    0.15
    âĨIJ
    0.15
    acci
    0.15
    igin
    0.14
    oha
    0.14
    undry
    0.14
    Act Density 0.137%

    No Known Activations