    Explanations

    negative phrases and contradictions in statements

    Negative Logits
    Token       Logit
    " btw"      -0.17
    "ache"      -0.16
    "but"       -0.15
    "ACION"     -0.15
    "oop"       -0.15
    " but"      -0.15
    "acher"     -0.14
    "不过"      -0.14
    "uft"       -0.14
    "305"       -0.14
    Positive Logits
    Token           Logit
    "↵↵"            0.19
    "WindowState"   0.16
    "NodeType"      0.15
    "BorderStyle"   0.15
    " ones"         0.15
    "unate"         0.15
    "że"            0.15
    "SPATH"         0.15
    "それは"        0.14
    " actual"       0.14
    Activation Density 0.393%

    No Known Activations