    Explanations

    instances of negation or contradiction in statements

    Negative Logits
    " Gro"                     -0.15
    "aily"                     -0.14
    "åĶĩ"                      -0.14
    "ERVE"                     -0.14
    "ICA"                      -0.14
    "ursive"                   -0.14
    "ÙĬب"                      -0.14
    "agle"                     -0.14
    "ashtra"                   -0.14
    "Gro"                      -0.14

    Positive Logits
    "unde"                      0.17
    "apa"                       0.15
    "trace"                     0.14
    "scopes"                    0.14
    "UIApplicationDelegate"     0.14
    "bir"                       0.14
    "696"                       0.14
    " Trace"                    0.14
    "burgh"                     0.14
    "anza"                      0.14

    Activation Density: 0.002%

    No Known Activations