INDEX
    Explanations

    phrases indicating exceptions or notable differences in discussions

    Negative Logits
    "borough"    -0.18
    "pis"        -0.15
    "uld"        -0.14
    "ako"        -0.14
    "ally"       -0.14
    "-at"        -0.14
    "ochen"      -0.13
    "aland"      -0.13
    "lando"      -0.13
    "oš"         -0.13
    Positive Logits
    " exception"     1.21
    " exceptions"    1.13
    " Exceptions"    0.98
    "exceptions"     0.93
    " except"        0.82
    "exception"      0.81
    " EXCEPTION"     0.79
    "Exceptions"     0.77
    " Exception"     0.75
    "except"         0.74
    Act Density 0.178%

    No Known Activations
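
The Act Density figure above is conventionally read as the fraction of token positions on which the feature fires. The following is a minimal sketch of that calculation, not the dashboard's actual implementation: the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" when its activation is strictly
    greater than `threshold` (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 2 of 1000 token positions.
sample = [0.0] * 998 + [0.7, 1.3]
print(f"{activation_density(sample):.3%}")  # prints 0.200%
```

Under this reading, a density of 0.178% would mean the feature is active on roughly 18 of every 10,000 tokens.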