INDEX
    Explanations

    phrases indicating ambiguity or uncertainty about specific criteria or definitions

    Negative Logits
    Token         Logit
    "loh"         -0.18
    "nees"        -0.14
    "iset"        -0.14
    "ican"        -0.14
    "deal"        -0.14
    "lero"        -0.14
    "toc"         -0.14
    "sea"         -0.14
    "/parser"     -0.14
    "sep"         -0.14
    Positive Logits
    Token         Logit
    "_DECLARE"    0.16
    "ใด"          0.16
    " particular" 0.15
    "ingle"       0.15
    "ripper"      0.14
    "_specific"   0.14
    "icit"        0.14
    "-specific"   0.14
    "_PROTO"      0.14
    " pigeon"     0.14
    Act Density 0.080%
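
    The page does not define how the Act Density figure above is computed. A minimal sketch, assuming it is the fraction of token positions on which the feature's activation is above zero; the function name, the threshold parameter, and the sample data are all hypothetical and are not taken from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 8 of which activate the feature.
sample = [0.0] * 9992 + [0.3] * 8
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

    Under that assumption, a density of 0.080% would correspond to roughly 8 active positions per 10,000 tokens, which is consistent with a sparse feature.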

    No Known Activations