INDEX
    Explanations

    occurrences of negation or absence in various contexts

    Negative Logits
    Token        Logit
    "anes"       -0.14
    "emi"        -0.14
    "hazi"       -0.14
    " ("         -0.13
    "iaz"        -0.13
    " sext"      -0.13
    "/posts"     -0.13
    "fal"        -0.13
    " sund"      -0.13
    " Meyer"     -0.13

    Positive Logits
    Token        Logit
    "ildo"       0.16
    "kea"        0.15
    "okable"     0.15
    "ãĤ§"        0.14
    "akit"       0.14
    "ãĥ¼ãĥĨ"     0.14
    "¶ģ"         0.14
    "人æ°Ĺ"      0.14
    "375"        0.14
    ".Sdk"       0.14

    Activation Density: 0.009%

    No Known Activations