    Explanations

    identifiers after quotes or parens

    Negative Logits
     nowhere
    -0.11
    @
    -0.10
    ::::::
    -0.10
    (
    -0.10
     Guy
    -0.10
    .ribbon
    -0.09
    .jpg
    -0.09
     orm
    -0.09
    /
    -0.09
     Lorem
    -0.09
    Positive Logits
     my
    0.21
     My
    0.17
     example
    0.15
     test
    0.14
    My
    0.14
    my
    0.14
    \tmy
    0.14
    我的
    0.14
    example
    0.13
    nameof
    0.12
    Act Density 0.108%

    No Known Activations