    Explanations

    quoted strings and their variations in format

    NEGATIVE LOGITS
    Token       Logit
    adays       -0.20
    {},         -0.19
    {}.         -0.18
    @@@@        -0.18
    {}]         -0.17
    {}'.        -0.17
    {"          -0.16
    wards       -0.16
    '           -0.16
    -a          -0.15
    POSITIVE LOGITS
    Token       Logit
    %%          0.17
    ..."↵       0.16
    .*          0.16
    adora       0.16
    ."↵↵        0.16
    uez         0.16
    jes         0.15
    ModelError  0.15
    !"↵         0.15
     "↵         0.15
    Activation Density: 0.125%

    No Known Activations