INDEX
    Explanations

    asterisks as bullet points

    Negative Logits
     \...      0.91
               0.91
    </strong>  0.83
     $$\       0.81
     \""       0.79
     ।,        0.79
     (…)       0.78
    !!");      0.78
               0.77
               0.76
    Positive Logits
    *   5.34
     *  4.21
    *.  3.87
    *,  3.81
    .*  3.50
    *'  3.50
    *:  3.41
    ,*  3.37
    *$  3.35
    *"  3.34
    Act Density 3.161%

    No Known Activations