INDEX
    Explanations

    numeric values enclosed in square brackets

    the presence of numerical sequences or identifiers

    Negative Logits
     tremend    -0.92
    olicy       -0.89
    atre        -0.85
    uppet       -0.83
    isode       -0.81
    enhagen     -0.81
    orter       -0.78
    ossession   -0.78
    rint        -0.77
    rison       -0.77
    Positive Logits
    384         1.30
    6666        1.19
    th          0.94
    650         0.86
    teen        0.83
    07          0.82
    66666666    0.82
    05          0.82
    340         0.81
    06          0.81
    Act Density 0.035%

    No Known Activations