INDEX
    Explanations

    dialogue followed by punctuation

    Negative Logits
    Token           Logit
    <h2>            2.05
    ).\\            1.61
    $.\\            1.59
    ’।              1.59
    》。            1.46
    <blockquote>    1.44
    ’.              1.40
    </h2>           1.34
    <h3>            1.31
    }.}             1.29
    Positive Logits
    Token           Logit
    ,"              3.46
    ),"             3.15
    ,''             2.97
    ,”              2.94
    ],"             2.91
    ,'"             2.86
    ,”              2.75
    ,\"             2.75
    ),”             2.67
    ,'              2.53
    Act Density 0.031%

    No Known Activations