INDEX
    Explanations

    punctuation marks and the beginning of sentences

    Negative Logits
    ↵↵              -0.71
    ↵↵↵             -0.57
    ↵↵↵↵            -0.55
    <eos>           -0.53
     ‘              -0.51
                    -0.50
                    -0.48
     or             -0.48
    cy              -0.47
    </strong>       -0.46
    Positive Logits
     Paglinawan     0.95
    Diwedd          0.93
     transfieras    0.89
    AutoScaleMode   0.85
    <bos>           0.84
     allAfrica      0.84
    IUrlHelper      0.80
     purpoſe        0.80
     chré           0.77
     Савезне        0.77
    Act Density 0.346%

    No Known Activations