INDEX
    Explanations

    snippets from different kinds of documents

    part, aspect, beginning

    Negative Logits
    <unused61>      -1.08
    <unused62>      -1.06
    <unused63>      -1.03
    (blank token)   -1.00
    1               -0.98
    <eos>           -0.95
    (blank token)   -0.94
    <unused60>      -0.94
    ...             -0.93
    ↵↵              -0.91
    Positive Logits
     Theſe          2.00
     Monfieur       1.90
     Efq            1.88
     myſelf         1.84
     itſelf         1.71
     Мексичка       1.59
     purpoſe        1.58
     Anſ            1.56
     ſeveral        1.55
     Jefus          1.54
    Activation Density: 12.738%

    No Known Activations