INDEX
    Explanations

    option numbering or headings

    NEGATIVE LOGITS
    off   0.72
    net   0.70
    not   0.64
    them  0.64
    mat   0.63
    no    0.63
    to    0.62
    nya   0.62
    talk  0.61
    time  0.61
    POSITIVE LOGITS
    <unused2162>  0.66
    <unused1753>  0.64
     Méd          0.62
    testAvg       0.61
    Мар           0.61
     Prensa       0.61
    <unused2143>  0.59
    <unused986>   0.59
    <unused989>   0.58
                  0.58
    Act Density 0.065%

    No Known Activations