    Explanations

    occurrences of quantifiers and references to groups or quantities

    Negative Logits
    Token            Logit
    " myſelf"        -1.65
    " itſelf"        -1.53
    " ſind"          -1.50
    "^(@)"           -1.46
    " Monfieur"      -1.46
    " iſt"           -1.45
    " Anſ"           -1.43
    "ſelves"         -1.42
    " ―――――"         -1.42
    " дописавши"     -1.41

    Positive Logits
    Token            Logit
    "<eos>"           0.98
    ","               0.89
    "."               0.84
                      0.83
    "-"               0.83
    " and"            0.83
    " in"             0.82
    " of"             0.82
    " for"            0.79
    " ("              0.75

    Activation Density: 0.830%

    No Known Activations