    Explanations

    references to quantities and articles in various contexts

    Negative Logits
     Monfieur            -1.07
     Theſe               -1.05
     Efq                 -1.03
     unknownFields       -0.94
     iconTwitter         -0.91
     autorytatywna       -0.91
    PreferredItem        -0.91
    ftagPool             -0.89
     Majefty             -0.89
     disambiguazione     -0.87
    Positive Logits
     a                   0.67
     “                   0.51
     unique              0.51
     ‘                   0.51
     an                  0.50
    <eos>                0.47
     A                   0.46
     уника               0.42
     B                   0.41
     our                 0.41
    Act Density 0.336%

    No Known Activations