INDEX
    Explanations

    expressions indicating newness or beginner status

    Negative Logits
     “              -0.53
    roch            -0.47
     "              -0.45
     ir             -0.43
                    -0.43
     prí            -0.43
     en             -0.41
     zu             -0.40
     bij            -0.40
     ti             -0.39
    Positive Logits
     myſelf         1.28
     Shakspeare     1.06
     Efq            0.96
     themſelves     0.93
     Wikimédia      0.90
     reaſon         0.90
     himſelf        0.89
     Monfieur       0.87
     houſe          0.86
    EndContext      0.86
    Activation Density: 0.407%

    No Known Activations