    Explanations

    the word "and" preceded or followed by a number, a preposition, or an article

    Negative Logits
    Token             Logit
    <bos>             -0.88
    Personensuche     -0.78
    ↵↵                -0.76
     the              -0.75
    "                 -0.66
     at               -0.66
                      -0.66
     I                -0.64
                      -0.62
     for              -0.62
    Positive Logits
    Token             Logit
     Efq              1.70
     myſelf           1.45
     itſelf           1.41
     Jefus            1.38
     Reſ              1.37
     himſelf          1.37
     Eſ               1.36
     Theſe            1.36
     raiſ             1.35
     Anſ              1.35
    Activation Density: 0.691%

    No Known Activations