INDEX
    Explanations

    the word "In" alone on a line, which is a quirk of the data format

    Negative Logits
    Token          Logit
    " myſelf"      -1.97
    " itſelf"      -1.96
    " Efq"         -1.94
    " Monfieur"    -1.92
    " Jefus"       -1.80
    " Theſe"       -1.77
    " pleaſure"    -1.74
    " becauſe"     -1.73
    " Inſ"         -1.70
    " purpoſe"     -1.68
    Positive Logits
    Token          Logit
    " in"          2.52
    " on"          1.19
    " dalam"       1.08
    " at"          1.05
    ","            1.02
    " is"          0.97
    " for"         0.96
    " в"           0.94
    " to"          0.93
    " as"          0.91
    Activation Density 1.564%

    No Known Activations