    Negative Logits
    Token             Logit
    " myſelf"         -1.50
    " Efq"            -1.44
    "protoimpl"       -1.42
    " Monfieur"       -1.37
    " purpoſe"        -1.28
    " Theſe"          -1.27
    " Jefus"          -1.27
    " himſelf"        -1.27
    " pleaſure"       -1.26
    " snippetHide"    -1.20
    Positive Logits
    Token             Logit
    ","               0.83
    "<eos>"           0.75
    " the"            0.74
                      0.70
    "1"               0.67
    "/"               0.66
    ":"               0.65
    "es"              0.65
    " in"             0.63
    "in"              0.62
    Activation Density: 1.619%

    No Known Activations