    Explanations

    actions/processes

    Negative Logits
    Token       Logit
    Efq         -1.87
    Majefty     -1.78
    itſelf      -1.77
    Jefus       -1.76
    myſelf      -1.76
    Theſe       -1.75
    Monfieur    -1.75
    auroit      -1.73
    ſche        -1.73
    pleaſure    -1.73
    Positive Logits
    Token       Logit
    in          1.20
    and         1.13
    ,           1.02
    the         0.95
    to          0.94
                0.94
    (           0.94
    for         0.92
    are         0.92
    all         0.91
    Activation Density: 0.115%

    No Known Activations