    NEGATIVE LOGITS
    Token         Logit
    (missing)     -0.74
    (missing)     -0.72
    " or"         -0.72
    "."           -0.70
    " «"          -0.65
    " in"         -0.64
    " ("          -0.64
    " In"         -0.63
    " To"         -0.60
    " Now"        -0.59
    POSITIVE LOGITS
    Token         Logit
    " Monfieur"   1.41
    " purpoſe"    1.40
    " myſelf"     1.39
    " pleaſure"   1.36
    " ſtate"      1.33
    " ſeveral"    1.29
    " faſt"       1.29
    " raiſ"       1.27
    " ſche"       1.27
    " itſelf"     1.26
    Activation Density: 0.063%

    No Known Activations