    Negative Logits
    " myſelf"      -1.24
    " itſelf"      -1.24
    "ſelf"         -1.23
    " Efq"         -1.19
    " pleaſure"    -1.15
    " Jefus"       -1.15
    "ſelves"       -1.15
    " Majefty"     -1.11
    " doubtnut"    -1.10
    " Monfieur"    -1.10
    Positive Logits
    " int"         0.75
    " A"           0.73
    " di"          0.72
    " x"           0.71
    " par"         0.69
    " m"           0.69
    " ha"          0.69
    " bar"         0.69
    " no"          0.68
    " qu"          0.68
    Activation Density: 3.895%

    No Known Activations