INDEX
    Negative Logits
     itſelf        -1.13
     myſelf        -1.11
     Monfieur      -0.99
     please        -0.97
     Majefty       -0.94
     raiſ          -0.93
     Jefus         -0.91
     whofe         -0.88
     pleaſure      -0.88
     whoſe         -0.87

    Positive Logits
     Cheung        0.55
    paramref       0.51
    erape          0.51
     сы            0.51
    ggars          0.50
     ImGui         0.49
    antMatchers    0.49
    !              0.48
    Dieter         0.47
    Sprintf        0.47

    Activation Density: 0.257%

    No Known Activations