    Negative Logits
     itſelf         -1.02
     myſelf         -0.92
     themſelves     -0.87
     Efq            -0.85
     ſhe            -0.85
     himſelf        -0.84
     ſeveral        -0.84
     ſmall          -0.83
     Majefty        -0.82
     greateſt       -0.81
    Positive Logits
     do             0.45
     end            0.42
     mix            0.41
     stay           0.41
     five           0.41
     start          0.40
     handle         0.40
    IConfiguration  0.40
    .               0.40
     try            0.39
    Activation Density: 0.070%

    No Known Activations