    Negative Logits
    Token            Logit
    " myſelf"        -0.86
    " itſelf"        -0.83
    " ſeveral"       -0.77
    "ſelves"         -0.76
    " foon"          -0.73
    " himſelf"       -0.72
    " themſelves"    -0.72
    " poffible"      -0.71
    " pleaſure"      -0.71
    " ſhe"           -0.71
    Positive Logits
    Token            Logit
    " maxn"          0.47
    "bukkit"         0.45
    "twimg"          0.45
    " that"          0.45
    "ioutil"         0.45
    "apimachinery"   0.44
    "dataclass"      0.44
    " t"             0.43
    " by"            0.43
    "onAttach"       0.43
    Activation Density: 0.053%

    No Known Activations