    Negative Logits
     Theſe      -1.35
     Monfieur   -1.28
     Efq        -1.20
     pleaſure   -1.20
     myſelf     -1.19
     purpoſe    -1.18
     Houſe      -1.12
    ^(@)        -1.11
     itſelf     -1.11
     ſmall      -1.09
    Positive Logits
    ing         0.59
     plan       0.58
     ans        0.54
     “          0.53
    /******/    0.53
     lang       0.52
     sch        0.52
     du         0.51
     se         0.49
     apa        0.49
    Act Density 0.040%

    No Known Activations