    Explanations
    Negative Logits
     houſe      -1.19
     pleaſure   -1.19
     blast      -1.17
     purpoſe    -1.16
     itſelf     -1.16
     reaſon     -1.13
     blasts     -1.13
     ſche       -1.10
     ſtate      -1.09
     myſelf     -1.09
    Positive Logits
     of         0.77
    ,           0.71
     and        0.65
    .           0.63
     at         0.62
     in         0.61
     on         0.60
     like       0.59
     away       0.59
    ;           0.59
    Activation Density: 1.111%

    No Known Activations