    Negative Logits
    Token               Logit
    " itſelf"           -1.45
    " myſelf"           -1.41
    " Houſe"            -1.32
    " auffi"            -1.26
    "ſelves"            -1.23
    " themſelves"       -1.22
    " pleaſure"         -1.20
    " doubtnut"         -1.17
    " purpoſe"          -1.16
    " photolibrary"     -1.15
    Positive Logits
    Token               Logit
    "'"                 0.82
    "ed"                0.78
    ","                 0.77
    " in"               0.76
    " or"               0.74
    " ("                0.74
    [token not shown]   0.71
    " the"              0.70
    " al"               0.68
    " '"                0.68
    Activation Density: 0.776%

    No Known Activations