    Negative Logits
    Token               Logit
    " myſelf"           -1.12
    " Efq"              -1.04
    " Jefus"            -0.99
    " Theſe"            -0.98
    " pleaſure"         -0.98
    " themſelves"       -0.88
    " himſelf"          -0.86
    "ſelf"              -0.85
    " Majefty"          -0.85
    " whoſe"            -0.85
    Positive Logits
    Token               Logit
    " a"                0.58
    " one"              0.56
    "SequentialGroup"   0.56
    "rawDesc"           0.55
    "SourceChecksum"    0.55
    " that"             0.52
    " an"               0.51
    " like"             0.51
    " utafitiHapana"    0.50
    " el"               0.49
    Activation Density: 0.042%

    No Known Activations