INDEX
    Explanations

    references to web domains and online resources

    Negative Logits
    Token            Logit
    " faſt"          -0.91
    " Efq"           -0.90
    " leſs"          -0.85
    " pleaſure"      -0.84
    " Jefus"         -0.84
    " itſelf"        -0.81
    " Diſ"           -0.80
    " reaſon"        -0.80
    " Houſe"         -0.80
    " Shakspeare"    -0.80
    Positive Logits
    Token            Logit
    "LogFactory"     0.60
    "ader"           0.53
    "первых"         0.52
    " H"             0.51
    " der"           0.50
    " F"             0.49
    " r"             0.49
    " I"             0.47
    " As"            0.47
    " Sam"           0.47
    Activation Density: 0.717%

    No Known Activations