INDEX
    Explanations

    references to mathematical notation or functions

    Negative Logits
    Token     Logit
    " De"     -0.71
    " D"      -0.69
    " N"      -0.67
              -0.65
    " G"      -0.62
    "De"      -0.62
    " the"    -0.62
    "de"      -0.61
    " de"     -0.61
    "D"       -0.61
    Positive Logits
    Token            Logit
    " itſelf"        1.42
    " themſelves"    1.25
    " himſelf"       1.24
    " pleaſure"      1.23
    " Anſ"           1.21
    " myſelf"        1.19
    " Efq"           1.16
    " raiſ"          1.16
    " Majefty"       1.15
    " neceff"        1.14
    Activation Density: 0.621%

    No Known Activations