INDEX
    Explanations
    NEGATIVE LOGITS
    Token             Logit
    " tra"            -0.57
    " rad"            -0.57
    " Map"            -0.56
    " hip"            -0.55
    "emin"            -0.54
    " cer"            -0.53
    "bootstrapcdn"    -0.53
    " bu"             -0.52
    "DoubleQuotes"    -0.51
    " Pe"             -0.50
    POSITIVE LOGITS
    Token             Logit
    " Jefus"          1.35
    " itſelf"         1.35
    "ſelf"            1.33
    " myſelf"         1.30
    " Reſ"            1.30
    " Anſ"            1.28
    " ſta"            1.23
    " Monfieur"       1.23
    " houſe"          1.22
    " ſche"           1.21
    Activation Density: 1.118%

    No Known Activations