    Negative Logits
    Token          Logit
    "hof"          -0.79
    "hog"          -0.68
    "mars"         -0.62
    "tır"          -0.56
    "oriasis"      -0.54
    "ρους"         -0.53
    "thier"        -0.52
    "dahl"         -0.49
    "asche"        -0.48
    " códigos"     -0.47
    Positive Logits
    Token          Logit
    " withal"      0.72
    " himſelf"     0.69
    " reaſon"      0.68
    " myſelf"      0.68
    " Maciej"      0.66
    " Efq"         0.65
    " preſent"     0.63
    "findpost"     0.62
    " reafon"      0.62
    " Falkland"    0.61
    Activation Density: 0.142%

    No Known Activations