    Negative Logits
     Lambert          -0.73
    atro              -0.65
     Brethren         -0.58
    Lambert           -0.47
     Señora           -0.45
     Girolamo         -0.43
    ')"               -0.43
     näher            -0.43
    ing               -0.42
    \"");             -0.41

    Positive Logits
    findpost          0.90
     noDo             0.69
    wpi               0.68
    AutoScaleMode     0.63
    AndEndTag         0.61
     שוליים           0.61
    PyExc             0.61
     canlynol         0.61
    Blk               0.60
    erol              0.60

    Activation Density: 1.625%

    No Known Activations