INDEX
    Explanations
    Negative Logits
     Efq            -0.93
     ſeveral        -0.93
    ^(@)            -0.91
     Jefus          -0.90
     perfons        -0.87
     Paglinawan     -0.86
     $_"            -0.83
     mourut         -0.82
     thoſe          -0.81
     Theſe          -0.80
    Positive Logits
    .               0.50
     redor          0.46
     OTHERWISE      0.43
     /              0.43
     care           0.41
    /               0.41
     track          0.40
     read           0.40
     later          0.39
     otherwise      0.39
    Activation Density: 0.045%

    No Known Activations