    Explanations

    abandoned and irrelevant

    Negative Logits
    " oike"       0.70
    "特别"        0.66
    "ave"         0.66
    "semi"        0.65
    " semi"       0.63
    "explicit"    0.62
    " chcete"     0.61
    " not"        0.61
    "ERK"         0.61
    " বিশেষ"      0.60
    Positive Logits
    " irrelevant"   1.11
    " anyway"       1.08
    " anyways"      1.02
    " insign"       0.99
    " shrugged"     0.97
    " unimportant"  0.92
    "nonsense"      0.91
    " Ignore"       0.90
    " shrug"        0.90
    " negligible"   0.89
    Activation Density 0.358%

    No Known Activations