INDEX
    Explanations
    Negative Logits
    Logit   Token
    -0.08   "_tokens"
    -0.07   " admissions"
    -0.06   "prim"
    -0.06   "learner"
    -0.06   "thumb"
    -0.06   " comma"
    -0.06   " neuron"
    -0.06   " malaysia"
    -0.06   " bowel"
    -0.06   " Adult"
    Positive Logits
    Logit   Token
    0.06    "ーティ"
    0.06    " Shut"
    0.06    "нию"
    0.06    " Gandhi"
    0.06    " สพป"
    0.06    "ัม"
    0.06    (token not shown in source)
    0.06    "517"
    0.06    "AJ"
    0.06    "unction"
    Activation Density: 0.011%

    No Known Activations