INDEX
    Explanations
    Negative Logits
    "_bal"         -0.07
    " INCLUDE"     -0.06
    " beasts"      -0.06
    " quizzes"     -0.06
    " Gunn"        -0.06
    ".members"     -0.06
    "Hel"          -0.06
    " boss"        -0.06
    "contain"      -0.06
    ".od"          -0.06
    Positive Logits
    " //----------------"    0.07
    "PMC"                    0.06
    "我们"                   0.06
    " اختل"                  0.06
    "agree"                  0.06
    "positive"               0.06
    " Shake"                 0.06
    "tweet"                  0.06
    "GHz"                    0.06
    " inconsistency"         0.06
    Activation Density: 0.005%

    No Known Activations