    Explanations
    No Explanations Found
    Negative Logits
    " pregn"          -0.81
    "gdala"           -0.79
    " Rebell"         -0.75
    " princ"          -0.74
    " answ"           -0.73
    "bub"             -0.72
    " Pound"          -0.71
    " compan"         -0.69
    " taxp"           -0.69
    "shit"            -0.68
    Positive Logits
    "broad"            0.66
    " unsupported"     0.65
    " integer"         0.64
    "ague"             0.62
    " appreci"         0.62
    " unusually"       0.62
    " outsiders"       0.61
    " organized"       0.61
    " adjusted"        0.60
    " trolls"          0.59
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.