INDEX
    Explanations
    No Explanations Found
    Negative Logits
    "Joined"      -0.68
    " Koen"       -0.67
    "ŃĶ"          -0.64
    "iday"        -0.63
    "ILLE"        -0.63
    " discrep"    -0.63
    " pigeon"     -0.63
    "boxes"       -0.63
    "ozyg"        -0.61
    " jumper"     -0.61
    Positive Logits
    "arta"            0.94
    "DonaldTrump"     0.68
    "sand"            0.67
    "abis"            0.62
    "Reason"          0.61
    "Merit"           0.61
    " Ingredients"    0.60
    "uana"            0.60
    "nesty"           0.59
    "Therefore"       0.58
    Activation Density: 0.000%

    No Known Activations

    This feature has no known activations.