INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.08
    1:0.07
    2:0.08
    3:0.07
    4:0.07
    5:0.08
    6:0.07
    7:0.09
    8:0.09
    9:0.07
    10:0.08
    11:0.09
    Negative Logits
    "mill"           -1.99
    "die"            -1.71
    "rine"           -1.66
    "Gu"             -1.61
    "amphetamine"    -1.56
    "license"        -1.55
    "rison"          -1.55
    "gan"            -1.55
    "ratulations"    -1.54
    "stead"          -1.53
    Positive Logits
    "wcsstore"       1.90
    "atures"         1.85
    " Vanity"        1.82
    "ovember"        1.78
    " Courage"       1.75
    "BALL"           1.73
    " Hearts"        1.69
    " Zionism"       1.66
    " Naked"         1.64
    " curls"         1.56
    Act Density 0.000%

    This feature has no known activations.