INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.07
    1:0.06
    2:0.08
    3:0.09
    4:0.09
    5:0.09
    6:0.07
    7:0.08
    8:0.09
    9:0.07
    10:0.09
    11:0.07
    Negative Logits
     Kardash:-1.97
     Kardashian:-1.94
     tabl:-1.92
     cele:-1.91
     headlines:-1.89
     rumors:-1.88
     celebrities:-1.84
     Oprah:-1.82
     medications:-1.81
     slogans:-1.80
    Positive Logits
    annot:2.23
    adra:1.98
    REM:1.95
    inet:1.89
    ateur:1.87
    adapt:1.87
    vere:1.82
    yth:1.74
    very:1.73
    ogh:1.72
    Act Density 0.000%

    This feature has no known activations.