INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.07
    1:0.09
    2:0.08
    3:0.07
    4:0.08
    5:0.09
    6:0.07
    7:0.07
    8:0.08
    9:0.07
    10:0.08
    11:0.08
    Negative Logits
    —"              -1.91
     whom           -1.79
     schizophren    -1.78
     Sands          -1.76
    ,—              -1.75
    ?",             -1.75
     Psychiatry     -1.72
                    -1.70
    HS              -1.69
     Khalid         -1.67
    Positive Logits
    retty           2.13
    aceous          2.06
    ohyd            2.05
    atform          1.93
    ograp           1.89
    ategory         1.89
    urtle           1.88
    orative         1.86
    jug             1.84
    ivated          1.81
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.