INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.09
    1:0.06
    2:0.09
    3:0.09
    4:0.09
    5:0.07
    6:0.07
    7:0.07
    8:0.08
    9:0.09
    10:0.07
    11:0.08
    Negative Logits
    "chini"     -1.95
    "helle"     -1.94
    "lesi"      -1.86
    "hoff"      -1.77
    "efer"      -1.66
    "sed"       -1.65
    "sten"      -1.59
    "gz"        -1.56
    " Mek"      -1.55
    "omsky"     -1.52
    Positive Logits
    "��"             2.10
    " masc"          1.80
    " Borders"       1.77
    "ocom"           1.59
    " umb"           1.58
    "querque"        1.56
    " conventions"   1.55
    " Dogs"          1.53
    " Democr"        1.51
    " reperto"       1.51
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.