INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:  0.08
    1:  0.09
    2:  0.08
    3:  0.08
    4:  0.08
    5:  0.07
    6:  0.08
    7:  0.09
    8:  0.07
    9:  0.08
    10: 0.07
    11: 0.07
    Negative Logits
    " EEG"       -3.04
    "Putin"      -2.85
    (blank)      -2.83
    "chuk"       -2.73
    "araoh"      -2.71
    "iak"        -2.70
    " Putin"     -2.67
    "byter"      -2.66
    "omsky"      -2.65
    "ichick"     -2.63
    Positive Logits
    " Dublin"        2.71
    " Maiden"        2.61
    " dear"          2.60
    " Lord"          2.56
    " Flavoring"     2.55
    "wen"            2.55
    "letter"         2.52
    " Winchester"    2.49
    " Lords"         2.43
    " Wick"          2.42
    Act Density: 0.000%

    No Known Activations

    This feature has no known activations.