INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.14
    1:0.02
    2:0.16
    3:0.09
    4:0.06
    5:0.04
    6:0.14
    7:0.03
    8:0.10
    9:0.04
    10:0.06
    11:0.07
    Negative Logits
    onom         -1.51
    emouth       -1.46
    afety        -1.44
    apper        -1.41
    lake         -1.40
    angelo       -1.40
    ktop         -1.38
    gow          -1.35
    Executive    -1.35
     curator     -1.34
    Positive Logits
    ……             1.69
                   1.59
    én             1.58
    ||             1.57
    »              1.52
    -----------    1.45
    |              1.43
    }"             1.43
    ——             1.43
    -|             1.40
    Act Density 0.000%

    No Known Activations

    This feature has no known activations.