INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.07
    1:0.08
    2:0.08
    3:0.11
    4:0.08
    5:0.07
    6:0.08
    7:0.07
    8:0.07
    9:0.07
    10:0.09
    11:0.06
    Negative Logits
    " SX"               -2.56
    " follow"           -2.44
    " Virtue"           -2.39
    " Ou"               -2.35
    " Equity"           -2.32
    " Coy"              -2.31
    " outset"           -2.23
    " Manifest"         -2.21
    " unsurprisingly"   -2.21
    " Pers"             -2.11
    Positive Logits
    "missing"           3.39
    (blank in source)   2.89
    "ctrl"              2.77
    " rotting"          2.65
    "rimp"              2.60
    "Frag"              2.44
    "buf"               2.42
    "Flight"            2.41
    "finger"            2.39
    "osta"              2.38
    Act Density 0.000%

    This feature has no known activations.