INDEX
    Explanations
    No Explanations Found
    Head Attr Weights
    0:0.07
    1:0.07
    2:0.08
    3:0.07
    4:0.09
    5:0.09
    6:0.08
    7:0.07
    8:0.08
    9:0.06
    10:0.09
    11:0.08
    Negative Logits
    " numbering"     -1.84
    "TextColor"      -1.66
    " numbered"      -1.55
    "iform"          -1.54
    " numer"         -1.51
    " strikingly"    -1.49
    " inhib"         -1.48
    " brightest"     -1.47
    " differed"      -1.46
    " noteworthy"    -1.45
    Positive Logits
    "milo"           1.79
    " yourself"      1.65
    " Yourself"      1.61
    "package"        1.59
    " podcast"       1.58
    " Goes"          1.57
    "morrow"         1.56
    " answ"          1.52
    " Explain"       1.51
    "voice"          1.51
    Act Density 0.000%

    This feature has no known activations.