INDEX
    Explanations

    indications of challenges or difficulties in various contexts

    Head Attr Weights
    0:0.04
    1:0.04
    2:0.07
    3:0.16
    4:0.06
    5:0.07
    6:0.05
    7:0.09
    8:0.03
    9:0.04
    10:0.14
    11:0.16
    Negative Logits
    "},{"    -3.02
    "></     -2.96
     </      -2.81
             -2.71
    )</      -2.66
    \">      -2.53
             -2.44
    },{"     -2.37
    ">       -2.35
             -2.32
    Positive Logits
    bably       2.40
     whiff      2.21
     kidding    2.21
     quirks     1.94
    oops        1.93
     apiece     1.90
     Canaver    1.86
    metic       1.86
     yawn       1.84
    yip         1.83
    Act Density 0.001%

    No Known Activations