INDEX
    Explanations
    Head Attr Weights
    0:0.02
    1:0.01
    2:0.04
    3:0.06
    4:0.04
    5:0.03
    6:0.47
    7:0.05
    8:0.05
    9:0.06
    10:0.06
    11:0.04
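
A minimal sketch for reading the listing above, assuming each `index:weight` pair is a per-head attribution weight for heads 0 through 11. The name `dominant_head` is hypothetical and only for illustration. It picks out the head with the largest weight and its share of the listed total.

```python
# Head attribution weights copied from the listing above (head index -> weight).
# Assumption: the keys are attention-head indices and the values are attribution weights.
head_weights = {
    0: 0.02, 1: 0.01, 2: 0.04, 3: 0.06, 4: 0.04, 5: 0.03,
    6: 0.47, 7: 0.05, 8: 0.05, 9: 0.06, 10: 0.06, 11: 0.04,
}


def dominant_head(weights):
    """Return (head, weight) for the entry with the largest weight."""
    head = max(weights, key=weights.get)
    return head, weights[head]


top_head, top_weight = dominant_head(head_weights)
total = sum(head_weights.values())   # listed values are rounded, so this need not be exactly 1
share = top_weight / total           # fraction of the listed total carried by the top head

print(f"head {top_head}: weight {top_weight:.2f}, share of listed total {share:.1%}")
```

On these values the largest weight belongs to head 6, which carries several times the weight of any other head.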
    Negative Logits
    ngth:-1.60
    BIL:-1.41
    selves:-1.32
     loads:-1.29
    irlf:-1.26
    drm:-1.26
    bread:-1.23
    letal:-1.22
    Union:-1.22
    loads:-1.20
    Positive Logits
     Stra:1.48
    bush:1.31
    aroo:1.31
    umbered:1.28
     Blanc:1.27
     hatch:1.23
     Bris:1.23
     Gau:1.22
    combe:1.22
    uzz:1.21
    Act Density 0.001%

    No Known Activations