INDEX
    Explanations
    Head Attr Weights
    0:0.01
    1:0.04
    2:0.10
    3:0.25
    4:0.02
    5:0.02
    6:0.10
    7:0.13
    8:0.03
    9:0.07
    10:0.07
    11:0.11
    Negative Logits
    " hydra": -1.10
    " sweat": -0.97
    "tackle": -0.92
    "PLA": -0.92
    " symb": -0.91
    " ankles": -0.91
    " podium": -0.90
    " footprints": -0.90
    " proxies": -0.90
    " poles": -0.88
    Positive Logits
    "oute": 1.27
    "atta": 1.21
    "ull": 1.19
    "olic": 1.18
    "pei": 1.17
    "oulos": 1.16
    "reck": 1.15
    "ourning": 1.13
    "opol": 1.13
    "bilt": 1.13
    Act Density 0.008%

    No Known Activations