INDEX
    Explanations
    Head Attr Weights
    0:0.03
    1:0.02
    2:0.19
    3:0.05
    4:0.06
    5:0.03
    6:0.04
    7:0.16
    8:0.04
    9:0.04
    10:0.11
    11:0.18
    Negative Logits
    astered:-1.92
    alon:-1.73
    hyde:-1.63
    ogun:-1.60
    olics:-1.58
    joy:-1.57
    itia:-1.56
    hner:-1.53
    livion:-1.52
    bern:-1.50
    Positive Logits
    ACTION:1.72
    rab:1.71
    Fighting:1.65
    Range:1.59
    urnal:1.59
    Poké:1.56
    href:1.54
     ambush:1.53
    Brave:1.52
    Hidden:1.50
    Act Density 0.001%

    No Known Activations