INDEX
    Head Attr Weights
    0:0.07
    1:0.07
    2:0.13
    3:0.04
    4:0.03
    5:0.05
    6:0.10
    7:0.05
    8:0.03
    9:0.04
    10:0.29
    11:0.05
    Negative Logits
    Token            Logit
    " Stronghold"    -2.46
    " Wonders"       -2.43
    " Soy"           -2.39
    "Unt"            -2.39
    " Tant"          -2.39
    " Aster"         -2.36
    " Secrets"       -2.31
    " Misty"         -2.25
    " gent"          -2.23
    " Priv"          -2.23
    Positive Logits
    Token            Logit
    "HL"             4.08
    " HL"            2.92
    "hal"            2.86
    "igl"            2.64
    "emort"          2.60
    " NHL"           2.53
    "iverpool"       2.50
    "jri"            2.47
    "onga"           2.38
    "LET"            2.36
    Act Density 0.000%

    No Known Activations