INDEX
    Explanations
    Head Attr Weights
    0:0.03
    1:0.02
    2:0.06
    3:0.04
    4:0.04
    5:0.03
    6:0.50
    7:0.04
    8:0.04
    9:0.06
    10:0.06
    11:0.04
    Negative Logits
    " Cheong"       -1.20
    " Gry"          -1.16
    " hesitation"   -1.14
    " Ary"          -1.10
    " itch"         -1.09
    " Yanuk"        -1.08
    " Vox"          -1.08
    " Whites"       -1.08
    "ndra"          -1.08
    " unused"       -1.08
    Positive Logits
    "illi"          1.65
    "arios"         1.43
    "sie"           1.39
    "udo"           1.34
    "addr"          1.29
    "ciation"       1.26
    "Cong"          1.26
    " Powered"      1.25
    (blank)         1.23
    "etary"         1.22
    Act Density 0.005%

    No Known Activations