INDEX
    Explanations
    Head Attr Weights
    0:0.03
    1:0.11
    2:0.08
    3:0.05
    4:0.02
    5:0.06
    6:0.08
    7:0.09
    8:0.15
    9:0.08
    10:0.12
    11:0.07
    Negative Logits
     srf
    -1.09
    ÃÂ
    -0.95
     tradem
    -0.90
     colonies
    -0.86
     pione
    -0.85
     births
    -0.84
     majors
    -0.84
     theor
    -0.84
    uberty
    -0.83
    aeper
    -0.82
    Positive Logits
     "@
    1.20
    Offline
    1.11
     "#
    1.08
    ESA
    1.06
    ory
    1.01
    "))
    0.99
    orage
    0.99
    Nik
    0.97
    seless
    0.96
    Snow
    0.95
    Act Density 0.045%

    No Known Activations