INDEX
    Explanations
    Head Attr Weights
    0:0.08
    1:0.07
    2:0.09
    3:0.08
    4:0.08
    5:0.07
    6:0.07
    7:0.08
    8:0.08
    9:0.07
    10:0.09
    11:0.07
    Negative Logits
    "!/":-2.71
    " Raf":-2.65
    " Warner":-2.57
    " Brist":-2.41
    " Gareth":-2.40
    " Spani":-2.37
    " sums":-2.37
    " capt":-2.30
    " customer":-2.29
    " film":-2.28
    Positive Logits
    "CBC":2.84
    "Sax":2.83
    "NPR":2.78
    "JS":2.70
    " Init":2.68
    "Ruby":2.60
    " nep":2.60
    "NJ":2.60
    " Tradable":2.57
    "JB":2.57
    Act Density 0.000%

    No Known Activations