INDEX
    Explanations
    Head Attr Weights
    0:0.08
    1:0.07
    2:0.09
    3:0.08
    4:0.08
    5:0.09
    6:0.09
    7:0.08
    8:0.06
    9:0.07
    10:0.07
    11:0.08
    Negative Logits
    abases:-2.86
     Pell:-2.77
    berra:-2.70
     Publications:-2.66
    library:-2.60
     lig:-2.41
    pdf:-2.38
    link:-2.38
     Gibbs:-2.38
     ICC:-2.35
    Positive Logits
     veter:3.05
    fired:2.60
     ginger:2.54
     sickness:2.52
     sank:2.52
     Fired:2.50
     gren:2.46
     mug:2.45
     carts:2.43
     thieves:2.42
    Act Density 0.000%

    No Known Activations