INDEX
    Explanations
    Head Attr Weights
    0:0.03
    1:0.01
    2:0.10
    3:0.10
    4:0.06
    5:0.03
    6:0.15
    7:0.17
    8:0.08
    9:0.06
    10:0.10
    11:0.07
    Negative Logits
     Provided    -1.70
     Provides    -1.61
    antine       -1.58
     Contains    -1.54
                 -1.53
     ensures     -1.45
    Alpha        -1.45
     proves      -1.41
     PROV        -1.40
     consists    -1.39
    Positive Logits
    thood        1.71
    die          1.63
    athing       1.59
     itch        1.55
    mire         1.54
    joice        1.53
     weep        1.50
    react        1.49
     splash      1.43
     admire      1.42
    Act Density 0.001%

    No Known Activations