INDEX
    Explanations
    Negative Logits
    Token          Logit
    "7"            -0.09
    " once"        -0.08
    " twelve"      -0.08
    " thirteen"    -0.07
    " period"      -0.07
    "609"          -0.07
    " periods"     -0.07
    " ice"         -0.07
    " last"        -0.07
    " truth"       -0.07
    Positive Logits
    Token          Logit
    "able"         0.16
    "ABLE"         0.11
    "ible"         0.11
    "ability"      0.10
    "ables"        0.09
    "ableObject"   0.08
    " Noble"       0.08
    "ubl"          0.08
    "orable"       0.08
    "oble"         0.08
    Activation Density: 0.106%

    No Known Activations