INDEX
    Explanations

    punctuation marks and colons that introduce lists or statements

    Negative Logits
     TNT        -0.61
    Reloaded    -0.59
     bour       -0.55
     abyss      -0.54
    IZ          -0.54
    ELD         -0.53
     Zombies    -0.53
    maid        -0.52
    erville     -0.51
     ages       -0.51
    Positive Logits
    ividual     1.00
    vote        0.73
    cknow       0.73
    keep        0.71
    imize       0.70
     listen     0.67
     interfere  0.66
    gradation   0.66
     try        0.66
    peat        0.65
    Act Density 0.424%

    No Known Activations