    Negative Logits
    Token            Logit
    "liness"         -0.68
    "INESS"          -0.65
    "yship"          -0.64
    " ALR"           -0.64
    "RuleContext"    -0.63
    "izations"       -0.63
    "ties"           -0.63
    "izers"          -0.62
    "iness"          -0.61
    "ization"        -0.61
    Positive Logits
    Token            Logit
    "ant"            1.19
    "ANT"            0.99
    "antly"          0.98
    "anten"          0.80
    "ants"           0.79
    "antin"          0.72
    "antic"          0.70
    "ante"           0.70
    "antes"          0.70
    " ant"           0.69
    Activation Density: 0.017%

    No Known Activations