INDEX
    NEGATIVE LOGITS
    Token          Logit
    "AUT"          -0.77
    "FORMATION"    -0.70
    "OGR"          -0.67
    "ostics"       -0.65
    "Safety"       -0.64
    "Asset"        -0.64
    "UCT"          -0.63
    "YL"           -0.62
    "osis"         -0.61
    "SOURCE"       -0.61
    POSITIVE LOGITS
    Token          Logit
    "hips"         1.11
    " wars"        1.04
    "lords"        1.00
    "bucks"        0.96
    "hip"          0.94
    "poons"        0.91
    "riors"        0.89
    " waged"       0.82
    " raged"       0.79
    " battles"     0.79
    Activation Density: 0.010%

    No Known Activations