    Explanations
    Negative Logits
    Token             Logit
    "CLASSIFIED"      -0.62
    "Magikarp"        -0.59
    " Seym"           -0.59
    "GoldMagikarp"    -0.57
    "Balt"            -0.55
    "Ire"             -0.54
    "Interstitial"    -0.54
    " adolesc"        -0.53
    " behavi"         -0.52
    " newcom"         -0.51
    Positive Logits
    Token             Logit
    " is"              2.08
    " isn"             1.68
    " has"             1.58
    " represents"      1.51
    " appears"         1.50
    " belongs"         1.48
    " does"            1.47
    " exists"          1.47
    " occurs"          1.44
    " differs"         1.44
    Activation Density: 0.596%

    No Known Activations