    NEGATIVE LOGITS
    Token      Logit
    be         -0.69
    kill       -0.68
    feed       -0.67
    assist     -0.66
    deal       -0.66
    starve     -0.66
    draw       -0.65
    exploit    -0.65
    change     -0.64
    get        -0.63
    POSITIVE LOGITS
    Token      Logit
    avoient     0.76
    étoit       0.68
    réguli      0.68
    marins      0.65
    confronti   0.63
    feroit      0.62
    igång       0.62
    complètes   0.62
    ouvertes    0.61
    démocr      0.60
    Activation Density: 0.002%

    No Known Activations