    Negative Logits
    prus       -0.73
    etsk       -0.71
    kered      -0.70
    spect      -0.69
     midterm   -0.66
    zza        -0.65
    ======     -0.64
    cling      -0.64
    )=(        -0.64
    kins       -0.63
    Positive Logits
    inates     1.16
    inator     1.11
    inated     1.10
    inately    1.01
    ination    1.00
    nance      0.99
    inating    0.96
    inators    0.95
    inarily    0.92
    ova        0.91
    Activation Density: 0.021%

    No Known Activations