    Explanations

    minimize/maximize

    Negative Logits
    " minimization"   -1.52
    " minimize"       -1.52
    " minimizing"     -1.51
    "Avoiding"        -1.51
    " minimise"       -1.50
    " minimizes"      -1.48
    " Minimize"       -1.46
    "avoid"           -1.39
    " Avoiding"       -1.38
    " avoids"         -1.35
    Positive Logits
    " the"      0.70
    "s"         0.65
    " its"      0.63
    " his"      0.57
    " death"    0.52
    " their"    0.52
    " all"      0.50
    "im"        0.48
    " any"      0.47
    "nt"        0.46
    Activation Density: 0.034%

    No Known Activations