INDEX
    Explanations

    phrases or words indicating the "best" or "top" choice or option

    phrases that indicate rankings or comparisons

    Negative Logits
    Token            Logit
    "factor"         -0.68
    "krit"           -0.64
    "shell"          -0.64
    "hyde"           -0.62
    "ADD"            -0.61
    "ãĥ£"            -0.59
    "taking"         -0.59
    "kamp"           -0.59
    "nsic"           -0.59
    "¿"              -0.59

    Positive Logits
    Token            Logit
    " luck"          1.05
    " Worst"         0.89
    "nesota"         0.78
    "owitz"          0.76
    " breed"         0.74
    " intentions"    0.73
    " Nanto"         0.70
    " Luck"          0.69
    "luck"           0.69
    " Practices"     0.68
    Activation Density: 0.074%

    No Known Activations