INDEX
    Explanations

    phrases indicating requirement or necessity

    Negative Logits
     americ
    -0.74
     diction
    -0.68
    cart
    -0.67
    oci
    -0.65
    Liter
    -0.65
    cript
    -0.65
     concess
    -0.64
    laughter
    -0.62
    """
    -0.61
    uras
    -0.61
    Positive Logits
     prove
    1.00
     regain
    0.94
     overcome
    0.93
     patience
    0.91
     convince
    0.90
     hurry
    0.86
     rematch
    0.86
     retake
    0.84
     succeed
    0.80
     earn
    0.80
    Activation Density 0.111%

    No Known Activations