INDEX
    Explanations

    hypotheses and assumptions

    Negative Logits
    Token             Logit
    "showAlert"       0.44
    " literalmente"   0.42
    " Reveals"        0.41
    "Obviously"       0.40
    " obtiene"        0.39
    "注意事項"        0.39
                      0.39
    " Preference"     0.38
    " showAlert"      0.38
    " Obviously"      0.37
    Positive Logits
    Token             Logit
    " assume"         1.14
    "assume"          0.91
    " presume"        0.85
    " Assume"         0.83
    " say"            0.80
    "Assume"          0.79
    " assumed"        0.77
    " assum"          0.76
    " assumes"        0.75
    " conclude"       0.65
    Act Density 0.052%

    No Known Activations