INDEX
    Explanations

    instances of the word "have."

    Negative Logits
    "ilation"        -0.71
    "stre"           -0.70
    "FIG"            -0.70
    "soon"           -0.65
    " Slowly"        -0.64
    "attery"         -0.63
    "Introdu"        -0.61
    "Soc"            -0.61
    "odge"           -0.61
    " Initially"     -0.60
    Positive Logits
    " recourse"      0.96
    " enough"        0.93
    " permission"    0.93
    " anymore"       0.91
    " backups"       0.88
    " sufficient"    0.88
    " adequate"      0.85
    " access"        0.85
    "stood"          0.85
    " any"           0.84
    Activation Density: 0.044%

    No Known Activations