INDEX
    Explanations

    offering to do something

    Negative Logits
    Token               Logit
    " doubt"            0.74
    " reversals"        0.71
    " wider"            0.71
    " critiques"        0.69
    " confirmation"     0.69
    " adaptations"      0.69
    " see"              0.68
    " portability"      0.67
    " confirmations"    0.66
    " broader"          0.66

    Positive Logits
    Token               Logit
    " Taking"           0.70
    "Trying"            0.68
    "试图"              0.67
    " tratando"         0.66
    " Trying"           0.61
    " Treating"         0.60
    "Solving"           0.60
    "Taking"            0.60
    "स्तू"               0.60
    " Simply"           0.60

    Act Density 0.076%

    No Known Activations