INDEX
    Explanations

    variations of the pronoun "it."

    Negative Logits
    noticed     -0.66
    ichever     -0.65
    herent      -0.64
    mma         -0.61
     Friendly   -0.61
    dding       -0.60
    cknow       -0.59
    eligible    -0.58
    ighton      -0.55
    tted        -0.55

    Positive Logits
     entails      1.01
     boils        0.99
     hurts        0.96
    alian         0.92
     happens      0.91
     happened     0.90
     transpired   0.89
     feels        0.88
     takes        0.84
     mattered     0.83
    Activation Density 0.048%

    No Known Activations