INDEX
    Explanations

    instances of the word "care" in various contexts

    Negative Logits
    Token            Logit
    "oris"           -0.17
    "ille"           -0.14
    "á¹"             -0.14
    "ramework"       -0.14
    "ensing"         -0.14
    "à¸ļà¸ģ"         -0.14
    " excellent"     -0.14
    "ardo"           -0.14
    "ovie"           -0.14
    "eping"          -0.13
    Positive Logits
    Token            Logit
    "lessly"         0.28
    " whether"       0.23
    " deeply"        0.19
    " enough"        0.18
    " about"         0.18
    " less"          0.18
    "fully"          0.17
    " passionately"  0.17
    " squat"         0.17
    " how"           0.16
    Activation Density: 0.027%

    No Known Activations