INDEX
    Explanations

    occurrences of the word "have" and its variations, indicating discussions around possession or necessity

    Negative Logits
    " itself"        -0.25
    " Its"           -0.18
    " themselves"    -0.18
    " its"           -0.17
    " himself"       -0.17
    "Its"            -0.16
    "ties"           -0.15
    "ince"           -0.15
    " bana"          -0.15
    "irse"           -0.15
    Positive Logits
    " ourselves"      0.22
    " seen"           0.21
    " heard"          0.20
    " known"          0.20
    "Seen"            0.19
    " Seen"           0.18
    "heard"           0.17
    " talked"         0.17
    " learned"        0.17
    " discussed"      0.16
    Activation Density: 0.144%

    No Known Activations