INDEX
    Explanations

    phrases with the word "have" followed by some additional description or context

    the phrase "we have" in various contexts

    Negative Logits
    "cone"      -0.75
    "eem"       -0.74
    "oshi"      -0.70
    "drive"     -0.68
    "tip"       -0.68
    "conom"     -0.67
    "roll"      -0.65
    "alter"     -0.65
    "ensing"    -0.63
    "icking"    -0.62
    Positive Logits
    " seen"         1.20
    " ourselves"    1.14
    " witnessed"    1.07
    " heard"        1.05
    " been"         0.96
    " talked"       0.96
    " learned"      0.95
    " gotten"       0.95
    " reached"      0.94
    " learnt"       0.92
    Act Density 0.134%

    No Known Activations