INDEX
    Explanations

    phrases indicating people doing activities or spending time together

    instances of the word "together."

    Negative Logits
    "ysis"        -0.76
    "anty"        -0.66
    "ream"        -0.65
    "ble"         -0.62
    "cock"        -0.62
    "Fed"         -0.62
    "amac"        -0.61
    "geon"        -0.58
    "cre"         -0.56
    "articles"    -0.56
    Positive Logits
    "ieth"         0.70
    " impunity"    0.67
    "inian"        0.65
    "with"         0.65
    "ees"          0.64
    " wi"          0.63
    "ivity"        0.62
    " with"        0.62
    "heid"         0.62
    "ee"           0.62
    Activation Density: 0.028%

    No Known Activations