INDEX
    Explanations

    instances of the word "and" followed by other words, particularly when multiple occurrences are close together with high activation values

    connections and conjunctions in sentences

    Negative Logits
    Token             Logit
    "advertisement"   -0.74
    "arro"            -0.69
    "edia"            -0.67
    "tesy"            -0.67
    " prosecut"       -0.66
    "ONSORED"         -0.63
    "coat"            -0.63
    "hower"           -0.63
    "theless"         -0.62
    " bluff"          -0.62
    Positive Logits
    Token             Logit
    " valleys"        0.97
    "uries"           0.95
    " oranges"        0.93
    " Soviets"        0.80
    " necks"          0.77
    " territories"    0.75
    " Territories"    0.75
    " Earthqu"        0.73
    " Titans"         0.72
    "izons"           0.72
    Activation Density: 0.271%

    No Known Activations