    Explanations

    the word "nights" with strong activations

    occurrences of the word "nights."

    Negative Logits
    " bloc"        -0.72
    " Swiss"       -0.66
    "lda"          -0.66
    "ression"      -0.65
    "ific"         -0.65
    "offic"        -0.63
    " Canary"      -0.63
    "BN"           -0.61
    "eering"       -0.61
    " Democrat"    -0.61
    Positive Logits
    "creen"         1.35
    "mith"          1.25
    "pring"         1.12
    "hift"          1.08
    "uits"          1.06
    "cape"          1.04
    "poons"         1.03
    "hips"          1.01
    "hops"          1.01
    "cale"          1.01
    Activation Density: 0.031%

    No Known Activations