INDEX
    Explanations

    mentions of specific episodes in a TV series

    references to specific episodes in a series

    Negative Logits
    Token       Logit
    "forts"     -0.78
    "lying"     -0.70
    "paces"     -0.64
    " CONS"     -0.63
    "minist"    -0.62
    "isse"      -0.62
    "helm"      -0.62
    "ror"       -0.61
    "bos"       -0.61
    "RO"        -0.60
    Positive Logits
    Token             Logit
    " episode"        3.63
    "episode"         2.61
    " episodes"       2.58
    " Episode"        2.35
    "Episode"         2.15
    "isode"           1.66
    "isodes"          1.62
    " chapter"        1.45
    " installment"    1.33
    " podcast"        1.33
    Activation Density: 0.014%

    No Known Activations