INDEX
    Explanations

    specific mentions of the word "contents" within a text

    repeated mentions of "contents."

    Negative Logits
    "roads"         -0.73
    "ansson"        -0.71
    "Äĩ"            -0.70
    "ARP"           -0.64
    " Rapp"         -0.62
    "ño"            -0.61
    "verbs"         -0.61
    "arms"          -0.61
    " Architects"   -0.60
    "Vill"          -0.60
    Positive Logits
    " contents"     1.02
    "ĸļ"            0.88
    "afety"         0.86
    "ascript"       0.86
    "matter"        0.85
    "ecause"        0.84
    " Contents"     0.84
    "ious"          0.83
    "uggest"        0.77
    "ions"          0.76
    Act Density 0.013%
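
A minimal sketch of how an activation density figure like the one above could be computed, assuming it means the percentage of token positions on which the feature's activation exceeds zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions where the feature is active.

    Assumption: "active" means the activation is strictly above `threshold`.
    """
    acts = list(activations)
    if not acts:
        return 0.0
    active = sum(1 for a in acts if a > threshold)
    return 100.0 * active / len(acts)


# Hypothetical data: 100,000 token positions, 13 of which are active.
sample = [1.5] * 13 + [0.0] * 99_987
density = activation_density(sample)
print(f"Act Density {density:.3f}%")
```

With this made-up sample, 13 active positions out of 100,000 gives a density of 0.013%, so a value that low indicates a feature that fires on only a tiny fraction of tokens.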

    No Known Activations