    Explanations

    references to written content or authorship

    instances of the word "writes."

    Negative Logits
    Token         Logit
    " RIS"        -0.64
    "gest"        -0.61
    "cept"        -0.61
    "rium"        -0.60
    "erest"       -0.60
    " ground"     -0.58
    " trailer"    -0.56
    " season"     -0.56
    " halftime"   -0.55
    "frac"        -0.55
    Positive Logits
    Token         Logit
    " writes"     3.58
    " wrote"      2.29
    " write"      1.91
    " reads"      1.82
    "writ"        1.75
    "Writ"        1.66
    "wrote"       1.65
    " publishes"  1.61
    " observes"   1.52
    "Writing"     1.49
    Act Density 0.012%

    No Known Activations