INDEX
    Explanations

    instances of the word "On" used to indicate transitions or themes in the text

    Negative Logits
    /by
    -0.19
    /from
    -0.18
    imli
    -0.17
    ょ
    -0.15
    duct
    -0.15
    ольно
    -0.15
    clusions
    -0.14
    memberof
    -0.14
    alties
    -0.14
    cul
    -0.14
    Positive Logits
    ward
    0.30
     balance
    0.28
     closer
    0.26
     average
    0.25
     paper
    0.24
    es
    0.24
     reflection
    0.24
     top
    0.22
     thing
    0.21
     rare
    0.21
    Act Density 0.072%

    No Known Activations