INDEX
    Explanations

    punctuation marks, particularly commas and quotation marks

    Negative Logits
    ’e
    -0.17
     “[
    -0.17
    ’ї
    -0.15
    -0.15
    -0.15
    “Oh
    -0.15
     (“
    -0.15
    „M
    -0.14
    „V
    -0.14
    „N
    -0.14
    Positive Logits
     says
    0.29
     said
    0.27
    ÂĿ
    0.25
     reads
    0.24
     read
    0.22
     say
    0.21
     he
    0.20
     according
    0.19
    says
    0.19
     wrote
    0.19
    Act Density 0.108%

    No Known Activations