    Explanations

    mathematical equations and references within the text

    Negative Logits
    Token           Logit
    " Teach"        -0.14
    "lak"           -0.14
    " Rowe"         -0.14
    "ôt"            -0.14
    "ÏĦή"           -0.13
    "FieldValue"    -0.13
    " Honest"       -0.13
    "å¹²"           -0.13
    "\$"            -0.13
    " Verse"        -0.13
    Positive Logits
    Token           Logit
    "ref"           0.31
    " ref"          0.20
    "-ref"          0.20
    " hyper"        0.18
    " æ¾"           0.17
    "	ref"          0.17
    "Ref"           0.17
    "_ref"          0.16
    ".ref"          0.16
    "pag"           0.16
    Activation Density: 0.030%

    No Known Activations