INDEX
    Explanations

    proper nouns or names, specifically those that start with a capital letter

    empty passages or sections within the text

    Negative Logits
     theirs
    -0.67
     beforehand
    -0.65
    .</
    -0.65
     themselves
    -0.61
    qi
    -0.59
    â̦â̦
    -0.59
    âĢij
    -0.58
     with
    -0.58
    Ãĥ
    -0.58
    ����
    -0.58
    Positive Logits
    resa
    1.45
    odore
    1.44
    oret
    1.36
    ories
    1.18
     simplest
    1.03
    nce
    1.00
    atre
    0.99
     downside
    0.97
     latest
    0.94
     easiest
    0.92
    Act Density 0.498%

    No Known Activations