INDEX
    Explanations

    references to specific numerical codes and identifiers

    Negative Logits
    Token        Logit
    "lect"       -0.18
    "egot"       -0.17
    " sen"       -0.16
    " Peace"     -0.15
    "erus"       -0.15
    " Policy"    -0.15
    " senior"    -0.14
    "LARI"       -0.14
    "imbus"      -0.14
    "ertext"     -0.14
    Positive Logits
    Token        Logit
    "oog"        0.15
    "ikal"       0.15
    "/trunk"     0.15
    "OKIE"       0.15
    "adows"      0.15
    "around"     0.14
    "HEST"       0.14
    " 赤"        0.14
    "Owner"      0.13
    "ippers"     0.13
    Activation Density: 0.035%

    No Known Activations