INDEX
    Explanations

    phrases related to achievements and records

    Negative Logits
    "isan"        -0.17
    "ppo"         -0.15
    "elts"        -0.15
    "ardin"       -0.15
    "stants"      -0.15
    "λÏī"         -0.15
    "pis"         -0.14
    "itorio"      -0.14
    "nement"      -0.14
    "elman"       -0.14
    Positive Logits
    "-breaking"   0.40
    "breaking"    0.38
    "-setting"    0.38
    " breaking"   0.33
    "setting"     0.31
    " setting"    0.29
    " Breaking"   0.29
    "-high"       0.27
    "Breaking"    0.27
    " Setting"    0.26
    Activation Density 0.024%

    No Known Activations