    Explanations

    references to personal identity and self-description

    Negative Logits
    "egin"        -0.18
    " adopt"      -0.16
    "ä¸ĢåĮº"      -0.15
    "ÑĢиг"        -0.15
    "ıt"          -0.15
    "rying"       -0.14
    " expand"     -0.14
    "lix"         -0.14
    "[email"      -0.14
    "starter"     -0.13
    Positive Logits
    " frequent"   0.20
    " frequ"      0.19
    " regularly"  0.19
    " moon"       0.19
    "collect"     0.18
    "moon"        0.18
    " Collect"    0.17
    "Collect"     0.17
    " frequently" 0.16
    " freq"       0.16
    Activation Density: 0.415%

    No Known Activations