INDEX
    Explanations

    references to specific items and projects such as games, movies, and tournaments

    Negative Logits
    "gerald"     -0.71
    "awaru"      -0.70
    "otle"       -0.68
    "ured"       -0.66
    "manship"    -0.64
    "iership"    -0.63
    " Ik"        -0.63
    "urated"     -0.63
    "rolet"      -0.62
    "etz"        -0.62
    Positive Logits
    "nd"                    2.14
    "ND"                    1.19
    " thirds"               1.06
    "133"                   0.99
    "147"                   0.98
    "160"                   0.95
    " externalToEVAOnly"    0.93
    "187"                   0.90
    " halves"               0.89
    "245"                   0.88
    Activation Density: 0.671%

    No Known Activations