INDEX
    Explanations

    expressions of personal reflection or admission

    Negative Logits
     transfer   -0.15
     schem      -0.14
     pie        -0.14
    Com         -0.14
    lex         -0.14
    vir         -0.14
     instead    -0.13
    mand        -0.13
    618         -0.13
     dar        -0.13
    Positive Logits
    ãĥ³ãĥķ      0.16
    bÃŃ         0.16
    artz        0.15
    rovers      0.15
    ÑĪиб        0.14
    rowse       0.14
    fcn         0.14
    .opengl     0.14
    arcy        0.14
    emade       0.14
    Activation Density: 0.086%

    No Known Activations