INDEX
    Explanations

    references to the concept of concentration in various contexts

    Negative Logits
    jour        -0.16
    idas        -0.15
    ãĤ¿ãĥ¼      -0.15
    мовÑĸÑĢ     -0.15
    772         -0.14
    pers        -0.14
    iz          -0.14
    olle        -0.14
    isto        -0.14
    igers       -0.14
    Positive Logits
    eza         0.17
    -sama       0.17
    amac        0.16
    ration      0.16
     gradient   0.15
    -gradient   0.15
    urat        0.15
    worthy      0.15
    arton       0.15
    ric         0.15
    Act Density 0.048%

    No Known Activations