INDEX
    Explanations

    fundamental

    Negative Logits
    lude                -0.08
    flächen             -0.08
    benzisa             -0.08
    (no token shown)    -0.08
     Furthermore        -0.07
    (no token shown)    -0.07
     чер                -0.07
     coords             -0.07
    nosti               -0.07
     prospect           -0.07
    Positive Logits
     pillar             0.07
    (no token shown)    0.07
     constant           0.07
     básicos            0.07
     compet             0.07
     foundational       0.07
    (no token shown)    0.07
     competitions       0.07
     childhood          0.07
     gir                0.06
    Activation Density: 0.013%

    No Known Activations