    Negative Logits
    Token       Logit
    " chilly"   -0.07
    " anth"     -0.07
    "ouro"      -0.07
    " Orchard"  -0.06
    " around"   -0.06
    " Crunch"   -0.06
    "λου"       -0.06
    " Πο"       -0.06
    " вд"       -0.06
    " okres"    -0.06
    Positive Logits
    Token       Logit
    " Germany"  0.09
    " GE"       0.08
    " German"   0.08
    "German"    0.07
    "Germany"   0.07
    ".DE"       0.07
    "ermann"    0.07
    " Germans"  0.07
    "émon"      0.07
    "$_"        0.06
    Activation Density: 0.016%

    No Known Activations