INDEX
    Negative Logits
     ",",           -0.07
     halten         -0.07
     Kazakhstan     -0.07
    /sm             -0.06
     kork           -0.06
     Charm          -0.06
     swamp          -0.06
     hern           -0.06
     hj             -0.06
    ンク            -0.06
    Positive Logits
                    0.07
    (dictionary     0.07
    riel            0.07
     infections     0.06
     reproductive   0.06
                    0.06
                    0.06
    Direct          0.06
     Former         0.06
                    0.06
    Activation Density: 0.004%

    No Known Activations