    Explanations

    multiple/several

    Negative Logits
    ".lock"            -0.10
    "raph"             -0.08
    "modules"          -0.08
    " race"            -0.08
    " освоб"           -0.08
    " Jugendlichen"    -0.07
    " Jr"              -0.07
    ".shift"           -0.07
    "('../"            -0.07
    "race"             -0.07

    Positive Logits
    " dek"             0.08
    "ほど"             0.08
    " genug"           0.08
    " genügend"        0.08
    " vocabulary"      0.08
    "$string"          0.07
    " telephone"       0.07
    " רח"              0.07
                       0.07
    " תנ"              0.07
    Act Density 0.019%

    No Known Activations