    Explanations

    directional and positional language

    Negative Logits
    pis               -0.17
    .writeln          -0.16
    utsch             -0.16
    ÙĪØµ              -0.15
    reh               -0.15
    dob               -0.15
    pha               -0.14
    _FF               -0.14
    ixo               -0.14
    go                -0.13
    Positive Logits
    ward              0.17
    chemas            0.16
    zung              0.16
    wards             0.15
    .scalablytyped    0.14
     Aph              0.14
    WARD              0.14
    hattan            0.14
     åħī              0.14
    415               0.14
    Activation Density 0.073%

    No Known Activations