    Explanations

    phrases indicating location and direction

    Negative Logits
    Token        Logit
    " Huyện"     -0.15
    " Zuk"       -0.14
    "grily"      -0.14
    "klad"       -0.14
    "fov"        -0.13
    "bler"       -0.13
    "čan"        -0.13
    ".Scheme"    -0.13
    "uw"         -0.13
    "ungan"      -0.13
    Positive Logits
    Token        Logit
    " left"      0.91
    " right"     0.90
    "left"       0.75
    " Left"      0.73
    " Right"     0.71
    "right"      0.71
    "左"         0.70
    " LEFT"      0.69
    " RIGHT"     0.69
    "Left"       0.68
    Activation Density: 0.201%

    No Known Activations