INDEX
    Explanations

    references to geographical locations and directions

    Negative Logits
    Token          Logit
    " higher"      -0.18
    "higher"       -0.17
    " overhead"    -0.16
    "alo"          -0.16
    "hd"           -0.16
    " oben"        -0.15
    "ichert"       -0.15
    "頂"           -0.15
    "顶"           -0.15
    " висок"       -0.15
    Positive Logits
    Token          Logit
    " bottom"      0.35
    " below"       0.34
    "below"        0.32
    "bottom"       0.30
    "-bottom"      0.29
    "Bottom"       0.28
    " Bottom"      0.28
    " BOTTOM"      0.27
    "下"           0.27
    " Below"       0.27
    Act Density 0.134%

    No Known Activations