INDEX
    Explanations

    instances of various prepositions and spatial phrases

    Negative Logits
    Token          Logit
    " ―――――"       -1.08
    " ་་"          -0.98
    " ――――――――"    -0.97
    " Anſ"         -0.95
    " iſt"         -0.92
    " pleaſure"    -0.89
    "ſelf"         -0.89
    " houſe"       -0.87
    " Zacks"       -0.84
    " ―――"         -0.84

    Positive Logits
    Token          Logit
    " at"           0.93
    " en"           0.92
    " na"           0.87
    " on"           0.83
    " на"           0.79
    " på"           0.77
    " AT"           0.76
    " σε"           0.75
    "Σε"            0.74
    " bei"          0.73

    Activation Density 0.022%

    No Known Activations