    Explanations

    prepositions and expressions of direction or movement

    Negative Logits
     somewhere      -0.16
     none           -0.15
     humans         -0.15
    none            -0.15
     None           -0.15
    sez             -0.14
    oir             -0.14
    .none           -0.14
    ά               -0.14
    quet            -0.13
    Positive Logits
     everything     0.71
    everything      0.60
     Everything     0.56
    Everything      0.54
    一切              0.46
     every          0.45
     tudo           0.45
     EVERY          0.40
     everyone       0.40
     alles          0.39
    Act Density 0.042%

    No Known Activations