INDEX
    Explanations

    phrases related to actions, particularly involving personal relationships and interactions

    Negative Logits
    "xis"           -0.18
    "immers"        -0.15
    "اÙģØª"         -0.14
    "imbus"         -0.14
    " Ấ"            -0.14
    "abox"          -0.13
    ".TestTools"    -0.13
    "ĭ"             -0.13
    "akis"          -0.13
    "asc"           -0.13
    Positive Logits
    " along"        1.28
    "along"         1.14
    " Along"        1.10
    "Along"         1.06
    "沿"            0.65
    " junto"        0.49
    " alongside"    0.41
    " langs"        0.32
    "-al"           0.31
    " cùng"         0.28
    Activation Density: 0.186%

    No Known Activations