INDEX
    Explanations

    questions related to experiences and situations

    NEGATIVE LOGITS
     your
    -0.70
    your
    -0.66
     YOUR
    -0.59
    你的
    -0.57
    Your
    -0.56
    您的
    -0.54
     ваш
    -0.54
    -your
    -0.54
    YOUR
    -0.54
     Your
    -0.53
    POSITIVE LOGITS
     you
    0.54
     You
    0.42
    you
    0.41
    You
    0.39
     bạn
    0.35
    _you
    0.32
     você
    0.30
     आप
    0.30
    -you
    0.29
    你
    0.28
    Activation Density: 0.369%

    No Known Activations