Explanations
questions related to experiences and situations
NEGATIVE LOGITS
your     -0.70
your     -0.66
YOUR     -0.59
你的     -0.57
Your     -0.56
您的     -0.54
ваш      -0.54
-your    -0.54
YOUR     -0.54
Your     -0.53
POSITIVE LOGITS
you      0.54
You      0.42
you      0.41
You      0.39
bạn      0.35
_you     0.32
você     0.30
आप       0.30
-you     0.29
你       0.28
Activations Density 0.369%