Explanations
phrases that indicate reflection on personal experiences
Negative Logits
undi             -0.18
زد               -0.14
æĦ               -0.14
_SUPPORTED       -0.14
ania             -0.14
ëįĶëĭĪ           -0.14
æĮģ              -0.14
dil              -0.14
.HtmlControls    -0.14
ignon            -0.13
Positive Logits
looking          0.73
look             0.70
looking          0.66
Looking          0.65
Look             0.63
looked           0.61
looks            0.61
Looking          0.61
look             0.59
-looking         0.59
Activations Density 0.124%