Explanations
expressions related to personal reflection and decision-making
Negative Logits
Token            Logit
themselves       -0.81
their            -0.80
(no token text)  -0.67
ponerse          -0.66
utafitiHapana    -0.64
Their            -0.63
OGND             -0.59
Their            -0.59
InputBorder      -0.58
pédie            -0.57
Positive Logits
Token            Logit
myself           1.46
myself           1.27
myſelf           1.17
Myself           1.04
my               0.91
minhas           0.82
我的             0.81
meinem           0.78
mojej            0.77
my               0.77
Activations Density: 2.051%