INDEX
Explanations
phrases indicating desires or intentions
Negative Logits
Token            Logit
themselves       -0.92
="@+             -0.71
กัน              -0.68
CloseOperation   -0.65
utafitiHapana    -0.65
themselves       -0.65
their            -0.62
Lordships        -0.60
själva           -0.60
bedoeld          -0.60
Positive Logits
Token            Logit
myself           1.85
myself           1.51
Myself           1.37
my               1.15
myſelf           1.00
我自己           0.97
خودم             0.96
personally       0.95
Myself           0.94
my               0.90
Activations Density 1.195%
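
A minimal sketch of what a density figure like this may measure, assuming that "density" means the fraction of sampled token positions on which the feature has a nonzero activation. That definition is an assumption and is not stated on this page. The function name `activation_density` and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a position counts as "active" when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for 8 token positions (made-up numbers).
sample = [0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0, 0.0]

# Format the fraction as a percentage, in the same style as the dashboard.
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 1.195% would mean the feature is active on roughly 1 in 84 sampled tokens.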