INDEX
Explanations
affirmative actions and expressions of preference or support
Negative Logits
InputBorder      -0.63
transQ           -0.61
themselves       -0.56
发表于           -0.56
their            -0.56
OGND             -0.56
utafitiHapana    -0.54
pcm              -0.52
JTable           -0.51
washingtonpost   -0.51
Positive Logits
myself           0.93
myſelf           0.87
myself           0.77
Myself           0.68
minhas           0.67
我自己           0.65
my               0.59
मैं               0.57
LookAnd          0.57
خودم             0.56
Activations Density: 0.906%