INDEX
Explanations
phrases related to self-interest and exploitation in various contexts
Negative Logits
Token               Logit
zep                 -0.50
juridiques          -0.47
next                -0.45
UnifiedTopology     -0.44
rá                  -0.44
jScrollPane         -0.42
tillon              -0.42
SCI                 -0.41
nueces              -0.41
چ                   -0.41
Positive Logits
Token               Logit
متعلقه              0.82
PreferredItem       0.81
Portale             0.78
autorytatywna       0.76
motives             0.74
<=",                0.72
parsedMessage       0.72
RegressionTest      0.71
motive              0.70
للاسماء             0.69
Activations Density: 0.401%