INDEX
Explanations
sentences that express existential questions and reflections on purpose
Negative Logits
Token             Logit
<bos>             -0.94
Baillargeon       -0.74
postIndex         -0.69
ⓧ                 -0.67
unsuccessfully    -0.64
?】                -0.62
principalColumn   -0.61
specified         -0.61
]--;              -0.59
Sinon             -0.59
Positive Logits
Token             Logit
humans            0.62
humanity          0.59
mennesker         0.59
hidupan           0.58
essentielles      0.54
ideas             0.53
mankind           0.53
beings            0.53
Kjelder           0.52
sensibilité       0.52
Activations Density: 0.245%