INDEX
Explanations
references to giving, receiving, or being offered opportunities or choices
Negative Logits
tep      -0.16
bic      -0.15
ime      -0.15
endi     -0.14
勒       -0.14
leur     -0.14
deo      -0.14
essel    -0.14
antu     -0.13
sou      -0.13
Positive Logits
tasks          0.23
opportunity    0.20
task           0.20
choice         0.19
blank          0.19
instructions   0.18
tasks          0.17
任务           0.17
tasked         0.17
Tasks          0.17
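
Token strings in logit lists like these are often shown in a byte-level BPE alphabet, where each UTF-8 byte is drawn as one printable character. In that alphabet the Chinese word 任务 ("task") appears as `ä»»åĬ¡` and 勒 appears as `åĭĴ`. The sketch below assumes the model uses a GPT-2 style byte-to-character table, which this page does not confirm. The function names `byte_level_alphabet` and `decode_token` are hypothetical helpers, not part of any dashboard API.

```python
def byte_level_alphabet():
    """Build the byte -> printable-character table used by GPT-2 style BPE.

    Assumption: the tokenizer follows the GPT-2 convention. Printable bytes
    map to themselves; all other bytes map to code points from 256 upward.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_level_alphabet().items()}


def decode_token(displayed):
    """Turn a displayed token string back into readable text.

    Bytes that do not form valid UTF-8 (for example a token that holds only
    part of a multi-byte character) are replaced with U+FFFD.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in displayed)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for shown in ["ä»»åĬ¡", "åĭĴ", "Ġtasks"]:
        print(repr(shown), "->", repr(decode_token(shown)))
```

Plain ASCII tokens such as `tasks` pass through unchanged, and a leading `Ġ` decodes to a space, which is one common reason the same word can appear twice in a list with different scores.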
Activations Density 0.137%
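
As a rough illustration of what a density figure like this can mean, the sketch below assumes it is the fraction of sampled tokens on which the feature's activation exceeds zero. The page does not state the exact definition or the sample size, so both the `activation_density` helper and the 100,000-token sample are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation is strictly above `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature fires at all. The dashboard may use a different definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Hypothetical sample: the feature fires on 137 of 100,000 tokens.
    sample = [0.0] * 99_863 + [1.5] * 137
    print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.137% corresponds to the feature firing on roughly one token in every 730.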