INDEX
Explanations
phrases that discuss the importance of decision-making and the factors that influence outcomes
Negative Logits
.infinity     -0.16
ifen          -0.16
avery         -0.15
chwitz        -0.15
pires         -0.14
mediately     -0.14
ragments      -0.14
naturally     -0.14
lass          -0.14
ypse          -0.13
Positive Logits
simple        0.30
simples       0.27
simple        0.26
simply        0.25
factors       0.25
Simply        0.25
combination   0.24
Simple        0.23
Simply        0.23
Factors       0.22
Activations Density: 0.048%