INDEX
Explanations
positive descriptors or phrases indicating quality and suitability
Negative Logits
917         -0.14
298         -0.14
rire        -0.14
552         -0.14
enment      -0.14
TypeEnum    -0.14
830         -0.13
488         -0.13
453         -0.13
893         -0.13
Positive Logits
way          0.37
place        0.27
addition     0.27
choice       0.24
excuse       0.23
ways         0.23
chance       0.22
opportunity  0.22
.way         0.22
fit          0.21
Activations Density: 0.116%