Explanations
positive experiences and moments of relaxation or change
Negative Logits
ULO        -0.18
Bas        -0.16
ivot       -0.15
Ramp       -0.15
Baseline   -0.14
opup       -0.14
ela        -0.14
.onView    -0.14
amps       -0.13
inois      -0.13
Positive Logits
instead      0.15
zav          0.15
ÅĻÃŃm        0.15
completion   0.15
ãĥ¼ãĥ³       0.15
ivent        0.15
·æĸ°         0.15
odos         0.15
chu          0.15
unlike       0.14
Activations Density: 0.305%