Explanations
phrases related to expectations and actions driven by needs or desires
Negative Logits
Token       Logit
oeff        -0.17
inding      -0.17
lein        -0.15
oom         -0.15
T           -0.14
emme        -0.14
Vide        -0.14
emachine    -0.14
Propel      -0.13
ober        -0.13
Positive Logits
Token       Logit
doing       0.35
doing       0.33
done        0.33
做          0.31
Doing       0.29
Doing       0.28
done        0.24
ทำ          0.23
Done        0.22
-done       0.20
Activations Density: 0.228%