Explanations
phrases related to guidance and support in various contexts
Negative Logits
udd             -0.15
Confirmation    -0.14
è§              -0.13
æĮĻ             -0.13
íĥĿ             -0.13
çľ              -0.13
alice           -0.13
GLfloat         -0.13
safely          -0.12
widespread      -0.12
Positive Logits
discipline      0.35
discipl         0.34
disciplinary    0.32
correction      0.31
span            0.31
reb             0.31
lect            0.30
stern           0.30
spanking        0.30
disciplines     0.28
Activations Density: 0.549%