INDEX
Explanations
words related to reflection and self-assessment
Negative Logits
mouths      -0.42
mouth       -0.41
Nap         -0.39
WARS        -0.39
putString   -0.36
beef        -0.36
Wars        -0.35
optString   -0.35
StringTo    -0.33
wikia       -0.33
Positive Logits
reflection    2.20
Reflection    2.00
reflection    1.94
reflections   1.91
reflected     1.84
reflect       1.82
Reflection    1.79
reflecting    1.74
reflect       1.71
Reflect       1.70
Activations Density: 0.052%