INDEX
Explanations
numerical options or choices in various contexts
Negative Logits
aths        -0.76
mouth       -0.76
mination    -0.73
culosis     -0.72
thing       -0.71
ended       -0.69
rowth       -0.68
DEBUG       -0.68
kamp        -0.67
ibur        -0.66
Positive Logits
amongst      0.94
assorted     0.92
among        0.91
list         0.88
randomly     0.88
various      0.83
lists        0.82
alphabet     0.81
random       0.80
repertoire   0.79
Activations Density 0.107%