INDEX
Explanations
the word "alpha" at varying activation levels
references to "alpha" ratings or levels in various contexts
Negative Logits
Token      Logit
Ö¼         -0.98
ROR        -0.87
ding       -0.82
SIGN       -0.79
enegger    -0.78
ITE        -0.76
enance     -0.75
ronic      -0.75
DM         -0.74
vous       -0.73
Positive Logits
Token      Logit
Centauri   1.15
alpha      1.04
predator   0.79
males      0.79
fide       0.76
Moroc      0.75
predators  0.75
male       0.73
xual       0.73
alpha      0.73
Activations Density 0.006%
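
As a rough illustration of what a density figure like this could mean, the sketch below assumes that "density" is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the page itself.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: density = (number of firing tokens) / (total tokens sampled).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 6 of 100,000 token positions.
sample = [0.0] * 99_994 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.006%
```

Under this assumed definition, a density of 0.006% corresponds to roughly 6 firing tokens per 100,000, which would indicate a very sparse feature.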