INDEX
Explanations
variables and parameters related to dimensions and counts in code
Negative Logits
pity             -0.61
shame            -0.57
pity             -0.51
shame            -0.49
principalColumn  -0.48
Pity             -0.48
Shame            -0.48
afficheront      -0.47
insuffisamment   -0.46
Shame            -0.45
Positive Logits
+=               0.90
-=               0.63
+=               0.60
]+=              0.57
.=               0.50
/=               0.50
*=               0.47
|=               0.47
&=               0.47
///</            0.46
Activations Density 0.182%