Explanations
questions and requests for clarification
Negative Logits
>=",          -0.84
")){          -0.84
%");          -0.82
OOTDTY        -0.82
"):           -0.81
Παραπομπές    -0.81
NUMX          -0.80
'])){         -0.79
%</           -0.76
'):           -0.75
Positive Logits
Anyway        0.65
Anyway        0.65
Sorry         0.64
sorry         0.64
So            0.63
↵             0.59
So            0.58
I             0.58
Maybe         0.55
But           0.55
Activations Density: 0.374%