INDEX
Explanations
questions or phrases that express a sense of inquiry about reasons or explanations
Negative Logits
rollup    -0.49
CBM       -0.48
esu       -0.47
GSC       -0.47
UCC       -0.47
CMC       -0.47
IMC       -0.45
DRS       -0.45
ennan     -0.45
GCM       -0.44
Positive Logits
why       1.85
why       1.70
Why       1.61
Why       1.60
WHY       1.53
WHY       1.47
pourquoi  1.38
waarom    1.27
Warum     1.27
Pourquoi  1.27
Activations Density 0.013%