Explanations
repeated usage of the word "always."
Negative Logits
Muffins      -0.84
parís        -0.82
âgées        -0.79
bombe        -0.78
er           -0.77
Niels        -0.74
Lippincott   -0.73
Noire        -0.72
vnto         -0.70
ه            -0.69
Positive Logits
always       1.92
Always       1.85
ALWAYS       1.77
always       1.77
Always       1.76
ALWAYS       1.63
siempre      1.30
alway        1.28
Siempre      1.26
Siempre      1.25
Activations Density: 0.071%
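
As a rough illustration of what a density figure like this usually means, the sketch below assumes that activation density is the fraction of token positions on which the feature's activation is above zero. That definition, the function name activation_density, and the sample data are assumptions made for illustration; the dashboard itself does not state how the number was computed.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical data: one feature's activations over 10,000 tokens, 7 of them non-zero.
example = [0.0] * 9993 + [0.5, 1.2, 3.1, 0.8, 2.4, 0.1, 4.0]
density = activation_density(example)
print(f"{density:.3%}")
```

Under this reading, a density of 0.071% would mean the feature fires on roughly 7 out of every 10,000 tokens, which fits a feature that responds to one specific word.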