INDEX
Explanations
the presence of a specific start token in the text
Follows initial capital letter abbreviations
Negative Logits
Token      Logit
Monday     -0.88
Monday     -0.88
lundi      -0.76
MONDAY     -0.74
MONDAY     -0.67
Lundi      -0.67
monday     -0.65
lunedì     -0.64
Montag     -0.64
Mondays    -0.61
Positive Logits
Token      Logit
Fridays    0.82
Friday     0.80
Friday     0.75
Fri        0.73
Fri        0.72
friday     0.68
friday     0.66
FRIDAY     0.61
viernes    0.58
venerdì    0.57
Activations Density 0.049%
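A density figure of this kind is commonly read as the fraction of tokens on which the feature is active. The page does not define it, so the sketch below assumes that reading; `activation_density` and its `threshold` parameter are hypothetical names chosen for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "active" means strictly greater than the threshold
    (zero by default). The dashboard's own definition may differ.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Illustrative data only: 5 active tokens out of 10,000.
sample = [0.0] * 9995 + [1.2] * 5
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.049% would correspond to roughly 49 active tokens per 100,000.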