Explanations
phrases indicating time periods or durations
during a time
Negative Logits
Token       Logit
anſ         -0.68
Majefty     -0.68
houſe       -0.68
purpoſe     -0.67
ſta         -0.67
ſtand       -0.66
ſte         -0.66
pleaſure    -0.66
dieux       -0.63
perſon      -0.60
Positive Logits
Token       Logit
during      1.91
during      1.79
DURING      1.64
During      1.61
During      1.59
durante     1.50
durante     1.47
tijdens     1.38
Durante     1.31
Durante     1.27
Activations Density 0.037%