Explanations
phrases that emphasize continuity and persistence over time
Negative Logits
Token          Logit
bezeichneter   -0.64
précédente     -0.61
estekak        -0.60
клопе          -0.59
lenker         -0.57
Houſe          -0.57
erequisites    -0.57
חיצוניים       -0.57
первых         -0.55
Previous       -0.54
Positive Logits
Token          Logit
tutt           0.78
still          0.72
still          0.68
至今           0.65
今でも         0.64
今も           0.63
fortfarande    0.61
ัจ             0.59
geblieben      0.56
fortsatt       0.55
Activations Density: 0.224%