Explanations
verbs that indicate recent actions or events
Negative Logits
Biôgrafia       -0.45
InitVars        -0.43
Життєпис        -0.41
nanti           -0.40
ptid            -0.35
Билгалдахарш    -0.34
hedef           -0.34
hrad            -0.33
Préférences     -0.33
eqn             -0.33
Positive Logits
Newly       0.94
appena      0.93
newly       0.93
newly       0.92
Newly       0.88
freshly     0.83
剛          0.82
刚          0.81
recién      0.81
للتو        0.80
Activations Density 0.241%
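
An activation density figure such as 0.241% is commonly read as the fraction of sampled tokens on which the feature fires at all. The source does not state how its figure was computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The source does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the source): 241 active tokens out of 100,000.
toy = [1.0] * 241 + [0.0] * 99_759
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.241% would mean the feature is active on roughly 1 token in 415, which fits a feature tied to a narrow set of words such as "newly" or "freshly".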