INDEX
Explanations
phrases indicating ongoing actions or states of being related to experience and participation
Negative Logits
illy        -0.15
دید         -0.15
ulfilled    -0.14
벽          -0.14
に入        -0.14
erece       -0.14
ーネ        -0.14
ilateral    -0.14
:"-"`↵      -0.14
ुव          -0.14
Positive Logits
since       0.26
since       0.20
Since       0.19
Since       0.19
ince        0.18
منذ         0.18
assa        0.17
aga         0.16
seit        0.16
existence   0.16
Activations Density 0.172%