INDEX
Explanations
phrases that express a purpose or reason behind an action
Negative Logits
_UNUSED   -0.14
endif     -0.14
angu      -0.14
iske      -0.14
erner     -0.14
όν        -0.14
اند       -0.14
asts      -0.14
aea       -0.13
алу       -0.13
Positive Logits
fun       0.25
kicks     0.23
aging     0.21
FUN       0.20
_fun      0.20
Fun       0.20
granted   0.20
Fun       0.19
sport     0.19
pleasure  0.18
Activations Density 0.160%