INDEX
Explanations
phrases indicating prevalence or quantification
Negative Logits
原始内容存档于  -0.66
UnusedPrivate  -0.52
wanted  -0.48
HOT  -0.48
ToLeft  -0.48
as  -0.47
الض  -0.47
مق  -0.47
مق  -0.46
Hozzáférés  -0.46
Positive Logits
ſelf  0.94
pleaſure  0.94
Jefus  0.92
itſelf  0.92
greateſt  0.91
purpoſe  0.87
beſt  0.86
poffible  0.86
ſtre  0.85
tranſ  0.85
Activations Density 0.232%