INDEX
Explanations
expressions related to intensity and qualities
Negative Logits
تضيفلها    -0.62
not    -0.52
")[    -0.51
FLICT    -0.51
dula    -0.50
setUse    -0.49
})));    -0.49
atore    -0.49
"))    -0.48
ile    -0.48
Positive Logits
pleaſure    0.80
greateſt    0.75
myſelf    0.75
itſelf    0.74
Majefty    0.69
themſelves    0.68
himſelf    0.68
againſt    0.67
purpoſe    0.65
دانشنامهٔ    0.65
Activations Density 0.373%