INDEX
Explanations
phrases indicating ease or simplicity in various actions or situations
Negative Logits
Token     Logit
mtree     -0.16
INTR      -0.16
mts       -0.15
prot      -0.15
prot      -0.15
umpt      -0.14
ώς        -0.14
Прот      -0.14
irc       -0.14
Diamond   -0.14
Positive Logits
Token     Logit
antan     0.17
azen      0.16
boys      0.15
eyh       0.15
fell      0.15
dàng      0.15
uela      0.14
angelo    0.14
ausal     0.14
athan     0.13
Activations Density: 0.033%