INDEX
Explanations
references to interpersonal relationships and personal perspectives
Negative Logits
odable        -0.16
ãĥ¼ãĤº        -0.14
ahl           -0.14
apore         -0.14
thumb         -0.13
μοί           -0.13
ê°Ī           -0.13
èª            -0.13
ľ             -0.13
antar         -0.13
Positive Logits
accomplished  0.33
accomplish    0.31
accompl       0.28
doing         0.27
done          0.25
doing         0.25
Doing         0.23
Doing         0.21
_done         0.20
accom         0.20
Activations Density: 0.203%