INDEX
Explanations
references to "words" and their usage in various contexts
Negative Logits
becauſe         -0.72
zelve           -0.72
sphase          -0.71
aughter         -0.70
pleaſure        -0.67
Initialise      -0.66
первых          -0.65
rospy           -0.65
RegistryLite    -0.65
dieux           -0.64
Positive Logits
Wink                0.65
vers                0.58
ers                 0.57
ψ                   0.55
bote                0.54
[token not shown]   0.54
kali                0.53
esper               0.53
mata                0.53
Omega               0.53
Activations Density: 0.004%