Explanations
numeric values stated as words
specific numerical values and their associations within various contexts
Negative Logits
Token           Logit
osaurs          -0.47
favor           -0.43
pres            -0.43
î               -0.41
éĸ              -0.39
rocket          -0.39
kj              -0.39
ildo            -0.38
minecraft       -0.38
LOS             -0.38
Positive Logits
Token           Logit
etheless        0.67
tradem          0.61
awaru           0.60
rul             0.59
corrid          0.57
destro          0.56
DragonMagazine  0.55
theless         0.55
skelet          0.54
challeng        0.53
Activations Density 1.972%