INDEX
Explanations
dates in a specific format - month, day, and year
instances of the letter "O" in varying contexts
Negative Logits
ゼウス  -0.84
wip  -0.83
ーティ  -0.82
yards  -0.74
à¨  -0.74
フ  -0.72
ь  -0.71
sinks  -0.70
サーティワン  -0.69
ーテ  -0.65
Positive Logits
vernight  1.14
culus  1.12
tto  1.11
ceans  1.10
lymp  1.10
zzy  1.08
scill  1.06
oga  1.06
atmeal  1.04
oops  1.00
Activations Density 0.020%