INDEX
Explanations
phrases that describe instances of putting a positive spin on experiences or concepts
Negative Logits
loom    -0.18
etine   -0.16
fold    -0.16
etchup  -0.15
igh     -0.15
541     -0.15
zug     -0.15
ola     -0.14
883     -0.14
Fold    -0.14
Positive Logits
ulus      0.16
ibase     0.15
RID       0.14
Frontier  0.14
anyways   0.13
vic       0.13
æĹĹ       0.13
elier     0.13
ulumi     0.13
vic       0.13
Activations Density 0.095%