Explanations
names that are similar to well-known figures or terms that may carry controversial or sensitive associations
Negative Logits
Token             Logit
!:                -0.46
spoilers          -0.46
Canaver           -0.45
Patreon           -0.43
DragonMagazine    -0.41
FANTASY           -0.41
Originally        -0.41
cautiously        -0.41
tattoos           -0.40
digitally         -0.40
Positive Logits
Token             Logit
)).               0.92
]).               0.87
.).               0.81
%).               0.79
]."               0.77
]),               0.76
)."               0.76
).                0.73
)),               0.71
));               0.69
Activations Density: 2.096%