INDEX
Explanations
common punctuation and structure elements that indicate questions or transitions in text
Negative Logits
Promise               -0.15
ersh                  -0.14
lam                   -0.14
spoiler               -0.14
(token not rendered)  -0.13
رÛĮب                  -0.13
ãĥ©ãĤ¤ãĥ³             -0.13
ç¤                    -0.13
scram                 -0.13
Kund                  -0.13
Positive Logits
itsu        0.15
olds        0.15
urons       0.14
malink      0.14
anim        0.14
lichkeit    0.13
HK          0.13
gw          0.13
onym        0.13
_IMPL       0.13
Activations Density 0.004%
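
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of sampled token positions on which the feature fires at all. This is an assumption about the metric, not a statement of how this dashboard computes it. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard's exact definition is not given.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 4 of 100,000 token positions.
sample = [0.0] * 99_996 + [1.2, 0.7, 3.1, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.004%
```

A density this low would mean the feature is active on roughly one token in 25,000, which is typical of a sparse, highly selective feature.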