INDEX
Explanations
references to personal websites and blogs
before colons, semicolons, or quotes
Reader, writer, apologize
Negative Logits
تقاوى  -0.56
nakalista  -0.56
незавершена  -0.52
محفوظة  -0.52
ValueStyle  -0.51
vPvB  -0.50
الإنجليزية  -0.48
🟤  -0.48
UnitTesting  -0.46
цездатний  -0.46
Positive Logits
apologize  0.35
Reader  0.35
مُعرِّف  0.35
Reader  0.33
something  0.33
apologizing  0.33
veggie  0.32
apologized  0.31
...  0.31
goddamn  0.31
Activations Density: 0.222%