INDEX
Explanations
instances where rules or violations are mentioned
phrases related to legal or regulatory actions
Negative Logits
uador           -0.55
DragonMagazine  -0.54
bably           -0.53
©¶æ             -0.52
akespeare       -0.49
Architects      -0.47
ventus          -0.47
ĸļ              -0.46
icho            -0.45
ibaba           -0.45
Positive Logits
.?              1.29
.(              1.23
.               1.23
!.              1.22
.</             1.19
.'              1.17
.�              1.16
anymore         1.14
%.              1.12
.–              1.12
Activations Density: 1.441%