Explanations
phrases indicating copyright or ownership of content
Negative Logits
ader      -0.15
ahl       -0.15
alous     -0.14
off       -0.14
UNCH      -0.14
bomb      -0.14
ones      -0.14
образ     -0.14
urger     -0.14
ple       -0.13
Positive Logits
rights          0.32
Rights          0.31
RIGHTS          0.28
Rights          0.22
-rights         0.22
права           0.21
_rights         0.20
rights          0.20
trademarks      0.17
CharacterSet    0.16
Activations Density: 0.008%