INDEX
Explanations
specific clauses with restrictive language or imposing limitations
phrases related to policies and regulations
Negative Logits
noticed          -0.74
Horizons         -0.68
Ready            -0.66
odiac            -0.66
Grip             -0.64
DragonMagazine   -0.64
Ready            -0.64
GGGGGGGG         -0.63
nice             -0.63
Jol              -0.63
Positive Logits
violate          1.46
jeopard          1.41
undermine        1.41
infring          1.39
endanger         1.39
impede           1.31
unnecessarily    1.26
adversely        1.25
deprive          1.24
harm             1.23
Activations Density: 0.322%