INDEX
Explanations
words related to legal and criminal activities
instances of legal terminology or references to laws
Negative Logits
favorably      -1.11
favor          -1.08
favoring       -0.98
subsidized     -0.96
favored        -0.95
favors         -0.92
flavor         -0.90
willfully      -0.89
mustache       -0.88
categorized    -0.87
Positive Logits
Scroll         1.78
BBC            1.72
Labour         1.70
Shape          1.68
Scotland       1.67
However        1.64
Speaking       1.64
Writing        1.60
Professor      1.59
Mr             1.59
Activations Density: 0.292%