INDEX
Explanations
phrases related to legal definitions, regulations, and criteria
phrases related to legal definitions and gender-related concepts
Negative Logits
brass       -0.71
snapped     -0.69
jer         -0.68
stash       -0.66
buddies     -0.65
Geral       -0.65
Lyft        -0.65
bartender   -0.65
iPhones     -0.65
Winc        -0.64
Positive Logits
impair      1.06
adequ       0.96
reasonable  0.95
impairment  0.93
sufficient  0.90
worthiness  0.90
ocative     0.88
reason      0.88
cellence    0.87
acceptable  0.87
Activations Density: 0.721%