Explanations
phrases or terms related to legal matters and regulations that would be subject to scrutiny or review
Negative Logits
Token         Logit
latt          -0.60
yip           -0.60
Zur           -0.59
Zimmer        -0.58
Sutherland    -0.58
cane          -0.58
ttp           -0.56
Euras         -0.56
kr            -0.55
Tib           -0.55
Positive Logits
Token         Logit
ively         1.27
ivity         1.18
ivist         1.02
ioned         1.00
ivating       1.00
ive           0.97
ivities       0.96
itatively     0.94
iving         0.93
Reviewer      0.93
Activations Density: 0.015%