INDEX
Explanations
phrases related to legal issues and consequences
the article "the"
Negative Logits
ontent        -0.73
berman        -0.70
ratulations   -0.69
ecided        -0.67
scape         -0.66
utan          -0.65
lly           -0.65
heit          -0.64
gans          -0.64
kson          -0.64
Positive Logits
outset          1.36
behest          1.24
same            1.16
forefront       1.06
periphery       1.01
slightest       1.01
expense         1.00
intersections   0.96
conclusion      0.96
earliest        0.96
Activations Density: 0.179%