Explanations
sentences describing legal actions and sentences
Negative Logits
ヤ            -0.71
edia         -0.66
◼            -0.64
verified     -0.63
mom          -0.61
atically     -0.61
Near         -0.61
events       -0.60
editor       -0.60
scouts       -0.60
Positive Logits
prison         1.22
jail           1.14
imprisonment   1.13
confinement    1.09
probation      1.04
incarceration  1.04
Detention      1.03
prison         0.99
incarcer       0.98
detention      0.97
Activations Density 0.135%