INDEX
Explanations
words related to legal or formal terms related to judgment or decision making
terms related to legal and medical processes or outcomes
Negative Logits
unh        -0.66
pepper     -0.62
ev         -0.62
squat      -0.61
skate      -0.59
near       -0.58
unarmed    -0.58
squats     -0.58
Gr         -0.58
WA         -0.57
Positive Logits
ication     4.53
ications    3.45
icated      3.11
icating     2.90
icate       2.76
icates      2.60
icator      2.45
icators     2.19
icative     2.02
icable      1.68
Activations Density 0.005%