INDEX
Explanations
legal cases and proceedings
Negative Logits
agra    -0.73
gone    -0.67
yre     -0.64
laure   -0.60
cius    -0.58
Polk    -0.57
pass    -0.55
aye     -0.55
chery   -0.55
pire    -0.54
Positive Logits
],[     1.12
]).     1.12
]       1.11
][      1.03
]),     1.02
]:      0.99
])      0.98
].      0.98
]"      0.96
]);     0.88
Activations Density 1.940%