INDEX
Explanations
phrases related to legal and financial claims
Negative Logits
ugl      -0.17
alle     -0.16
uet      -0.15
ikh      -0.15
andin    -0.15
udic     -0.15
stupid   -0.15
chimp    -0.14
uder     -0.14
atar     -0.14
Positive Logits
suspect     0.27
fatally     0.22
faulty      0.21
sketch      0.21
flawed      0.21
defective   0.20
fault       0.19
Fault       0.19
fall        0.18
bere        0.17
Activations Density: 0.167%