Explanations
situations or attributes that are extreme or incredible
phrases indicating exaggerated or implausible scenarios
Negative Logits
kind            -0.72
largeDownload   -0.65
aldi            -0.62
updated         -0.61
ancest          -0.60
chwitz          -0.58
Products        -0.57
ARA             -0.57
atis            -0.57
Cake            -0.56
Positive Logits
withstand       0.83
compete         0.80
feas            0.78
contemplate     0.78
Fail            0.76
anymore         0.75
outweigh        0.74
survive         0.74
tolerate        0.73
bother          0.73
Activations Density: 0.087%
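The density figure can be read as the share of token positions on which the feature fires at all. The page does not say how it is computed, so the following is only a minimal sketch under that assumption: `activation_density`, its `threshold` parameter, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active (activation > threshold). The source page does not define it.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.3] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.087% would mean the feature is active on roughly 87 of every 100,000 token positions, which is consistent with a sparse, narrowly scoped feature.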