INDEX
Explanations
phrases related to activities done at the expense of something or someone else
references to costs or consequences incurred by others
Negative Logits
arre          -0.65
hairs         -0.64
oded          -0.62
ecided        -0.62
ruciating     -0.62
terday        -0.62
orious        -0.60
odes          -0.60
swick         -0.59
oggles        -0.59
Positive Logits
expense       0.86
detriment     0.79
士            0.73
ebook         0.73
altar         0.73
adle          0.72
disadvantage  0.71
hyde          0.70
ãĤ«           0.70
ARM           0.69
Activations Density 0.034%