Explanations
words related to injuries or potential harm
keywords related to injuries, harm, and their consequences
Negative Logits
last            -0.85
achment         -0.78
soDeliveryDate  -0.76
itta            -0.73
atar            -0.73
ãĤ¨             -0.71
liam            -0.71
ku              -0.70
yssey           -0.67
adle            -0.67
Positive Logits
nor             1.36
anymore         1.17
whatsoever      1.13
slightest       0.81
anybody         0.75
anything        0.75
laure           0.74
anywhere        0.72
Laughs          0.68
except          0.68
Activations Density: 0.398%