Explanations
incidents involving actions that can cause physical harm or distress
Negative Logits
olg            -0.51
Background     -0.49
paña           -0.49
feer           -0.47
nahilalakip    -0.47
Vort           -0.46
orie           -0.46
background     -0.45
Ag             -0.45
donate         -0.45
Positive Logits
complexContent     0.78
addContainerGap    0.66
arrival            0.66
ViewFeatures       0.65
arrives            0.65
arrive             0.62
UnknownFieldSet    0.62
contextLoads       0.61
arrivals           0.60
useStyles          0.60
Activations Density: 0.223%