Explanations
instances of fleeing or escaping actions
Negative Logits
Exacts  -0.41
المعيارى  -0.39
ganggu  -0.36
віч  -0.36
Uniform  -0.35
GLint  -0.35
uniform  -0.34
nmax  -0.34
ops  -0.33
ochi  -0.33
Positive Logits
away  0.77
departure  0.73
oprot  0.71
escape  0.70
Flucht  0.69
fleeing  0.68
離開  0.67
flee  0.66
leaving  0.65
exitRule  0.65
Activations Density 0.423%