Explanations
references to actions and observations
Negative Logits
Token          Logit
OwnProperty    -0.16
ounder         -0.16
zes            -0.15
observations   -0.15
Bristol        -0.14
esian          -0.14
vert           -0.14
avra           -0.14
jišť           -0.14
Observ         -0.14
Positive Logits
Token          Logit
evidence       0.18
faces          0.18
signs          0.18
burgh          0.17
face           0.16
Signs          0.16
iface          0.15
Äįku           0.15
/sm            0.15
results        0.15
Activations Density: 0.093%