INDEX
Explanations
phrases indicating negative states or situations
Negative Logits
externi      -0.37
BeginInit    -0.34
Kanpo        -0.33
テー          -0.32
iam          -0.30
FXML         -0.30
vel          -0.29
TestTools    -0.29
pulled       -0.29
دیکھیے       -0.29
Positive Logits
nowhere      1.00
bounds       0.89
sight        0.86
whack        0.83
sync         0.82
reach        0.75
Bounds       0.71
harms        0.71
             0.69
synch        0.68
Activations Density 0.193%