NEGATIVE LOGITS
spilling      0.38
summing       0.36
witch         0.35
idea          0.34
consolation   0.34
useDispatch   0.34
dealing       0.33
이야          0.33
stirring      0.32
molding       0.32
POSITIVE LOGITS
trapped       0.91
traps         0.86
Trap          0.80
Trap          0.79
trap          0.79
trap          0.77
陷            0.68
trapping      0.67
entrap        0.54
captive       0.51
Activations Density 0.000%