INDEX
Explanations
instances of the word "notice" and its variations
Negative Logits
Token    Logit
SSIP     -0.20
rose     -0.16
uppe     -0.15
/or      -0.15
iggers   -0.15
Ngb      -0.14
plet     -0.14
=".$_    -0.14
oper     -0.14
vigor    -0.14
Positive Logits
Token    Logit
ably     0.28
ingly    0.18
lessly   0.16
Notice   0.16
Notice   0.15
ances    0.15
erved    0.15
tir      0.15
lijk     0.14
abwe     0.14
Activations Density: 0.040%