INDEX
Explanations
instances where someone is noticing or being noticed
instances of the word "notice" and its variations
Negative Logits
Token     Logit
raft      -0.80
sis       -0.78
adr       -0.76
ulum      -0.74
quer      -0.74
export    -0.72
venge     -0.72
uga       -0.71
prep      -0.70
wives     -0.70
Positive Logits
Token             Logit
how               0.88
discrepancies     0.78
similarities      0.76
flies             0.71
inconsistencies   0.71
noticing          0.70
spikes            0.68
something         0.68
Improvement       0.67
noticed           0.66
Activations Density 0.034%
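
The density figure above is commonly read as the fraction of token positions on which the feature fires at all. The page does not state how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a position counts as "active" when its activation is
    strictly greater than threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 3 of which activate the feature.
sample = [0.0] * 9997 + [1.2, 0.4, 2.7]
density = activation_density(sample)

# Format as a percentage, the same way the dashboard reports it.
print(f"{density:.3%}")
```

Under this reading, a density of 0.034% would mean the feature is active on roughly 34 of every 100,000 token positions.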