Explanations
instances of the word "still."
Negative Logits
Token    Logit
apro     -0.15
rtle     -0.15
ed       -0.15
ricks    -0.15
edBy     -0.14
ially    -0.14
chts     -0.14
chied    -0.14
ellen    -0.14
QC       -0.14
Positive Logits
Token    Logit
waters   0.26
water    0.26
birth    0.24
waters   0.23
ness     0.23
Waters   0.22
lifes    0.20
born     0.19
others   0.19
wagon    0.19
Activations Density: 0.016%