INDEX
Explanations
future situations and past feelings
Negative Logits
Token          Value
Unable         0.30
Unable         0.29
Consider       0.29
choose         0.29
providing      0.28
Expect         0.28
synchronized   0.27
Associ         0.27
Choose         0.27
fournit        0.27
Positive Logits
Token          Value
really         0.41
worked         0.40
hurts          0.37
mattered       0.37
feels          0.36
happened       0.36
REALLY         0.36
creep          0.34
resonate       0.34
works          0.33
Activations Density: 0.063%