INDEX
Explanations
giving advice or recommendations
suggesting future actions
stating assumptions or implications
Negative Logits
0.25
succeed
0.24
estimate
0.24
\
0.23
0.23
```
0.23
↵
0.23
regarded
0.22
envisioned
0.22
yw
0.22
Positive Logits
having  0.66
getting  0.55
taking  0.55
putting  0.53
bringing  0.52
having  0.51
obtaining  0.50
placing  0.49
storing  0.47
داشتن  0.47
Activations Density 0.335%