Explanations
phrases related to significant events or actions
references to significant actions and their consequences
Negative Logits
sleeps     -0.76
tains      -0.64
waits      -0.63
Pros       -0.62
è£ıè       -0.62
Uses       -0.60
resides    -0.59
craw       -0.59
likes      -0.59
hands      -0.58
Positive Logits
illustrate    1.53
underscore    1.46
demonstrate   1.38
indicate      1.38
remind        1.34
imply         1.33
undermine     1.31
signify       1.29
reinforce     1.28
constitute    1.27
Activations Density: 0.371%