Explanations
words and phrases related to future events or actions
Negative Logits
Token       Logit
Vive        -0.32
Balanced    -0.30
mates       -0.30
Tib         -0.28
Gob         -0.28
azar        -0.28
Pug         -0.28
shared      -0.28
Pend        -0.28
unts        -0.28
Positive Logits
Token       Logit
?:          0.40
step        0.36
!?          0.35
?,          0.34
unfold      0.34
emerge      0.33
steps       0.33
?           0.33
?!          0.33
.<          0.32
Activations Density: 11.515%