INDEX
Explanations
phrases related to decision-making and personal experiences
Negative Logits
currently      -0.78
presently      -0.77
Dialogue       -0.70
now            -0.69
anymore        -0.68
now            -0.67
currently      -0.66
ethy           -0.65
arta           -0.65
ulum           -0.64
Positive Logits
originally     1.18
yesterday      1.05
last           1.03
earlier        1.02
previously     0.98
initially      0.92
hes            0.91
recently       0.87
wolves         0.87
unsuccessful   0.82
Activations Density: 2.929%