INDEX
Explanations
verbs related to communication or interaction
Negative Logits
Token        Logit
suppose      -0.62
terday       -0.56
namely       -0.54
anchored     -0.54
hosted       -0.52
applied      -0.52
viz          -0.52
awaited      -0.51
she          -0.51
requiring    -0.50
Positive Logits
Token        Logit
oneself      1.05
ourselves    1.03
yourselves   1.03
yourself     1.00
them         0.93
ulate        0.89
themselves   0.88
ãĥĥãĥī       0.82
igate        0.81
iate         0.80
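The token ãĥĥãĥī in the list above looks like the printable stand-in form that byte-level BPE vocabularies use for raw UTF-8 bytes, not encoding damage. The sketch below assumes (the page does not say so) that the model uses a GPT-2 style byte-level tokenizer, and rebuilds that byte-to-character table to map such a token back to readable text. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable character table.

    Assumption: the vocabulary follows the GPT-2 byte-level BPE convention.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token: str) -> str:
    """Map a vocabulary token string back to the text it stands for."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    raw = bytes(inverse[ch] for ch in token)
    # Tokens may end in the middle of a UTF-8 sequence, so do not raise on errors.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥĥãĥī"))
```

Under that assumption the token decodes to the katakana pair ッド. Plain ASCII tokens such as `igate` pass through unchanged.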
Activations Density: 2.402%