INDEX
Explanations
active verbs and phrases related to work, learning, and social interactions
verbs and what follows them
Negative Logits
do       -0.37
I        -0.34
when     -0.34
l        -0.33
if       -0.32
m        -0.31
match    -0.31
T        -0.31
when     -0.31
my       -0.31
Positive Logits
surla        0.89
ロウィン     0.85
ðsíða        0.79
majánló      0.77
ſicht        0.76
queſta       0.75
<unused14>   0.74
<unused41>   0.74
<unused8>    0.74
[@BOS@]      0.74
Activations Density 0.300%