Explanations
various forms of verbs related to actions or processes
research and ongoing work
Negative Logits
Token       Logit
lendemain   -0.32
">—         -0.27
MarshalTo   -0.24
mattina     -0.21
tutto       -0.21
machung     -0.20
nictwa      -0.20
מש          -0.19
respald     -0.19
TagMode     -0.19
Positive Logits
Token       Logit
<unused41>  0.87
<unused14>  0.87
<unused8>   0.87
[@BOS@]     0.87
<unused79>  0.87
<unused23>  0.87
<unused51>  0.87
<unused28>  0.87
<unused3>   0.87
<unused16>  0.87
Activations Density 0.039%