INDEX
Explanations
phrases and concepts related to permanence and existence
Negative Logits
did      -0.21
did      -0.19
does     -0.18
does     -0.17
doing    -0.17
do       -0.17
DOES     -0.16
Did      -0.15
DID      -0.15
.did     -0.15
Positive Logits
bec       0.22
becomes   0.20
becoming  0.20
become    0.20
became    0.19
Bec       0.19
Become    0.17
也是      0.17
Become    0.17
IS        0.17
Activations Density 0.195%