INDEX
Explanations
phrases related to vertical or top-down orientation
Negative Logits
ushes      -0.71
...]       -0.62
KING       -0.61
arag       -0.60
ushima     -0.58
thy        -0.58
ugh        -0.58
english    -0.57
May        -0.56
amoto      -0.56
Positive Logits
stairs     0.75
version    0.70
versions   0.70
stock      0.63
spin       0.62
paddle     0.61
stretched  0.59
syndrome   0.59
overlay    0.58
cart       0.57
Activations Density: 12.274%