Explanations
occurrences of directional or positional words and related phrases
Negative Logits
Hibernate   -0.17
Helmet      -0.17
Hover       -0.16
Hop         -0.16
Holder      -0.16
isis        -0.16
Hover       -0.16
Hobby       -0.16
Hint        -0.15
hinges      -0.15
Positive Logits
hand        0.84
hand        0.66
-hand       0.65
Hand        0.62
Hand        0.60
手          0.58
_hand       0.56
HAND        0.56
.hand       0.55
手          0.50
Activations Density: 0.122%