INDEX
Explanations
instances of the word "onto" along with related phrases conveying movement or direction
Negative Logits
Token    Logit
X        -0.81
h        -0.77
r        -0.74
h        -0.71
r        -0.71
m        -0.71
E        -0.69
Fields   -0.69
m        -0.69
M        -0.69
Positive Logits
Token    Logit
onto     1.46
onto     1.10
Ont      1.03
myſelf   1.00
Ont      1.00
₂+       0.98
Cuánt    0.96
.}~\     0.93
gether   0.93
chofe    0.92
Activations Density: 0.071%