INDEX
Explanations
words related to location and proximity
Negative Logits
enez       -0.63
ãĥ£        -0.63
tec        -0.62
goodbye    -0.61
FORE       -0.61
rey        -0.60
lan        -0.60
ennes      -0.59
ense       -0.59
Ger        -0.59
Positive Logits
bounds         0.87
reach          0.87
ciating        0.83
isine          0.83
¥ŀ             0.82
ĵĺ             0.81
ĨĴ             0.80
Reach          0.79
ģĸ             0.78
parentheses    0.78
Activations Density: 0.026%