Explanations
words related to movement towards a specific direction
occurrences of the word "towards."
Negative Logits
umm          -0.77
chell        -0.69
CD           -0.68
ItemImage    -0.67
nz           -0.66
RC           -0.65
tel          -0.65
cell         -0.64
nesia        -0.64
PIN          -0.64
Positive Logits
towards      1.22
wards        1.08
toward       1.07
Towards      0.99
ward         0.90
WARD         0.84
infinity     0.77
fulfil       0.75
vernment     0.73
itiveness    0.73
Activations Density: 0.013%