INDEX
Explanations
words related to a specific direction, often indicating a negative connotation
references to the concept of direction
Negative Logits
nai      -0.76
LU       -0.75
bians    -0.74
tein     -0.73
TL       -0.70
TED      -0.69
VA       -0.66
zb       -0.66
enos     -0.66
Prof     -0.65
Positive Logits
direction      1.31
directions     1.22
direction      0.86
finding        0.85
eering         0.82
naire          0.77
Direction      0.76
arity          0.74
alignment      0.73
orientation    0.73
Activations Density: 0.010%