INDEX
Explanations
words related to cardinal or ordinal directions
directional terms related to travel or movement
Negative Logits
aints       -0.67
ellation    -0.64
arth        -0.61
Owner       -0.60
lest        -0.59
tie         -0.59
Tale        -0.58
acea        -0.58
Scal        -0.57
Return      -0.57
Positive Logits
wards       0.83
WARD        0.81
unnoticed   0.79
ModLoader   0.79
iless       0.78
toward      0.73
towards     0.70
noticed     0.70
stairs      0.69
stairs      0.68
Activations Density: 0.113%