INDEX
Explanations
directional terms and references to navigation
Negative Logits
Token         Logit
lest          -0.15
utr           -0.15
ledge         -0.15
/interface    -0.14
inson         -0.14
Hole          -0.14
hole          -0.14
Wide          -0.14
majority      -0.14
asaki         -0.14
Positive Logits
Token         Logit
ward          0.28
wards         0.27
bound         0.25
toward        0.21
towards       0.20
Bound         0.19
WARDS         0.17
WARD          0.17
direction     0.17
Ñıж           0.15
Activations Density 0.024%
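
The page does not say how the density figure is computed. A common reading is that it is the share of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the threshold parameter, and the synthetic data are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which
    the feature fires. This is an illustrative definition, not one
    confirmed by the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 100,000 token positions, 24 of which activate.
sample = [0.0] * 100_000
for i in range(24):
    sample[i * 1000] = 1.0

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.024% would correspond to the feature firing on roughly 24 out of every 100,000 tokens.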