INDEX
Explanations
prepositions or directional words related to physical movement
Negative Logits
Token            Logit
ancial           -0.83
incent           -0.79
orld             -0.77
wcs              -0.72
discrimination   -0.72
GV               -0.71
oples            -0.70
iners            -0.68
interstitial     -0.68
Thousands        -0.67
Positive Logits
Token            Logit
him              1.02
hers             1.01
Pyrrha           0.98
Jaune            0.97
her              0.97
Korra            0.96
Elsa             0.91
his              0.90
me               0.88
Chloe            0.87
Activations Density 0.162%