INDEX
Explanations
phrases related to actions or processes
phrases indicating relationships or connections
Negative Logits
yd       -0.73
ophy     -0.71
cci      -0.66
JD       -0.65
amiya    -0.65
tri      -0.65
È        -0.64
ournal   -0.63
chant    -0.62
cro      -0.62
Positive Logits
pload     0.74
theirs    0.74
Havana    0.69
Fidel     0.66
hers      0.65
slavery   0.65
rubble    0.64
Pin       0.64
rout      0.60
steroids  0.60
Activations Density: 0.417%