INDEX
Explanations
words related to control, direction, or dominance
terms related to obstacles or challenges
Negative Logits
ĻĤ          -0.64
Ń·          -0.60
zbek        -0.59
Ô           -0.59
¿½          -0.57
vae         -0.57
ccording    -0.56
Hass        -0.56
Worldwide   -0.55
Marketable  -0.53
Positive Logits
iest        1.17
liest       0.80
throne      0.71
osphere     0.70
hest        0.69
ultimate    0.68
aisle       0.68
antry       0.67
tray        0.66
basket      0.65
Activations Density: 0.649%