INDEX
Explanations
words related to fitting a description or role, especially in a professional or physical context
Negative Logits
ãĥ£        -0.67
authorize  -0.65
inciting   -0.63
wrest      -0.63
unable     -0.62
refusing   -0.61
icides     -0.61
downed     -0.61
oking      -0.61
iger       -0.60
Positive Logits
precon       1.06
snug         0.94
criteria     0.92
stereotype   0.90
mold         0.87
neatly       0.87
ahime        0.85
niche        0.85
stereotypes  0.84
mould        0.83
Activations Density: 0.080%