Explanations
phrases indicating accessibility to the public or various groups
phrases indicating openness and accessibility to the public
Negative Logits
Token    Logit
å¦       -0.72
Dur      -0.71
Slug     -0.68
yan      -0.64
pring    -0.64
wise     -0.63
bowl     -0.63
arse     -0.62
chem     -0.61
urga     -0.61
Positive Logits
Token             Logit
interpretation    0.96
accommodate       0.87
scrutiny          0.79
newcomers         0.76
outsiders         0.74
Flavoring         0.74
criticism         0.73
explore           0.71
experimentation   0.69
exploration       0.67
Activations Density: 0.122%