INDEX
Explanations
items or concepts that can be of varied types or purposes
terms indicating inclusivity or the ability to use various options
Negative Logits
alus      -0.83
illard    -0.83
sson      -0.71
SA        -0.69
mand      -0.67
irth      -0.65
BA        -0.65
opsis     -0.63
pu        -0.63
pee       -0.62
Positive Logits
imaginable    1.22
THING         1.08
conceivable   1.00
where         0.97
body          0.95
else          0.89
sorts         0.87
kind          0.85
thin          0.82
sort          0.78
Activations Density 0.064%