Explanations
phrases that denote various types of abilities and capacities
Negative Logits
ishly    -0.18
ilo      -0.17
راÙĨ      -0.17
ers      -0.17
inee     -0.16
rej      -0.15
ery      -0.15
GRAT     -0.15
eters    -0.15
Ậ        -0.15
Positive Logits
-bodied  0.30
/dis     0.17
bod      0.17
ius      0.16
hood     0.16
son      0.16
unch     0.16
ies      0.16
uali     0.16
ty       0.15
Activations Density 0.029%