Explanations
the prefix "po-" associated with various activities or objects
terms related to physical punishment or harm
Negative Logits
owship         -0.92
ATIONAL        -0.76
hips           -0.75
IELD           -0.70
Mellon         -0.68
embodiments    -0.68
ITNESS         -0.67
AGES           -0.66
EDITION        -0.65
AGE            -0.65
Positive Logits
achers         0.96
pper           0.92
oper           0.92
pert           0.90
ppy            0.87
oting          0.86
ffer           0.86
Äį             0.86
etary          0.84
itives         0.84
Activation Density: 0.008%