INDEX
Explanations
emotional adjectives expressing varying degrees of difficulty, rarity, and risk
Negative Logits
ummy          -0.16
oro           -0.16
830           -0.14
uzzi          -0.14
errick        -0.14
ungs          -0.14
eydi          -0.14
opensource    -0.13
303           -0.13
aeda          -0.13
Positive Logits
thing         0.28
enough        0.23
Thing         0.19
indeed        0.18
addition      0.18
thing         0.17
affair        0.17
way           0.17
phenomena     0.16
prospect      0.16
Activations Density: 0.130%