INDEX
Explanations
instances where there is a small or insignificant amount of something
phrases indicating scarcity or lack of something
Negative Logits
Token        Logit
ules         -0.83
olds         -0.78
agents       -0.78
adows        -0.76
asions       -0.76
mates        -0.76
hov          -0.75
ategories    -0.74
roo          -0.73
ħĭ           -0.72
Positive Logits
Token        Logit
incentive    1.15
indication   1.10
recourse     1.04
opportunity  1.03
overlap      1.01
evidence     1.00
chance       1.00
appetite     0.97
clarity      0.97
oversight    0.95
Activations Density: 0.123%
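
The density figure above is most naturally read as the fraction of token positions on which the feature fires. The page does not state how the figure was computed, so the sketch below is only an illustration under that assumption. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions with a nonzero
    (above-threshold) activation; the source page does not define it.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 123 of which activate the feature.
sample = [0.0] * 99_877 + [1.5] * 123
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.123% means the feature is active on roughly 1 in every 800 tokens, which fits a narrow feature that responds to phrases about scarcity or lack.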