Saturday, August 8, 2015

Overpowered reward mechanisms and the rat race

During my final semester of college, I spent some time reflecting on my mood: How do I feel, day to day? What thoughts / events / interactions trigger which moods?


Reward Mechanisms


I found that my mood is regulated in large part by what I'll call "short-term reward mechanisms". Each reward mechanism is a kind of rule-enforcer (e.g. "turn in your problem sets on time"); when I follow the rule, I feel good, and when I break the rule I feel bad. I have many different reward mechanisms that correspond to many different rules; most of them have to do with being "productive" in some domain.

To first order, these short-term reward mechanisms are really useful to me. They translate my long-term goals--which are sometimes too complicated to be emotionally salient--into bite-sized pieces that I can feel without taking the time to reflect deeply. My reward mechanisms constitute an internal incentive system that keeps me on track toward doing the things I care about.

However, they also come with a host of more subtle drawbacks (in order of increasing problematic-ness):
  1. I can approximate what I really care about using a handful of binary rules, but the simple structure necessarily misses out on some nuances. So I sometimes feel compelled to do things which don't make sense, like working on an end-of-semester problem set that will improve neither my understanding nor my grade.
  2. It's often easier to measure the quality of the outcomes my actions cause than it is to measure the quality of my actions themselves (e.g. when randomness is involved). As a result, I tend to build rules that reward me for outcomes rather than effort. This can make me emotionally responsible for things outside of my control, which I dislike.
  3. Sometimes, I can get caught in downward spirals. First, I am unproductive and a reward mechanism makes me feel bad about it. Second, my bad mood promotes moping rather than being productive. Repeat for a few hours or days until something shakes me out of it.
  4. Due to my reward mechanisms, I don't need to reflect on my real values in order to feel motivated to act toward them. As a result, I do less reflection, and sometimes forget why I'm doing what I'm doing. Following rules begins to seem like a hollow goal; I can feel like I'm just stuck in the rat race. That's probably not conducive to achieving my goals in the long term.
  5. I fundamentally value the quality of my day-to-day internal experience, so it's too bad that my mood is tied up regulating my motivation. I would prefer to have a source of motivation that operates independently of my mood (at least most of the time), so I can be both motivated and happy (and whatever other moods).


How this could get better


Changing the time scale


How often should I pause, evaluate my actions and their consequences, and dole out emotional rewards or punishments to myself?

Suppose I were able to decrease the frequency with which I checked up on myself--say, to a few times a week. I think this would reduce some of my grievances against reward mechanisms. First, it would eliminate the risk of downward spirals, since the productivity-inhibiting mopey-ness induced by emotional "punishments" only lasts a few hours. Second, by bringing my reward mechanisms' time scale closer to the time scale of my fundamental values, I would be able to use rules which were better approximations to what I really care about. Third, having short-term rules more similar to my long-term goals could promote better awareness of my reasons for doing things.

Unfortunately, I don't think I can just decrease the frequency with which I apply my reward mechanisms. It is true that sometimes my application of reward mechanisms is triggered by specific events (e.g. getting a test back) or specific time markers (e.g. before bed each night); I could change the frequency of these applications. However, reward mechanisms also implicitly affect my mood because I anticipate their explicit application later on. For example, putting off a problem set tends to make me worry and feel guilty, not because there is a rule that says "don't put off problem sets", but because there is a rule that says, "turn in your problem sets on time" and I realize that putting off problem sets will cause bad feelings later. Just as my reward mechanisms are short-term reinforcers of my ultimate goals, my guilt-and-worrying feelings are instantaneous reinforcers of reward mechanisms. Even if I decide to explicitly feel good or bad less often, my anticipation of these feelings will still exist, and the anticipation comprises most of the effect anyways.

Making it less zero-sum


Earlier I said, "when I follow the rule, I feel good, and when I break the rule I feel bad." What if I just added a constant amount of good feelings on top of this, so that I feel great when I follow rules and neutral otherwise?

Bad news again: I think I'm more motivated to avoid feeling bad than I am to seek out feeling good. So if my reward mechanisms were such that I always felt at least neutral, they probably wouldn't motivate me very well. This is something for me to work on changing in the long term, but for now, it means I can't just adjust where zero is on my mood scale.

A more structural change


What if--rather than just shifting my reward mechanisms to have a better average mood or a longer time scale--I could just decouple my mood from my productivity, without ceasing to be productive? At first, this seems impossible; if I don't care about how I'm doing, why would I act?

I think insufficiently nuanced language makes it hard to see this clearly. The problem seems to be with the word "utility". The colloquial meaning of "utility" is a hedonistic one: something gives me hedonistic-utility in proportion to how much it makes me feel happy (or fulfilled, or meaning-feeling, etc., choose whatever variant). But "utility" has a different meaning in microeconomic theory: rational agents make the choices that give them the highest choice-utility. Choice-utility and hedonistic-utility need not be equal.

For example, it would be unusual, but not irrational, for someone to take unhappiness as a goal. In this case, rational goal-pursuit would lead to low hedonistic-utility, but it would still be rational, and therefore choice-utility maximizing by definition. Similarly, it would be unusual, but not irrational, if I were to set more human goals (e.g. learning, helping others, having meaningful personal experiences) and then pursue them without allowing my progress to affect my mood. At least in theory, I could just choose to be happy, and I could also pursue my goals.

I'm a little bit embarrassed by just how foreign this sounds to me; alas, I've been brought up in a Judeo-Christian society. But foreign or not, it sounds pretty great--especially when we consider that I could change my default from time to time (I wouldn't always have to be happy, per se).

Realistically, I'll probably never get to the point where successes or failures have no effect on my mood. Rather, the best outcome I can hope for is that my mood will be centered on the mood I choose for myself, with small deviations due to happiness at success or frustration at failure. In that sense, it would be like the original mechanism plus a constant, as discussed in the previous subsection, except that now the original mechanism component would carry less weight.
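To make the "original mechanism plus a constant, with less weight" picture concrete, here is a toy sketch. It is only an illustration under assumptions that are not part of the argument above: that mood can be put on a single number line, that following a rule scores +1 and breaking one scores -1, and that the pieces combine linearly. The function `mood`, its parameters, and all the numbers are hypothetical.

```python
def mood(chosen_baseline: float, reward_signal: float, weight: float = 1.0) -> float:
    """Toy model: mood = chosen baseline + weight * reward-mechanism signal."""
    return chosen_baseline + weight * reward_signal


# Hypothetical scores: +1 for following a rule, -1 for breaking it.
FOLLOWED, BROKE = 1.0, -1.0

# Original mechanism: no chosen baseline, full weight on the reward signal.
original = [mood(0.0, s) for s in (FOLLOWED, BROKE)]

# Zero-shifting: the same mechanism plus a constant.
shifted = [mood(1.0, s) for s in (FOLLOWED, BROKE)]

# Partial decoupling: a chosen baseline, with the reward signal carrying less weight.
decoupled = [mood(1.0, s, weight=0.25) for s in (FOLLOWED, BROKE)]

print(original, shifted, decoupled)
```

In this toy model, zero-shifting moves both outcomes up by the same amount, while partial decoupling also shrinks the gap between them, so the mood stays close to the chosen baseline either way.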

The relevance of this connection--between total mood decoupling and zero-shifting--is that I will face a similar difficulty taking either approach. Both require me to get better at letting myself be happy, independent of my productivity day-to-day. I'm not great at that yet. This is important, problematic, and unfortunately pretty typical; it's a big thing for me to work on moving forward.


A side observation on reward mechanisms


I think that most people I know have some of these short-term reward mechanisms; we all use our mood as an internal incentive system to encourage taking small steps toward our greater goals. For now, let's suppose that I'm correct about that. Then we are able to explain the following oft-observed emotional oddity: It's much easier to get emotional about the things in life that we work for than it is to appreciate the things we'll have whether we work or not, even if the latter contribute more to our wellbeing (compare how much time you spend feeling happy about getting good grades to how much time you spend feeling happy that you have access to the internet).

When we think about this in terms of reward mechanisms, it makes sense. We attach our emotions to the topics where we need internal incentives in order to get what we want. So we become accustomed to being emotional about things that we have to work to achieve, and we become unaccustomed to being emotional about things that we'll have whatever we do. In the extreme, we only have feelings about a subject when those feelings will help us get what we want; if you'll get it anyway, there's no incentive-reason to let it affect your mood. What an unpleasant picture!


Summary


  • My mood is significantly affected by "short-term reward mechanisms".
    • These give me emotionally-salient internal incentives to work toward my long-term goals.
    • They also have some downsides: downward mood spirals, losing track of what I care about, and wasting my mood on behavioral regulation.
  • It would be cool (and in theory possible) to have a radically different reward system.
    • I think I would prefer to choose my mood independently of recent progress toward personal goals.
    • This is not a contradiction; hedonistic-utility and choice-utility are different.
    • In order to make this change, I would need to get better at allowing myself to be unconditionally happy.
  • Reward mechanisms explain why it's harder to feel emotional about things that require little effort: we mostly create emotional rewards to incentivize actions that are hard to do, so we are unaccustomed to being emotional about subjects where we get what we want without much effort.


1 comment:

  1. I like this article and I think you're right that most people have these short-term reward mechanisms. I especially believe this is true for highly motivated people in competitive environments (e.g. not being satisfied with what you have already accomplished and needing to move on to the next thing). My goal for this has been to find a way to divorce my own happiness and sense of self-worth from my short-term evaluations of what I am doing/accomplishing. And I think a lot of that comes from setting new baseline goals.

    For example, I know I'm happiest whenever I help other people and get to serve them. Now the chosen path I've taken to reach this goal is through medicine, so I'll stick with this example. In my journey, I could be upset if I get low test scores, or if I feel like I am not doing enough research, or if I don't get into the top residency program, etc. All of those short-term goals are in line with wanting to be a doctor, true, but the mindset is more aligned with wanting to be the best and most successful doctor. And when I don't accomplish those goals, I lose sight of why I even wanted to get into medicine in the first place.

    Now if I could ground my actions and have a baseline to operate from, I could mitigate my negative self evaluations. For example, I am thankful I even am in a position to learn and be on this journey. I am thankful for my family support and that I am a source of pride and a role model for my brothers (they don't care what grades I get). I am thankful for just getting to experience each day really.


    I'm learning how to just "be" and appreciate what I already have, which is a lot. I mean I am really privileged compared to most people in this world.
    From this perspective, the day to day goal accomplishment doesn't matter as much. Every incremental goal has the potential to be good, but I have a strong foundation to move out from (well I'm still working on that lol oh I'm taking up yoga soon that should be fun).

    This comment is a little disjointed, I apologize for that. But that's what I had to say!
