Results from one month of collecting data on myself

Summary: I started collecting data on my habits. This will help me design control phases when I start experimenting with changing my habits. I learned some interesting things about myself, but the largest benefit was learning more about the best ways to measure my life.


On December 19, during my last final of the semester, I started developing a list of health- and happiness-related questions that I could answer better after experimenting on myself. (I also found a few questions that I could answer better by reading more, and did a lot of reading.) Here are a few questions where research is too thin or personal experience too variable for me to answer without collecting personalized data:

  1. How much should I sleep? How do mild amounts of sleep deprivation, like sleeping seven hours a night, affect me?
  2. Would consuming mostly Soylent make me feel worse?
  3. How does the amount of exercise I get affect my mood and energy level? Does it matter what type of exercise it is?
  4. Is polyphasic sleep a good idea?
  5. Would stimulants like caffeine or Modafinil enhance my mood or productivity?

As I wrote these questions down, I noticed two big difficulties in constructing good experiments. First, changing any of my habits might be beneficial in the short term but harmful in the long term. If I start sleeping seven hours a night and feel good for a month, then start getting frequent headaches, can I really blame the sleep change? Surely many other things will have changed in that month. I think the experiments should follow the pattern “control phase – intervention phase – control phase” and, ideally, repeat several times. Because of the time this would take, I think that producing reliable data on long-term effects through self-experimentation is possible but incredibly costly, and I should probably just study interventions that will have an immediate effect with little risk of long-term harm.
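One way to make the control, intervention, control pattern concrete is to lay it out as a dated schedule before starting. The following is only a minimal sketch: the function name `aba_schedule`, the 14-day phase length, the number of repeats, and the start date are all assumptions chosen for illustration, not a plan taken from this post.

```python
from datetime import date, timedelta


def aba_schedule(start, phase_days, repeats):
    """Return (label, first_day, last_day) tuples for alternating
    control / intervention phases, always ending on a control phase."""
    phases = []
    day = start
    # repeats interventions need repeats + 1 control phases around them.
    for i in range(2 * repeats + 1):
        label = "control" if i % 2 == 0 else "intervention"
        last = day + timedelta(days=phase_days - 1)
        phases.append((label, day, last))
        day = last + timedelta(days=1)
    return phases


# Hypothetical start date and phase length, for illustration only.
schedule = aba_schedule(date(2014, 1, 20), phase_days=14, repeats=2)
for label, first, last in schedule:
    print(f"{label:<12} {first} to {last}")
```

Even with short two-week phases, two repeats of the intervention already take ten weeks, which shows how quickly the cost of this design adds up.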

In addition, in order to run a good experiment I need a control that is similar to the treatment phase in every respect except the variable I am studying. Take the question, “How much should I sleep?” I could collect baseline data on variables like my mood, start forcing myself to sleep from midnight to 7 am every day, and see how my mood changes. But having such a regular sleep schedule is quite different from what I do now. The regularity itself could be helpful or harmful. I wouldn’t be answering “How much should I sleep?”, but rather “How does sleeping from midnight to 7 am compare to whatever I was doing before?”  I should instead compare consistently sleeping 7 hours per night to consistently sleeping as much as I tend to sleep now. The problem is, I don’t know how much I sleep now.

I realized that wanting to start experimenting on myself was jumping the gun. I needed good baseline data, just to set up the control phase of an experiment. In addition, I know little enough about my usual habits that I wouldn’t know if they were far out of line with standard medical recommendations. So I kept track of what I ate, what I did, when I slept, how I exercised, my mood, and my mental functioning for most of a month. This had unanticipated benefits: I found out new information about my habits, and I learned which methods of tracking my own data are useful and which need improvement. Except for a couple of online tests, I collected most of my data by writing down what I remembered about my day before I went to bed, which took less than five minutes. Here’s a category-by-category account of what worked, what didn’t, and what I learned.


Sleep

Tracking the amount of sleep I got was the easiest of the measurements I made. I wrote down what time I got into bed, what time I turned the light off after reading, when I woke up, and when I got out of bed. I used an alarm clock on four out of thirty-one nights, but excluding those nights doesn’t change the results much. I spent an average of 7.69 hours asleep (standard deviation 0.97 hours), and, on the nights that I read before going to sleep, an average of 29 minutes reading (standard deviation 17 minutes).
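For anyone who wants to reproduce this kind of summary from a handwritten log, here is a minimal sketch of the arithmetic. The helper `hours_asleep` and the four log entries are made up for illustration and are not the data from this post. The one detail worth getting right is that most nights cross midnight.

```python
from datetime import datetime, timedelta
from statistics import mean, stdev


def hours_asleep(lights_out, wake_up):
    """Hours between two 'HH:MM' clock times, assuming the wake-up
    falls within 24 hours after lights-out."""
    fmt = "%H:%M"
    start = datetime.strptime(lights_out, fmt)
    end = datetime.strptime(wake_up, fmt)
    if end <= start:
        # Wake-up is on the following calendar day.
        end += timedelta(days=1)
    return (end - start).total_seconds() / 3600


# Made-up (lights out, woke up) entries, for illustration only.
log = [("23:30", "07:00"), ("00:15", "08:45"), ("22:50", "06:20"), ("01:00", "07:30")]
durations = [hours_asleep(out, up) for out, up in log]

# stdev() is the sample standard deviation (divides by n - 1).
print(f"mean {mean(durations):.2f} h, sd {stdev(durations):.2f} h")
```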

Surprise one: How much I sleep varies hugely. I wonder whether I actually need more sleep on the nights that I naturally sleep more; would allotting myself eight hours of sleep a night be harmful?

Yeah, the frequencies don't add up right.

The other surprise: I’m in the habit of reading in bed until I feel very sleepy, then turning the light off and dropping right to sleep. I thought this took five or ten minutes. In reality, it took about thirty. This would be justified if reading were displacing time I spent lying awake in bed. If that were true, I would have a longer “lights-out to wake-up” time on nights that I didn’t read. There was actually no difference on that measure between nights I read and nights I didn’t, but the amount I sleep is so variable that this doesn’t mean much.
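To put a rough number on "doesn't mean much", the sketch below estimates how large a difference between reading and non-reading nights would have to be before it stood out from the noise. It uses the 0.97-hour standard deviation reported above, but the split of nights (20 reading, 11 not) is an assumption, as is the use of a normal approximation (z = 1.96) with the same standard deviation in both groups.

```python
from math import sqrt


def diff_margin_minutes(sd_hours, n_a, n_b, z=1.96):
    """Approximate 95% margin of error, in minutes, for the difference
    between two group means that share the same standard deviation."""
    se_hours = sd_hours * sqrt(1 / n_a + 1 / n_b)
    return z * se_hours * 60


# 0.97 h is the reported standard deviation; the 20/11 split of
# reading vs. non-reading nights is assumed for illustration.
margin = diff_margin_minutes(0.97, n_a=20, n_b=11)
print(f"95% margin of error: about {margin:.0f} minutes")
```

Under these assumptions the margin is wider than the roughly thirty minutes spent reading, so a displacement effect of that size could easily go undetected in one month of data.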

Where to go from here: There are lots of questions I would like to answer about my sleep, mostly involving sleep quality. The main problem in running sleep experiments is how to measure sleep quality. Possibilities for measuring sleep quality:

  1. Write down how well-rested I feel when I get up in the morning. Unfortunately, something that makes me sleep more lightly (like leaving my curtain open) might also make me feel less groggy when I wake up.
  2. Track the variables that I actually care about that I think sleep affects, like working memory, mood, sleepiness, productivity, or running speed. Unfortunately, I would expect these variables to be a very noisy measure of sleep quality since they are affected by so much else. In addition, they probably drift over the time span that I would need to run an experiment: If I try sleeping while wearing ear plugs for two months and find that my mood is consistently brighter by the end, couldn’t that be because it is warmer and sunnier outside?
  3. Use something like a Fitbit to track how much I move in my sleep. This would be expensive, but seems like the best option.

Tasks and Food

I wrote down what I got done and what I ate every day in a very rough way. Unsurprisingly, this did not produce data I could analyze. An example entry in “Tasks” is “Ran, delivered veggies, dropped off and picked up prescription, bought groceries, interviewed MIT applicant, wrote interview report.” Two consecutive days of food data are “Oatmeal, frozen mango, sambar, quinoa, lots of bread, sauerkraut, vegan sausage, Ben vegetables, buttered toast” and “Tofu scramble with peppers, frozen strawberries, jar of kimchi, egg drop soup, Ben veggies, Buffalo Brussels sprouts, fruit cake”.  I think I will stop tracking food data until I find a better way to do it. As for tasks, I will go back to my old method of keeping a to-do list and crossing off completed tasks.


Exercise

I wrote down how far I walked, how far I ran, and which Pop Pilates videos I did. I walked for transportation, ran infrequently since my IT band has been bothering me, and spent about as much time on Pop Pilates videos as I did running. I didn’t find any surprises, but now that my IT band is feeling better I’m glad to have an estimate of how much I’ve been running (about 14 miles a week) so that I can increase my mileage slowly and regularly as I resume more regular running.

I meant to do regular tests of my physical ability but didn’t really get around to it. I have some idea of how many consecutive pull-ups or push-ups I can do and how fast I can run a mile, but no time series data.

Mood

Every day before bed, I wrote down how I felt over the day and what I thought influenced my mood. I meant to record my mood using a little game on Moodscope every day, but I only remembered to do so twelve times. Moodscope was somewhat helpful in identifying what affects my mood, but since my mood affects my actions, I don’t want to read much into any correlations. The only variable that I don’t control is whether my boyfriend is around, and I think I’m happier and more productive when he is.

Moodscope consistently surprised me; almost every time I evaluated questions like “How enthusiastic do I feel?” I would think “But this is how I always feel, right?” I was wrong. My mood fluctuates moderately, and without data I can’t tell when I’m having a worse-than-average day.

Cognitive Functioning
I took cognitive tests on Quantified Mind. These tests involved simple but difficult tasks like memorization of words and numbers, judging whether two patterns are the same, and quickly responding to stimuli. I liked them, but the manner in which I took them didn’t generate usable data.

On Quantified Mind you can’t just sign in and start taking tests; you need to either join an experiment that will give you a new battery of tests every session or create your own experiment. I joined the experiment “Time of Day” and had a set of tests that varied every day. Each test (such as number memorization) occurred about three times over the twelve times I used Quantified Mind. Unfortunately, since the tests were different each time, it’s really hard to analyze this data to find out how different variables affected my mental abilities.

I think that in the future I will keep using Quantified Mind, but I will use the “design your own experiment” feature to construct a set of tests that is the same every day.

About adaldrida

I'm a grad student. My interests are diffuse. Recently, I've spent a lot of time thinking about Empirical Bayes, psychiatry, and sports physiology.

3 Responses to Results from one month of collecting data on myself

  1. Ben Kuhn says:

    Maybe this is totally crazy, but I’m not sure you actually need to do rigorously controlled experiments to find this stuff out.

    Like, the idea of rigorously controlled experiments was invented to fix skewed incentives. Scientists were incentivized to set up experiments that showed outcomes they wanted to see instead of outcomes that were true because they capture the benefits (reputation, more funding, etc.) and the scientific community bears the cost (noisy data, harder to make progress). But if the studies are all on “how adaldrida works” then there’s not an incentive problem because the only person who loses if you fail to seek truth is you.

    Now I don’t know if that’s sufficient to get one’s brain to stop making things up, so some degree of methodological rigor still seems appropriate. But I think often you can get away with being less rigorous than having a bunch of control-intervention-control blocks. The important outcome isn’t how rigorous your study is, but whether it convinces *you* when you’re doing your best attempt at truth-seeking.

    – I took a bunch of cognitive tests while attempting to do polyphasic sleep and discovered that (a) they had a huge training effect and (b) they barely picked up any difference after I pulled an all-nighter, so interpret with caution.
    – You probably already know this but Gwern is really good at self-study design:
    – For tasks, a subjective rating of how productive your day is might be a good measure that’s also mathematically tractable. It’ll be biased, but (as per above) when you’re not doing Real Science, being biased doesn’t mean you’re not allowed to draw any conclusions from it at all.
    – For sleep, if you have a smartphone, there are apps that will track how much you move (though I’m not sure how good they are, especially with significant others in the bed). The next step up is a Zeo, which is expensive but maybe worth it (you’ll have to make your own headbands but this is doable).

  2. adaldrida says:

    Hmm. I think there are still incentive problems: Mainly, the drive to find something cool, or find something at all. To me, looking at data on myself and seeing if there are any interesting discontinuities feels a lot like looking at economic data and hoping there’s something interesting in it. Also, without a clean experimental design and pre-specified analysis plan, it’s hard to know if a result that looks big is just what I should have expected to happen by chance after looking at X outcome variables (since I can’t tell what X is).

    Also, the n=1 problem really amplifies the need to take clean, unbiased measurements and repeat many times. Say I wanted to measure “Does marathon training make me sleep more?” My measurement of “how much do I sleep on average?” would have a margin of error of about 22 minutes (plus and minus, at 95% confidence) over a one-month measurement period. I would need to try something that had a really big effect to have a reasonable chance of knowing that the outcome I measured wasn’t due to chance. Of course I can run little experiments and update my beliefs a little bit, but it’s only worth running the experiment if I expect a positive result to change my actions as well as my beliefs.

    – Thanks for the tip on the cognitive tests. I wonder how much they’ve been checked for external validity? Maybe I can try to record something like “percent of lecture I paid attention to” instead.

    – I did not know about Gwern’s website. Thanks!

    – No smartphone.

  3. Pingback: 143 days of training for the Wapack | blog
