the first time we met, you told me how the rest of your life would play out. align the current language models (make sure they behave in line with human values), then use them to supervise the AGI, then use the AGI to build superintelligence. you declared that the purpose of your life was alignment research
oh, i thought to myself, i wonder why you’re talking to me then
i also worked on AI research, but in a different subfield. my job was to develop more efficient pipelining strategies for training models. i spent my days staring at GPU pipeline diagrams and thinking about how to transfer data around the network as efficiently as possible. on good weeks i thought it was the coolest job in the world - at its core it was just one massive scheduling puzzle, the most complex one i’d ever encountered, and i enjoyed the occasional detours into linear programs and graph theory and discrete math
on bad weeks i’d implement new pipeline optimizations and debug them for days on end, only to realize afterwards that the GPU utilization hadn’t budged at all. surely the point of my life isn’t to tell GPUs what order to talk to each other in, i’d think to myself, and i’d wonder if i should get a different job. of course i never did. instead i told myself the work i was doing was important, that by working on efficiency i was advancing the rate of scientific progress while also helping the environment
i really did believe in efficiency, but there were so many layers of abstraction between my day-to-day work and any kind of real-world impact. it felt like tying a string to a rope to a boulder - maybe the boulder would budge if i pulled on the string as hard as i could, but it seemed more likely the string would snap and i would drive myself into the ground
why should we build superintelligence? i wondered. people in silicon valley often threw around terms like abundance and post-scarcity, but i didn’t understand the appeal. modern-day life already seemed pretty good to me
“that’s an absurd position,” you exclaimed. “i’m sure you could come up with a hundred things missing from your life if you tried”
i thought for a moment. “not really? i mean, it’s pretty cool that we have coherent chatbots and self-driving taxis now. but i really don’t know what i’d use superintelligence for if someone handed it to me”
“fine, think about it from a different perspective. do you have any favorite scifi stories?” you asked
“hmm… probably interstellar,” i replied
“i remember watching that in middle school,” you said, “and realizing for the first time, life on earth is so fragile. we can’t be satisfied with the present because one day we’ll encounter problems beyond the scope of present-day technologies”
you began telling me about how we needed dyson spheres and brain-computer interfaces and new laws of physics, and how superintelligence was our best shot at getting there, and how we could go about building it. i stopped paying attention a few minutes in as i became lost in my own thoughts. i don’t have what it takes to be a good researcher, i realized
it felt so obvious once i met you: all my ideas were hopelessly incremental. pipeline efficiency was the type of well-defined metric that anyone could understand, and i was stuck optimizing it not because i loved it but because i couldn’t come up with a better problem to solve. really great ideas, the kind that created new paradigms or made complex theories simple, required a wild and primitive imagination to find; you could never do it unless you were willing to abandon everything you were accustomed to believing. i knew then that my head was far too close to the ground, my mind too full of the ordinary and mundane, and i would only ever amount to a disciple in the midst of prophets
every fifteen minutes your watch would vibrate, after which you’d say a few words about whatever you were currently doing. the first three times this happened i thought it was some kind of alarm, and eventually i asked if you had somewhere else to be
“that’s not an alarm,” you said, “that’s my time tracker.” you pulled out your phone and showed me how a transcription app would activate every fifteen minutes and categorize all your activities in a spreadsheet. the totals for the current week were: work, 80h; sleep, 50h; social, 10h; meditation and journaling, 8h; gym, 7h; chores and maintenance, 7h; commute, 6h. i couldn’t help but notice the exact same totals for the previous two weeks
“are you worried at all about burnout?” i asked
“nobody ever gets burned out from working too hard,” you declared. “burnout comes from working towards a cause you no longer believe in, or from otherwise doing something you don’t really want to do”
“how do you stick to such a consistent schedule though? like, sometimes i just have bad days where i wake up and feel so unmotivat—”
“that’s why i’ve blocked out time for meditation and journaling! i do feel worse than normal sometimes. when that happens i home in on the feeling and figure out where it comes from, and then i go resolve it quickly”
“what does it mean to resolve a feeling quickly?”
“it’s about transforming the feeling from something out of my control to something within my control. for example: yesterday i woke up not wanting to work. step one: i noticed the feeling was caused by insecurity about doing a refactor that i didn’t think i was capable of, as well as dread about having to interact with a coworker who speaks in a manner i find annoying. step two: i made a plan for breaking down the refactor into small pieces and figured out who i’d ask for help if i got stuck on each piece, and i also had a chat with my coworker about how we could communicate more effectively. feelings resolved!”
i tried my best to understand what you meant, but my feelings looked nothing like how you described yours. i felt so uncertain all the time, my thoughts all tangled up with each other and impossible to separate. sometimes my emotions lived at the edge of my consciousness for months, lurking in half-remembered dreams and phrases on the tip of my tongue, and i had to hunt them down like wild animals to make sense of what they were
i had no idea what you wanted from me when we first started dating. we spent most of our time coworking in cafes; you would get lost in thought for hours on end, thinking about how to train classifiers to detect harmful model outputs. meanwhile i tried not to break the spell by fidgeting or looking out the window too often. my brain wasn’t really capable of doing more than forty hours a week of focused work, but i pretended to work as much as you anyway
once in a while i would ask if you wanted to go outside; you almost never agreed. the first time you said yes, i drove us west the entire length of the city, to lands end. we climbed down old bathhouse ruins and watched seagulls glide overhead in perfect “v” formations. you looked like a child as you tried to get as close to the ocean as possible without getting soaked, rushing towards the water every time it receded and scrambling backwards every time it surged
i watched the waves wash away sand castles and pebbles and debris, and they also appeared to be washing away years of overwork and poor posture and tunnel vision. you seemed happy; happier than whenever you talked about superintelligence, happier than whenever you read an exciting paper, happier than that time you figured out how to make your classifiers twice as fast
you said that the purpose of your life was your research. was it my job to keep you company as you carried it out? or was it my job to set you free?
you became busier. there were always two-week sprints of safety-testing work before every new model release, but model releases used to be every six months and now they were coming more often. most days i only saw you on calls before bed, calls that always managed to leave me both overjoyed and hurt, calls that i spent fixated on your face hoping you would do the same for me
of course you never did. you were smart enough and fast enough to multitask unhindered during conversations, but there were other signs - the way you constantly tilted your head and adjusted your hair (you, looking at your own camera feed and not mine), the subtle shifts in light and shadow on your cheeks (you, switching to another tab), the telltale left-right-left-right motions of your irises (you, reading something, probably slack or arxiv). did you think i couldn’t notice? i have always been too perceptive and too sensitive for my own good
i thought about asking you to change, but what would be the point? i craved the single-mindedness of your life, the way you could pinpoint exactly what you were. there was no way of lessening the pain without also ruining what i admired most about you
once upon a time i believed that the margins were large enough to build an entire life in, that even if i didn’t like the main plot of the book i was part of i could still live happily in its footnotes. i could cram all my interests into the leisure hours between a job i didn’t care for and bedtime, i could squeeze all my love into the rare occasions where my partner was present in between long periods of unavailability, i could be satisfied with sparse bursts of joy across an otherwise desolate timeline
where did it come from, all this self-erasure and self-minimization and self-sacrifice? did i learn it from a toxic first relationship? an oppressive upbringing? long hours at work? or was it the natural result of growing up, of experiencing failure and seeing life turn out differently than i’d hoped? how was it that when i was a kid i only ever thought about getting another piece of candy, another gold medal, another five minutes of recess - and now somehow i was obsessed with holding onto the smallest possible space i could survive in?
“i assume you’ve already heard the news,” you said
i nodded. your team had conducted safety testing on a new agent product without finding anything problematic, but within the first week of launch thousands of people had fallen in love with the agent. to make things worse, some obscure ten-person research lab had conducted their own audit and elicited the same behavior mere hours after the release, demonstrating that it could have been caught. there was a round of damage control and rollbacks, and now you were taking your first break in weeks, sleep-deprived and noticeably thinner. for the first time ever i thought you looked defeated
“okay,” i said, “you need to go touch grass.” i dragged you into the car and drove us to the castro, where we got out and began the fifteen-minute walk up to corona heights park
“so how did you guys not find anything?” i asked
“there were many reasons,” you replied. “one: our classifiers can never tell us about false negatives. two: we didn’t know to look for this specific failure mode, so we never designed an evaluation for it. three: unknown unknowns are difficult to account for. but the biggest reason is that the rest of the company didn’t want us to find anything, because that would delay the release”
shouldn’t that have been obvious from the beginning? i wondered. the tension between releasing AI products and testing them thoroughly had always been clear, even to outsiders
we hiked up the park in silence and reached the summit at my favorite time of day: the late afternoon fog had just begun creeping in from the west but was not yet thick enough to obstruct my vision. i loved that panoramic view of the city from 500 feet up - enough distance to make things feel small, but not so much that they became unfamiliar - and as i took the view in somehow i felt ready for whatever was to come
“sometimes i’m not sure if you actually want the things you say you want,” i blurted out
“i don’t know what that means,” you said
“like all this superintelligence stuff,” i said. “i’m not sure if you actually care about it as much as you say you do.” i knew it was a shitty thing to do, telling you that your priorities weren’t straight when there had just been a meltdown at work, but it was a minor miracle that i’d gathered the courage to get to this point and i had to use it while i could
“how could you say that? you know how hard i’ve been working”
“working a lot doesn’t mean you care though. it just means you’re forcing yourself to wo—”
“of course i’m doing that because i care”
“it doesn’t seem to ever bring you joy though. or if it does then it’s not in a way that i can recognize? sometimes i think i don’t understand you at all”
“really, in what ways?” you asked
in what ways? there were so many that i didn’t know where to begin. i don’t understand how you can be so self-aware yet so blind, i wanted to scream. how you can dedicate your life towards a goal without realizing that your work is not actually an effective way of accomplishing that goal. how you can spend so much time on meditation and observing your feelings and breaking them down and still be so unaware of how you make the people around you feel. i knew i wouldn’t be able to finish that third sentence without crying though, so i didn’t say any of them
“i guess i don’t believe you understand your feelings as well as you say you do,” i said instead. “you’re really smart, but the problem is that smart people can rationalize anything, you know? you can convince yourself of feelings or desires that aren’t really there. like, you always seem so self-assured in a way that doesn’t feel real to me”
“that’s because i know what i want,” you snapped. “it doesn’t feel real to you because you’ve never been sure of what you want”
“i don’t think that’s it,” i said, more out of defensiveness than anything else
on one hand, i knew you were probably right. on the other hand, i didn’t really believe that i could feel so much fear and confusion while someone else my age could have it all figured out and feel none of those things
“you can want things for reasons that aren’t genuine,” i continued. “like, maybe you’re scared of feeling insignificant, and everyone says AI is the next big thing so working on it makes you feel important. or maybe you’re scared of feeling uncertain about what to do, and working on something important is a way to avoid thinking about that”
i was just spitballing, but as i said those words i realized they might actually be true. oh my god, i thought, what if you’re actually just a scared kid? maybe i’d simply idolized you too deeply to understand that earlier
over the next few weeks i tried to interact with you exactly as i’d done before, but my mind kept getting in the way. we would cowork at our favorite cafes, but in your focus i would only see self-deception. you would open up about your feelings, but instead of empathizing i would only notice your ignorance. you would talk about your ideas for new technologies, but when i listened all i heard was your own delusion
had i ruined everything? i felt like it was my fault for overthinking things, the same way i sometimes stared too closely at a piece of art until the beauty had faded from recognition and all i could perceive were details and flaws. but what could i do about it, now that i’d already reached this point? what could anyone do, once their mind had ventured out beyond the confines it was supposed to remain in, like a curse escaping from pandora’s box? i couldn’t stand the sight of you with your best parts turned upside-down, and i would never be able to unsee that. there was nothing to do but to part ways. i’d always struggled with finding anything in between adoration and indifference
afterwards i started catching glimpses of you in everyone around me, and i became sick of the city. there were many people here who genuinely believed in what they were doing, and i respected them immensely; but there were far more people who needed something to believe in and caught wind of whatever religion was spreading at the time and never looked back afterwards. i worried about ending up like that - swept up in the dreams of other people, locked into the orbit of ideas that were alien to me
i considered cutting out everything AI-related from my life. deactivate twitter, stop going to launch parties, avoid the street with the “Ava won't come into work hungover. Stop hiring humans!” billboard. of course i could never actually bring myself to do it. there were nuggets of truth behind every religion in silicon valley, whether it was “move fast and break things” or web3 evangelism or superintelligence, and all of them would reshape the world sooner or later. avoiding them just meant running away from the future until it had enveloped everything and there was nowhere left to hide; it was better to remain in the arena and find a way to survive without losing my mind
thanks to alicia, cj, claire, laura, sonnet4 for feedback on drafts <3 first time actually writing fiction! narrator’s views are not necessarily my own