The basic problem is that GPT generates easy poetry, and the authors were comparing to difficult human poets like Walt Whitman and Emily Dickinson, and rating using a bunch of people who don't particularly like poetry. If you actually do like poetry, comparing PlathGPT to Sylvia Plath is so offensive as to be misanthropic: the human Sylvia Plath is objectively better by any possible honest measure. I don't think the authors have any interest in being honest: ignorance is no excuse for something like this.
Given the audience, it would be much more honest if the authors compared with popular poets, like Robert Frost, whose work is sophisticated but approachable.
Huh, that sounds a little like claiming that AI can draw pictures just as well as humans because they look realistic at a first glance. But not if you check whether the text, repetitive elements, and partially-occluded objects in the background look correct.
The more basic problem is that their methodology would conclude Harry Potter is better than Ulysses, AC/DC is better than Carla Frey, etc etc. It is completely fine to enjoy "dumb" art - I like Marvel comics and a lot of the Disney-era Star Wars novels have been pretty fun. But using easiness and fun as a metric of quality is simply celebrating ignorance and laziness.
Poetry is for the enjoyment and enlightenment of the reader. Who reads poems? The question is a little vague. Who reads poems on a birthday card? Everyone. In a children's book? Parents and children.
But this study focuses on literary poems specifically, using works by literary heavyweights like Chaucer, Shakespeare, Dickinson, Whitman, Byron, Ginsberg, and so on. So the question is who reads literary poems, to which the answer is: academics, literary writers, and a vanishingly small population of readers who enjoy such pursuits.
So the test of whether ersatz AI poetry is as good as "real" poetry should be whether the target audience (i.e., academics) finds the poems to be enjoyable or enlightening. But this study does not test that hypothesis.
This study tests a different hypothesis: can lay people who are not generally interested in literary poetry distinguish between real and AI literary poetry?
The hypothesis and paper feel kind of like a potshot at literary poetry to me, or at least I don't understand why this particular question is interesting or worthy of scientific inquiry.
The vast majority of people seem to prefer Avengers 17 over any cinematic masterpiece, the latest Drake song would be rated better than anything by Tchaikovsky... We should let them play and worship ChatGPT if that's what they want to waste their time on.
I don't understand the logic of calling superhero movies lesser/unserious like this; it's very snobby. Movies and music are made to be entertaining, and the Avengers is more entertaining than your "arthouse cinematic masterpiece that nobody likes but it's just because they aren't smart enough to understand it". It's also lazy and ignorant to ignore the sheer manpower that goes into making a movie like that.
I don’t fully agree with putting down “fun” movies like the Avengers, but at the same time “serious” art is not primarily for plain entertainment.
People might find “serious” art meaningful and it might spark feelings in them, but that’s not the same as getting an adrenaline rush from exploding cars in an action scene.
Of course there are also cases where the boundary between “fun” and “serious art” is not so clear, there are always exceptions to any attempt to define what makes something “serious art”. Art can also be subversive and run counter to traditional expectations of what art “should” be. But I don’t think the Avengers is an example of that.
Movies, music, writing, all human arts, are made to make their audience feel something. "Entertaining" is only a small and honestly ill-defined subset of this, no more valid than any other approach.
what a ridiculous comparison. Of course a superhero movie is more entertaining than a film that is explicitly designed to avoid mass appeal.
The nuance is that movies today are not where the most creative talent is directed anymore. The shift started with prestige TV taking off in the 2000s, and episodic content on streaming services surpassing film as a mass-market artform in the 2010s, with the pandemic driving the final nail into the coffin.
I loved the late 2000s / early 2010s superhero movies. Spider-Man, The Dark Knight, Iron Man, etc. These were great films. Today, the MCU is just eating its own tail with the most bland, repetitive crap. It's all designed to incentivize the same die-hard fans to keep forking over their hard-earned cash with all the cross-film teasers and the need to watch every film to understand all the references and moving parts. I understand the business model—it's actually the same as comic books now—because people don't casually go in to see random movies anymore, they do that at home on Netflix, so they have to target the repeat viewers. It's visually impressive, and the acting is good enough to keep a relatively large subset of the population coming back, but for someone like me who wants at least a little bit of novelty or creativity in the plot or characters, it's just become so mind-bogglingly boring.
> your "arthouse cinematic masterpiece that nobody likes
You're reading way too much into my comment. Any blockbuster from the 80s/90s absolutely shits on 90% of blockbusters released today. I'm not talking about obscure 1950s Czechoslovak cinema here...
> ignore the sheer manpower that goes into making a movie like that.
A lot of work doesn't make something good, especially when CGI quality actually gets worse year after year. FYI the entire LOTR trilogy had 30% less budget and 4x the runtime of the last Avengers movie... And they actually filmed things outside of a Hollywood studio.
The only lazy ones here are the scenarists and the directors shitting out the blandest movies ever.
But then again, if all we care about is raw entertainment then sure, it's perfect: very easy to digest, lots of colors and not too much to think about, the cinematic equivalent of fast food. You can even buy Avengers-branded toilet paper and bottled water, that really shows how much they care about movies!
Well said. There's tons of blockbusters and other popular movies from the 80s/90s that were absolutely made for the "masses", but were genuinely great films, and far better than almost any blockbuster from the last 5-10 years, especially all the comic-book stuff. Alien(s), Back to the Future trilogy, Terminator 1/2, Ghostbusters, Beetlejuice, I could go on and on. And of course the LotR trilogy if you look at the early 2000s. Movies just aren't as innovative or risky these days; something as quirky as Ghostbusters wouldn't be made now (but Hollywood is happy to make remakes and sequels of that franchise now, 40 years later).
Film is such a nascent art form. The 90s as “peak blockbuster action” is a valid stance on taste but hard to defend as superior to all that came after. Christopher Nolan’s Dark Knight is leagues away from the 90s Batman, as an auteur-friendly and obvious comparison. Pixar is another example on the animation front.
There have been great films made in every era, but tighter writing, more legible and compelling action, and emotionally impactful storytelling are all strongly trending upward overall.
And nothing will ever top the merchandising mania of the 80s!
I hope you're referring to Joel Schumacher's kitschy drivel, and not to Tim Burton's masterpieces (both of which are IMO vastly superior to Nolan's take on the subject).
This reminded me so much of Spaceballs! And the yogurt merchandise towards the end! Such a great movie that has so many obvious "flaws" like the mirror under the speeder on the desert planet when they comb the desert. And yet I've actually watched that movie more often than even the actual real Star Wars movies (meaning the first three made - all of which are timeless awesomeness)
> Any blockbuster from the 80s/90s absolutely shits on 90% of blockbusters released today
You sure it's not survivorship bias? As in, you're only thinking of and remembering the good ones from a two-decade period and comparing them against the movies that came out this year, when in reality there might have been tons of blockbusters in those eras that were just as bad as your average one today?
There is more to art than entertainment. For example Oedipus Rex [1] - distinctly not entertaining; but art, and powerful in an incomparable way, anyway.
_____________
[1] Don't look it up if you don't know what that is.
In your opinion, perhaps. Other films are made to be provocative-- to make you think or reflect. Certainly, a lot of the "arthouse cinematic masterpieces" aim for that as a goal rather than purely entertainment.
You're arguing against a strawman here... nobody is saying making an avengers movie is low effort. Certain aspects of an avengers movie though require less effort.
Movies and music are usually made to be entertaining, but sometimes they're made as an artistic outlet for the creator.
I was listening to Schoenberg's "Suite for Piano" the other day. Did he make it to be entertaining? I don't know, interesting maybe. I wouldn't put it on at a party.
It's true that snobbery is off-putting, but if you're looking for artistic merit, then some works last longer than others. If you're looking for something to enjoy with your popcorn, then there's that too.
> The basic problem is that GPT generates easy poetry
I was going to come in here and say this. I'll even make the claim that GPT and LLMs __cannot write poetry__.
Of course, this depends on what we mean by "poetry." Art is so hard to define, we might as well call it ineffable. Maybe Justice Potter Stewart said it best: "I know it when I see it." And I think most artists would agree with this, because the point is to evoke emotion. It is why you might take a nice picture and put it up on a wall in your house, but no one would ever put it in a museum. But good art should make you stop, take some time to think, figure out what's important to you.
The art that is notable is not something you simply hang on a wall and get good feelings from when you glance at it. It is deep. It requires processing. This is purposeful. A feature, not a bug. It is filled with cultural rhetoric and commentary. Did you ever ask why you are no Dorothea Lange? Why your photos aren't as meaningful as Alfred Eisenstaedt's? Clearly there's something happening here, but what it is ain't exactly clear.
Let me give a very recent example. Here[0] is an amicus brief that The Onion (yes, that Onion, the one that bought InfoWars) wrote to the Supreme Court. It is full of satire while arguing that satire cannot be outlawed. It is __not__ intended to be read at a glance. In fact, they even specifically say so:
> (“[T]he very nature of parody . . . is to catch the reader off guard at first glance, after which the ‘victim’ recognizes that the joke is on him to the extent that it caught him unaware.”).
That parody only works if one is able to be fooled. You can find the author explaining it more here[1].
But we're coders, not lawyers. So maybe a better analogy is what makes "beautiful code." It sure as fuck is not aesthetically pleasing. Tell me what about this code is aesthetically pleasing and easy to understand?
float InvSqrt(float x){
    float xhalf = 0.5f * x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i >> 1);
    x = *(float*)&i;
    x = x*(1.5f - xhalf*x*x);
    return x;
}
It requires people writing explanations![2] Yet, I'd call this code BEAUTIFUL. A work of art. I'd call you a liar or a wizard if you truly could understand this code at a glance.
I specifically bring this up because there's a lot of sentiment around here that "you don't need to write pretty code, just working code." When in fact, the reality is that the two are one and the same. The code is pretty __because__ it works. The code is a masterpiece because it solves issues you probably didn't even know existed! There's this talk as if there's a bifurcation between "those who __like__ to write code and those who use it to get things done," or between those who think code should be pretty and those who think code should just work. I promise you, everyone in the former group is deeply concerned with making things work. And I'll tell you now, you've been sold a lie. Code is not supposed to be a Lovecraftian creature made of spaghetti and duct tape. You should kill it. It doesn't want to live. You are the Frankenstein of the story.
To see the beauty in the code, you have to sit and stare at it. Parse it. Contemplate it. Ask yourself why each decision is being made. There is so much depth to this, and its writing is a literal demonstration of how well Carmack understands every part of the computer: the language, how the memory is handled, how the CPU operations function at a low level, etc.
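If it helps with that sitting and staring, here is my own rough annotation of what each step is doing (one reading of it, not an authoritative derivation):

    /* The same routine again, annotated (my reading of it): */
    float InvSqrt(float x){
        float xhalf = 0.5f * x;        /* stash x/2 for the Newton step at the end */
        int i = *(int*)&x;             /* reinterpret the float's bit pattern as an integer */
        i = 0x5f3759df - (i >> 1);     /* the magic constant minus half the bits: a float's
                                          bit pattern behaves roughly like a scaled, offset
                                          log2 of its value, so this is a crude first guess
                                          at 1/sqrt(x) */
        x = *(float*)&i;               /* reinterpret the adjusted bits back into a float */
        x = x*(1.5f - xhalf*x*x);      /* one Newton-Raphson iteration to refine the guess */
        return x;
    }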
I truly feel that we are under attack. I don't know about you, but I do not want to go gentle into that good night. Slow down, you move too fast, you got to make the morning last. It's easy to say not today, I got a lot to do, but then you'll grow up to be just like your dad.
I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made. They make it clear in the paper that they're specifically evaluating people who aren't especially interested in poetry, and talk at length about how and why this is different from other approaches. I suppose the clickbait title gives a bad first impression.
To summarize the facts: when people were asked to tell whether a given poem was written by a human or an AI, they judged the AI poems to be human-written more often than the actual human poems. People also tended to rate a given poem higher when they thought it was human-made than when they thought it was AI-created. It's speculated in the paper that this is because the AI poems tended to be more accessible and direct than the human poems selected, and the preference for this style from non-experts, combined with the perception that AI poetry is poor, led to the results.
The selection of human poets is cooked to give the result they wanted. I will grant the authors may have lied to themselves. But I don't think honest scientists would have ever constructed a study like this. It is comparing human avant garde jazz to AI dance music and concluding that "AI music" is more danceable than "human music", without including human dance music! It's just infuriating.
They expressly state the result is likely because the AI poetry was simpler and more direct than the poetry selected, which is more accessible for the average person not interested in poetry. They compare and contrast this with other studies where this was not the case.
Yes, it's comparing apples and oranges; that's the whole point. It doesn't make the experiment itself flawed.
Hmm, but shouldn't it have compared against human poems that go for a similar style? Otherwise it doesn't tell us much, except that the AI was not able to make more complex poems? And maybe that people who don't like poetry prefer simpler poems when asked?
It seems to me that the whole study was intended to manufacture a result to grab headlines. Scientific clickbait. It doesn't matter how transparent they are, because that is mostly there to cover their asses.
> I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made.
Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?
Sure - but it could still be pretty relevant if we want to ask about the future of beverage making and consumption, especially if new technology enables everybody to mass-produce lemonade (and similar sugary beverages) at home at minimal cost.
But much like the "debate" between linguistic prescriptivism ("'beg the question' doesn't mean 'raise the question'") and descriptivism ("language is how it is used"), both perspectives have relevance, and neither are really responses to the other.
I certainly hope people keep writing great, human, poetry. But generative ML is a systemic change to creative output in general. Poetry just happens to be in some ways simplest for the LLMs, but other art is tokens and patterns as well.
Personally, I think this would be a sin. To call something art which has no depth. We have too many things that are shallow. I think this has been detrimental to us as a society. That we're so caught up with the next thing that our leisure is anything but. What is the point of this all if not to make life more enjoyable? How can we enjoy life if we cannot have a moment to appreciate it? If we treat time off as if it is a chore that we try to get done as fast as possible? If we cannot have time to contemplate it? A world without friction is dull. It's as if we envy the machines. Perhaps we should make the world less tiring, so we have the energy to be human.
> Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs.
I would tell you that there have been results about the blind testing of wines held in high regard by connoisseurs that might make you not want to choose that for a comparison.
The blind tasting studies prove that connoisseurs can't discern the price of wine by taste. They can tell whether or not they like it perfectly well. A good bottle, not an expensive one.
I believe you missed the OP's point. Poetry is to be processed. That's a feature, not a bug. Now that we're in an analytical conversation you need to process both the papers and OP's words. Like poetry, there is context, things between the lines, because to write everything explicitly would require a book's worth of text. LLMs are amazing compression machines, but they pale in comparison to what we do. If you take a step back, you'll even see this comment is littered with that compression itself.
I would say Davis's definition of "objectively better" here is "nobody who reads these poems carefully could possibly conclude that this AI crap is better than Walt Whitman, the only explanation is Walt Whitman is so difficult that the raters didn't read it carefully."
The Nature paper is making a bold and anti-humanist claim right in the headline, laundering bullshit with bad data, without considering how poorly-defined the problem is. This data really is awful because the subjects aren't interested in reading difficult poetry. It is entirely appropriate for Davis, as someone who actually is interested in good poetry, to make a qualitative stand as to what is or isn't good poetry and try to define the problem accordingly.
The data would still be awful, and people would pay less attention to the study because it’s not a priori surprising that ChatGPT would write worse poetry than the most celebrated poets in history.
If I use bad data to conclude that “Java is faster than C++ in most cases” you can be sure it will receive a lot more attention than if I reached the opposite conclusion based on similarly bad data.
You're making a load of generous claims for yourself without giving your thought process:
> The basic problem is that GPT generates easy poetry
> were comparing to difficult human poets
What's your qualitative process for measuring "easy" vs. "difficult" poetry?
> rating using a bunch of people who don't particularly like poetry
How do you know these people don't like poetry? Maybe they don't seek it out, but certainly poetry is not just for poetry lovers. Good poetry speaks to anyone.
> the human Sylvia Plath is objectively better by any possible honest measure.
Except for arguably the most important one: creating something that people enjoy. Just because you don't like it doesn't make it worthless. I guess the actual question is: do the raters actually get any enjoyment out of the AI poem, or do they just intensely dislike both?
Poetry, like humor, involves the use of the reader's expectations, but is typically most effective when subverting those expectations.
There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it. Unfortunately, I'd suspect that most readers' understanding of poetry lacks that crucial element of subversion, and so an LLM – which mostly just spits out the most-probable, most-expected next token – looks like what people think poetry is.
An LLM would not have created the curtal sonnet form, because it would've been too busy following the rules of a Shakespearean or Petrarchan sonnet. Similarly, an LLM wouldn't create something that intentionally breaks the pattern of a form in order to convey a sense of brokenness or out-of-place-ness, because it's too locked in on the most-common-denominator from previous inputs. And yet, many of the most powerful works I've read are the ones that can convey a disjointed feeling on purpose – something an LLM is specifically geared not to do.
Poetry aims for the heart, and catches the mind along the way. An LLM does not have the requisite complex emotional intelligence, and is indeed a pretty poor simulation of emotional intelligence.
Consider Auden's Epitaph on a Tyrant, which is powerful because it is so suddenly shocking, as it describes something that sounds perhaps like an artist or author, until it very suddenly doesn't, on the last line:
Perfection, of a kind, was what he was after,
And the poetry he invented was easy to understand;
He knew human folly like the back of his hand,
And was greatly interested in armies and fleets;
When he laughed, respectable senators burst with laughter,
And when he cried the little children died in the streets.
One could literally take Claude 3.5 Sonnet New or o1-preview and disprove this in an hour or two just by prompting the AI to try to exhibit the type of poetry you want and then maybe asking it to do a little bit of automated critique and refinement.
You can also experiment with a higher temperature (maybe just for the first draft).
You claim that LLMs can't make poetry like that. I bet they can if you just ask them to.
They could, but they probably won't. Poems like GP are basically using the power of emotional manipulation for good, and companies like Anthropic try very hard to prevent Claude from having that capability.
What he said was basically that it just couldn't create unexpected verses or break form, since supposedly it can only emit the most probable token -- but that's not how sampling works unless you use temperature 0. And it can easily be instructed to break from a strict form (which would create a new variation of the form) for effect if it made sense.
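Just to make the sampling point concrete, here's a minimal toy sketch (made-up logits; not how any particular model is actually implemented): softmax sampling at temperature > 0 picks less probable tokens a nonzero fraction of the time, and only as the temperature approaches 0 does it collapse into always taking the single most probable token.

    #include <math.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Sample an index from softmax(logits / T) over n candidates (n <= 16 in this sketch). */
    int sample_token(const float *logits, int n, float temperature) {
        float probs[16];
        float maxl = logits[0], sum = 0.0f;
        for (int i = 1; i < n; i++) if (logits[i] > maxl) maxl = logits[i];
        for (int i = 0; i < n; i++) {
            probs[i] = expf((logits[i] - maxl) / temperature);  /* subtract max for numerical stability */
            sum += probs[i];
        }
        float r = (float)rand() / (float)RAND_MAX * sum;        /* draw a point in the cumulative mass */
        for (int i = 0; i < n; i++) {
            r -= probs[i];
            if (r <= 0.0f) return i;
        }
        return n - 1;
    }

    int main(void) {
        float logits[4] = {2.0f, 1.5f, 0.5f, -1.0f};  /* hypothetical next-token scores */
        int counts[4] = {0};
        for (int k = 0; k < 10000; k++)
            counts[sample_token(logits, 4, 0.8f)]++;   /* at T = 0.8 the argmax wins most, not all, of the time */
        for (int i = 0; i < 4; i++)
            printf("token %d picked %d times\n", i, counts[i]);
        return 0;
    }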
You could also ask it to create a new form and it could. I don't work for you so I don't have to create examples, but anyone who has used the latest SOTA models for any amount of time knows this capability is expected, and if you were really interested then you would try it. If you feel the result isn't very good, ask it to improve it.
I could program even a Markov chain to generate a lot of odd, unusual, potentially interesting stuff, but no one would call any of it a new form of poetry, because establishing something like that requires social status, which robots don't have.
It can be obvious and go to the heart. I'm not sure Wilfred Owen's Dulce et decorum est is anything other than straight down the line, but it made me cry when I first read it.
That said, maybe the subversion is in how the reality is contrasted with the marketing.
I see 'subversion' as more broad. In good poetry, subversion is constantly happening at a micro level, through playing with meaning, meter, word choice. I think it's very easy to identify AI-generated poetry because it lacks any of that -- but on the flip side, if you don't understand the rules, you don't understand how to subvert them.
Even in Dulce et decorum est -- though the meaning's straightforward, there are plenty of small, subversive (and pretty) ideas in the meter. For example, the line "He plunges at me, guttering, choking, drowning" is an unexpected, disruptive staccato that really sounds like guttering, choking, drowning. It's a beautiful poem and is overflowing with such examples.
(I think this applies to art as a whole, not just poetry.)
>There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it.
Apparently, in 'the good old days (of the internet)', your poetry would be published by yourself, on your webpage - complete with a starry-twinkling background, a 'number of visitors' counter in the bottom right, and in a pink font.
The program Racter (which, from what I understand, was a basic MDP) was generating poetry in the 1980s that was published, read, criticized, and presented in museums: https://www.101bananas.com/poems/racter.html
I remember this as one of its poems was used on the t-shirts of the computing social club that I was part of as a postgrad student:
More than iron
More than lead
More than gold I need electricity
I need it more than I need lamb or pork or lettuce or cucumber
I need it for my dreams
If it was like other poetry generation programs of the 80’s and 90’s, it was generating a lot more crap than gold. People definitely were picking out the most cohesive examples, and probably even improving on them themselves.
> Notably, participants were more likely to judge AI-generated poems as human-authored than actual human-authored poems
There is clearly a significant difference between AI generated poems and human generated poems.
A random group of people probably do not read poetry. It would be interesting to see what people who do read poetry regularly do on this. Also, which they rate as good, rather than just "human authored".
I find that both the little AI-generated poetry and the AI-generated paintings that show up in my FB feed look a bit rubbish. FB is pretty good for experiencing this because the work (usually an image) shows up before the cues in the text and comments that it is AI generated.
As someone who reads poetry regularly and has played around a bit with AI-generated poems, AI poems can be quite impressive, but have a certain blandness to them. I can see them conforming very well to the average person's concept of what a poem is, whereas the human written poems might be less pleasingly perfect and average, more stylistically and structurally experimental etc. The LLM version is less likely to play with the rules and expectations for the medium, and more likely to produce something average and "correct", which makes some intuitive sense given the way it works.
LLMs also have no life experience, so they can write poems, but those poems aren’t communicating anything real. Poetry in the vein of Whitman and Dickinson and Plath is very much about a person expressing their very personal experiences and emotions.
I think that when asked to rate poetry as human- or AI-authored, the human poetry actually does look more like a random smattering of semi-related words, which, I assume, is what folks think machine-generated poetry would look like.
Perfect grammar and perfect meter don't read as AI to most folks yet.
I'm reminded of how people are bad at generating and recognizing truly random patterns. I imagine the famous poets have something in their writing that's an outlier. I wonder if the human-authored poetry looks odd enough to cause problems with our fake detectors, while the mediocre grey goo that AI creates better fits expectations.
Anecdote incoming - I've read poetry weekly, if you will, for about 15 years now.
I also play with LLM's often, for creative side projects and work commercially with them (prompt engineering stuff)
I don't find it far-fetched that individual poems can be indistinguishable at times when AI-generated. I was asking it to write in iambic pentameter (sonnets) and it consistently got the structure right; its approach to the provided themes was as complicated or glib as I wanted. But that's all subjective, right, which leads me to my main point.
My view of poetry over the years has always been centred around the poet, the poet living in a time and place. As a generalisation, most people buy into the artist's life because it may represent some part of themselves.
If someone managed to write an intriguing corpus of texts using LLMs that was extolled, I think that would almost be beside the point. What is important is the narrator's life, ups and downs, joys and woes. Their ability to convey a memorable story even heavily relying on AI would still be impressive. Anyway, sounding a bit wanky, I will stop lol
(I do think LLMs write a little too perfectly, and that makes it easy to think it is not human, but you can kinda prompt them to throw in errors too, so who knows)
> Despite this success, evidence about non-experts’ ability to distinguish AI-generated poetry has been mixed. Non-experts in poetry may use different cues, and be less familiar with the structural requirements of rhyme and meter, than experts in poetry or poetry generation. Gunser and colleagues [14] and Rahmeh [15] find that human-written poems are evaluated more positively than AI-generated poems. Köbis and Mossink [16] finds that when a human chooses the best AI-generated poem (“human-in-the-loop”) participants cannot distinguish AI-generated poems from human-written poems, but when an AI-generated poem is chosen at random (“human-out-of-the-loop”), participants are able to distinguish AI-generated from human-written poems.
This is a huge difference. Writing is a two-step process: idea generation, and selection. The first part is similar to what a randomized algorithm or an LLM might do, in smaller chunks (and indeed, the history of aleatoric processes in creative endeavors is long -- see Oulipo for one example in literature.)
The second step -- selection -- is the heart of creativity. It's about taste. Knowing what is and isn't "good."
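To make that generate-then-select split concrete, here's a minimal, entirely hypothetical sketch of the "human-out-of-the-loop" vs. "human-in-the-loop" conditions described in the quoted passage (the drafts and the score function are throwaway placeholders standing in for LLM output and a human's taste, not anyone's actual method):

    #include <stdio.h>
    #include <string.h>

    /* Placeholder drafts standing in for raw generated candidates. */
    static const char *drafts[] = {
        "draft A (short placeholder)",
        "draft B (a somewhat longer placeholder line)",
        "draft C (placeholder)",
    };

    /* Placeholder selection step: in the real "human-in-the-loop" case this is a
       person's taste, not a formula. Length is used here only so the sketch runs. */
    static double score(const char *draft) {
        return (double)strlen(draft);
    }

    int main(void) {
        int n = sizeof drafts / sizeof drafts[0];

        /* "Human-out-of-the-loop": publish the first draw, no selection step. */
        const char *out_of_loop = drafts[0];

        /* "Human-in-the-loop": generate several, keep whichever the selector prefers. */
        const char *in_loop = drafts[0];
        for (int i = 1; i < n; i++)
            if (score(drafts[i]) > score(in_loop))
                in_loop = drafts[i];

        printf("out-of-the-loop pick: %s\n", out_of_loop);
        printf("in-the-loop pick:     %s\n", in_loop);
        return 0;
    }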
When you consider the creative processes of Dada, Duchamp, Brian Eno -- I think it becomes clear that this automation of creative generation is a continuation of existing trends rather than a sudden disruption.
If an LLM were able to, on its own, generate and select poems, independently developing a new style of poetry that resonated with readers -- that would be something else entirely.
Maybe my logic is choppy but I'm unconvinced that it is useful to use (inexperienced) popular sentiment as a marker of some kind of reference for cultural quality here. The average person has read at most 100 poems; people who really like poetry will be reading quite a bit more, and that experience has relevance in a context such as this. I'm more curious what happens when we look at specifically those humans who engage in substantial, regular engagement with a given artform.
The study is interesting, but that the general public enjoy the most statistically calculated, 50% grey, inoffensive, unchallenging & formulaic poetry isn't too novel -- I don't much read poetry and I'm sure I'd be voting for the AI also.
Just for fun, I had ChatGPT rewrite your comment as a poem. Here’s the result:
————
My logic may be choppy, but I’m not yet sold,
That popular sentiment should be the gold,
For judging culture's worth or depth or art,
When most have seen so little, played no part.
The average person, read just a hundred lines,
While those who love the craft cross many signs,
With pages turned, they gain a richer view,
That shapes what art they’ll seek, and how they choose.
The study’s neat, but here’s what I believe:
The crowd enjoys the safe, the bland, the brief—
The 50% grey, inoffensive way,
The formulaic poems that softly sway.
I don't read much myself, I must admit,
But even I might find that AI's wit
Could speak in ways that sound like common sense,
But lacks the depth of art's true recompense.
I would argue that 99% of popular songs have fairly poor quality lyrics (though it doesn't really matter with regard to artistic merit of the finished product). Singer-songwriters in folk-adjacent genres such as Joni Mitchell, Sufjan Stevens, and Don McLean come closest to actual poetry IMHO.
Yeah, the 1st, 2nd, and 4th lines are very clunky and sound horrible read aloud, with oddly placed syllables. The "view" and "choose" rhyme is horrible (the 3rd line's first rhyme is barely a rhyme at all). The line breaks are just as bad, with poor examples of enjambment.
Hmm, I’m not convinced. That’s just the comment in the guise of poetry. There are a bunch of dangling implications in this “poem” where a real master would weave the implications together.
I guess I’d also say it is not only doing the right thing that counts, one must also be doing it for the right reason.
AI “art” is mimicry, burdened by the inevitable mundanity of the majority of its training corpus. The avant garde remains exclusively the domain of a comparative handful of forward thinking humans, in my humble opinion.
Or said another way: AI art is kitschy, and I don’t think it can escape it.
It's completely ideal to get an average person's opinion on something as primal as poetry.
Poetry is for everyone, not just poetry connoisseurs. It's a simplified primal expression of language, taking the form of pretty soundbites & snippets, pristine, virginal, uncorrupted by prose and overthinking. Poetry is not the domain of middlebrow academics.
I used to think that, but I think this is only true if you want to measure broad market appeal. Very few things are broadly marketable, and many of them have niches. "Middlebrow academics" are the ones who go to the poetry shelf of their local bookstore and pick up anthologies and they are the ones who go to poetry slams, and so on.
I honestly think that there might be some truth to that.
If you look at Boston Dynamics, these are some of the very best roboticists on the planet, and it's taken decades to get robots that can walk almost as well as humans. I don't think it's incompetence on Boston Dynamics' end, I think it turns out that a lot of the stuff that's trivial and instinctual for us is actually ridiculously expensive to try and do on a computer.
Washing dishes might not be the best example because dishwashers do already exist, and they work pretty well, but having a robot with anywhere near the same level of flexibility and performance as a human hand? I'm not going to say it's impossible obviously, but it seems ridiculously complex.
Probably not, but the difference is that we can generate art at the "pixel" level instead of the "hand" level. Not really a way to do that for most other stuff.
I mean a literal PAINTing is made with paint, obviously.
If you were to try and create a robot hand that painted as well as humans, that would probably be comparably difficult to any other task involving a human hand. I was saying that we solve the AI art problem by skipping straight to an end state (pixels) instead of using the same mechanism a human might.
I think this line of reasoning is really bizarre, as if there's this straight-line path of progress, and then we stop the second it starts doing shit that we consider "fun".
Who is to say that "washing dishes" (to use your example) is a less complicated problem than art, at least in regards to robotics and the like?
it's not a matter of what's complicated, it's a matter of what it replaces. the quote isn't reflecting on what's easiest to solve, it's reflecting on the impact that it has on culture as a whole.
a tangible impact of the current generation of AI tools is they displace and drown out human creations in a flood of throwaway, meaningless garbage. this is amplifying the ongoing conversion of art into "content" that's interacted with in extremely superficial and thoughtless ways.
just because something _can_ be automated doesn't mean it _should_ be. we actively lose something when human creativity is replaced with algorithmically generated content because human creativity is as much a reflection of the state of the art as it is a reflection of the inner life of the person who engages in it. it's a way to learn about one another.
in the context of the broader discussion of "does greater efficiency everywhere actually have any benefit beyond increasing profits," the type of thing being made efficient matters. we don't need more efficient poetry, and the promise of automation and AI should be that it allows us to shrug off things that aren't fulfilling - washing dishes, cleaning the house, so on - and focus on things that are fulfilling and meaningful.
the net impact of these technologies has largely been to devalue or eliminate human beings working in creative roles, people whose work has already largely been devalued and minimized.
it's totally akin to "where's my flying car?" nobody actually cares about the flying car, the point is that as technology marches on, things seem to universally get worse and it's often unclear who the new development is benefitting.
I'll agree that AI has flooded the internet with low-effort slop. I feel like I can make a pretty strong argument that this isn't new, low-effort SEO spam has been a thing for almost as long as search engines have, but it does seem like ChatGPT (and its ilk) has brought that to 11.
> just because something _can_ be automated doesn't mean it _should_ be.
I guess agree to disagree on that. If a machine can do something better than a human, then the machine should do it so that the human can focus on stuff that machines can't do as easily.
> I guess agree to disagree on that. If a machine can do something better than a human, then the machine should do it so that the human can focus on stuff that machines can't do as easily
Machines exist for the pleasure of humans, not the other way around
This isn't some kind of "division of labour, we both have strengths and weaknesses and we should leverage them to fill our roles best" situation
Machines are tools for humans to use. Humans should not care about "doing the things the machines aren't good at". All that matters is can machines do something that humans do not want to do. If they can't, they aren't a useful machine
Replacing humans in areas that humans are passionate about, forcing humans to compete with machines, is frankly inhuman
> Replacing humans in areas that humans are passionate about, forcing humans to compete with machines, is frankly inhuman
I don't think it's going to "force" anyone out. We didn't suddenly fire all the artists the second that the camera was invented. We didn't stop paying for live concerts the moment that recorded music was available to purchase.
> Machines exist for the pleasure of humans, not the other way around
I am not good at art. I find it pleasurable to be able to generate a picture in a few seconds that I can use for stuff. It allows me to focus on other things that I find fun instead of opening up CorelPainter and spending hours on something that won't look as good as the AI stuff.
I could of course hire someone to do the art for me, but that costs money that I don't really have. The anti-AI people who just parrot "JUST PAY AN ARTIST LOLOLOL!" are dumb if they think that most people just have cash lying around to spend on random bits of custom art.
Last time I checked, I am human. The AI art manages to allow me to enjoy things I wouldn't have been able to easily achieve before.
> as if there's this straight-line path of progress
I think your rebuttal is really bizarre. OP is simply saying what they want AI to do.
> Who is to say that "washing dishes" (to use your example) is a less complicated problem than art
I think dish washing is a bad example, because we have dishwashers. But until the market brings affordable AI and robotic solutions that actually fulfill most people's needs, AI will continue to be a net drain on the average person.
You don't get to tell people what they want or need.
I guess what I was getting at (and I'll acknowledge that I didn't word it as well as I should), is something along the lines of: "what if automating art is a necessary step if we want to automate the boring stuff?"
I think you are probably right. But what is really frustrating about this is the lack of alignment on what people want vs what industries need.
We talk so much about how capitalism is built around people's needs, but that betrays another reality, which is that people only get what capitalism produces.
If we were a planned economy we could skip right to an android in everyone's homes. But we wouldn't even have the tech for the android with a planned economy. So instead, we have to feed capitalism what it needs so it can innovate. Which sometimes is just a net loss for everyone in the meantime.
Important correction: "English poetry". It is massively different from most Indo-European poetry, which adds very strict https://en.wikipedia.org/wiki/Metrical_foot rules. In Russian poetry, for example, even a single metrical-foot violation is considered a severe mistake and would be noticed by every reader. Also, all good poets avoid using common rhymes (such as verb rhymes) as clichés, which are considered a mistake there, while AI tends to use common rhymes by design. There are exceptions (like Russian Futurist poetry), but other than that AI fails massively.
Like any other art, the painful truth is that it is all subjective.
It kills me that, despite how elitist I am with the music I listen to, music that I have spent decades now carefully curating, there is no such thing as "good music". What we music snobs call "good music" is really just what makes us personally feel good, coupled with the ego stroking of self-described sophistication.
My university English professor in teaching Wallace Stevens to us said "It doesn't have to make sense because it makes sense to the author." I guess the magic of poetry is when you decipher what it is the author is trying to avoid saying plainly. It's not just a man making ice cream, it's about life and death.
Could someone link me to a poem that an LLM did that they personally find in some way remarkable or beautiful or moving? Something with a bit of truth and/or beauty in it?
Not something that "rhymes" or has a rough poetic structure. I have only seen complete and utter garbage from LLMs in the poetry realm, not just a bit bad, but jarring and unfeeling. Which is fine, I don't hold it against them personally, ya know. Just it really has been pisspoor.
Which wouldn't be a bother at all, except along with the "poem" there often is someone saying "wow, look what it did! such a good poem!", which has made me suspicious that the person doesn't know how LLMs work nor how to read poetry - only one of which is a really serious loss for them, I suppose.
Anyway, that sounds like I want someone to send me something so I can sh*t on it, but I'd sincerely and happily read anything with an open mind! Is there some excellent poetry hiding out there which I have missed?
Just for kicks, I asked 3.5 Sonnet to write some sentences that didn’t exist in its training data. I googled the output and sure enough they appear to be unique. Most were semi meaningless strings of words, but I thought this one was quite poetic:
“Vintage umbrellas hosted philosophical debates about whether raindrops dream of becoming oceans or if puddles remember their cloudborn youth.”
Oooh that is very surprisingly nice! "cloudborn" is a great word there, how lovely, it slows you down at the right moment for a little explosion then at "youth". I wasn't expecting a genuinely excellent answer here :)
So it turns out then that the trick may be to find a way to get them to avoid aping the oceans of human mediocrity they've been spoonfed! Funny, it's the very same reason some poets go off and live in the woods.
> Could someone link me to a poem that an LLM did that they personally find in some way remarkable or beautiful or moving? Something with a bit of truth and/or beauty in it?
A lot of people couldn't link you to _any_ poem, human-written or ai generated, in response to that question.
I have no idea if you will care for it, but my family and I appreciated what ClosedAI's CustomGPT RAG (and my LLMpal) generated. This is slow loading (the vector database was built from this one big html file), and you can scroll down to see it: https://h0p3.nekoweb.org/#2024.11.20%20-%20Carpe%20Tempus%20...
The missing clause in the headline is: AI poetry is indistinguishable from human poetry and is rated more favorably ... by people who don't read much poetry.
I would make the claim that this paper shows the weakness of AI-generated text. People are comfortable with familiar things. We know AI is sampling probability distributions in a latent space and by definition doesn't sample outside of them. Human poetry is creating net new things, net new ideas which aren't just rehashes of past ideas. And this can definitely be uncomfortable for people, especially the layman.
In other words, the general public is afraid of Jazz; they like pop music. But Jazz is where the new stuff is. AI is creating pop music, not Jazz.
I consider myself a “sophisticated” reader of poetry. I’m especially fond of Eliot - who is an acquired taste.
I’ve never read the real Eliot poem, and don’t like it much.
I searched for it. It was written before Eliot was known, and before he moved from America to Britain.
That’s important, as I’d assumed it was taking place in Britain, and Eliot’s poetry is extremely location sensitive.
It also seems to be part of a “triplet” - three poems that go together. Eliot was very inconsistent about which poems went with which. The best practice is to include any possible “other parts.”
Reading the other poems in the triplet provides important context, but they’re still not great literature.
Maybe people don’t like it because it was just one of Eliot’s crappier poems?
Edit: the Eliot poem was published shortly after he moved to the UK. But that makes the American context provided by the other poems more important.
A lot of good art requires a good palate. Taking random people off the street and asking them about AI art is going to reveal really stupid results every time.
If we keep doing this, the clickbait articles and research saying AI can do everything better than humans will never cease.
As a software guy I’ve been dabbling with poetry for the last 2 years [1] as an extension to my code writing. I think poetry and code go together rather well; there are similarities when it comes to elegance, construction and structure.
Liking or not liking poetry is irrelevant. There are some interesting things that happen when one actually writes poetry. You are much closer to the source of ideas that can at times feel unique and inspiring.
Generative LLMs can easily mimic style, any _existing_ style. Can they capture an original higher thought form, or is generated poetry an extremely smooth word salad?
Basically an uninteresting conclusion. Of course a "non-expert" reader isn't going to be able to distinguish between AI and Walt Whitman--a "non-expert" reader likely won't even know who Walt Whitman is. "Expertise" is needed to even make the question meaningful.
I wonder how well experts in poetry would do? It seems like a strange area to use non-expert humans for evaluation because these days most people reading any poetry are at least enthusiasts.
"Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?" - Choruses from "The Rock"
Something which seems to be under-appreciated is that poetry is intentionally imprecise. There is something about precision which reduces emotional response. Searching for the double meaning, or metaphorical meaning of a word seems to enhance its ability to produce emotion. If you want a working example of this, try comparing technical documentation (among the most precise writing out there) to any sort of popular poem. The poem is necessarily imprecise, and this imprecision seems to be a fundamental facet of the expression of emotion.
This actually seems to give LLMs an edge: their imprecision potentially matters less, and leaves more room for the reader to fill in the gaps.
The use of non-experts as judges renders this meaningless.
Imagine an algorithm that drew random shapes. Then ask people who've never engaged with art theory or history to compare its output with what they see in a modern art museum. You'd get results similar to this.
Many forms of art are actually quite inaccessible to novices. What poets see in poems or composers hear in compositions is nothing like what someone from the outside sees or hears. It's a language of its own, and if you don't speak the language, you simply won't get it.
This is a statement of fact; I'm not judging whether it's good or bad or what the layman's opinion is actually worth. But it is empirically how communities of artists operate.
The thing is that, for most of society, art exists to serve a utilitarian purpose (make us feel good/bad/inspired/destroyed/whatever other influence). Essentially just to induce a "vibe". Anything other than what most people intuitively feel after consuming it is secondary and, frankly, unimportant in the grand scheme of things.
Except that the direction of art is usually influenced by the artists, without too much calibration to what the uninformed public wants.
It’s actually a really interesting question if this affects the market, at all. Because if AI can generate “art” which satisfies the public, but not artists, will the public just go for that, instead?
Or is it the case that the public never really cared for or consumes that art form, and that the entire market for art is to people who (to at least some degree) specialize in that form?
The main problem with AI is not that it can or can't write poetry as well as humans. It's that it's the next step in a long process to divorce human experience from art so that we lose one more beautiful facet of human existence. And the reason why that's happening is because technology needs to take away the essence of human experience from production so that we can be more irrelevant and anonymous, which is essential to being cogs in the technological production machine.
LLMs statistically average everything they've trained on.
I would expect them to do well at any kind of fuzzy emo sort of task, as well as (at almost the other end of the spectrum) in identifying patterns (such as in radiology images, or images of any kind along with any other data set analysis).
But both of these sorts of tasks are estimations, they're not expected to produce precise factually correct results.
> We propose that people rate AI poems more highly across all metrics in part because they find AI poems more straightforward. AI-generated poems in our study are generally more accessible than the human-authored poems in our study. In our discrimination study, participants use variations of the phrase “doesn’t make sense” for human-authored poems more often than they do for AI-generated poems
I think this says more about the state of (English based) poetry over the past 150 years than it does about the ability of AI to generate competent poems. With the advent of Modernism, the poetry-industrial complex's ideas about what constituted a Good (English) Poem diverged significantly from what the general (English speaking) population expected a poem to be, how it was constructed, presented, etc. Teaching these new forms of poetry in schools left generations of people confused and disinterested. Yet people need poetry in their lives; they found that poetry in popular music and Hallmark card verses.
Of course, popular poetry isn't dead. Poetry book sales in recent years (in the UK) have "boomed"[1][2]. I'll not argue the merits of the poetry being generated by the latest crop of Poet Superstars; at the end of the day their work sells because people find comfort and joy in it - and that is a positive outcome in my view!
> Given people’s difficulties identifying machine-written texts, and their apparent trust that AI will not generate imitations of human experience, it may be worthwhile for governments to pursue regulations regarding transparency in the use of AI systems.
I think AI does threaten the careers of current Poetry Superstars. A website that pairs an AI-generated outline image with some cozy AI-generated verses about a given situation/emotion/discovery should be an easy project to build. Allowing users to personalise the output so they can use the results as a gift for loved ones etc. might be a viable product?
But I don't see much value in forcing anyone using AI to produce creative assets to label the output as such. For one thing, there's no guarantee that anyone using those assets will maintain the labelling. A much better approach would be for poets who don't use AI to help them craft poems to label their work as "100% Human" - in the years to come it might even become a positive selling point!
I believe this will happen in many human domains, but it doesn't really matter. Nobody is going to stop writing poetry because of this and I doubt there's much of an audience for AI generated poetry.
There are forms of poetry/art/etc. that we've never thought of, that have never been conceived before. An LLM being what it is won't conceive these. Humans will continue to generate language the pattern/structure/meaning of which has never been generated by LLMs before.
Thought experiment to prove this: if you trained an LLM on every utterance of human language before the 5th century BC would you get any idea we would recognize as modern?
I think that's the wrong perspective on it. People want to compare how an AI does at one thing to how the best people in the world do at that thing.
What you really want to do is compare how good the AI is at that thing compared to the _average person_ at that thing, and I would guess that generative AI outclasses the average human at almost every task that it's even moderately competent at.
People like to point out how it can't pass the bar exam or answer simple math questions or whatever, and how that _proves_ that it's not intelligent or can't use reasoning when _most people_ would also fail at the same tasks.
Almost all the Gen AI models already have super human competency if you judge it across _everything it can do_.
We're deluding ourselves by thinking it's happening to poetry! This study is ignorant and dishonest, it should have never been published in the first place: https://cs.nyu.edu/~davise/papers/GPT-Poetry.pdf
AI research is worse than all the social sciences combined when it comes to provocative titles/abstracts that are not supported by the actual data.
Ernest Davis has an outstanding and definitive rebuttal of these claims: https://cs.nyu.edu/~davise/papers/GPT-Poetry.pdf
It's another case of the impedance mismatch between numbers / KPIs etc and what might be called "lived experience".
Why is Ulysses better than Harry Potter?
Are you aware of studies of Ulysses and James Joyce in general? That's a good start for this discussion, assuming it's not just a kneejerk reaction.
Why is Harry Potter better than a more unknown book like The Sugar Barons? Both are exciting, but one is based on real events/letters...
I think parent is basing it on high art principles which are: older better, uniqueness, what others agree is valuable and content/quality.
Why aren't we talking about the first english book published? The Recuyell of the Histories of Troye
It sounds different to me.
Poetry is for the enjoyment and enlightenment of the reader. Who reads poems? The question is a little vague. Who reads poems on a birthday card? Everyone. In a children's book? Parents and children.
But this study focuses on literary poems specifically, using works by literary heavy weights like Chaucer, Shakespeare, Dickinson, Whitman, Byron, Ginsberg, and so on. So the question is who reads literary poems, to which the answer is: academics, literary writers, and a vanishingly small population of reader who enjoy such pursuits.
So the test of if ersatz AI poetry is as good as "real" poetry should be if target audience (i.e. academics) finds the poems to be enjoyable or enlightening. But this study does not test that hypothesis.
This study tests a different hypothesis: can lay people who are not generally interested in literary poetry distinguish between real and AI literary poetry?
The hypothesis and paper feel kind of like a potshot at literary poetry to me, or at least I don't understand why this particular question is interesting or worthy of scientific inquiry.
Or if you check that any pictures of people have the correct number of fingers, toes, arms, legs, or heads.
The vast majority of people seem to prefer Avengers 17 over any cinematic masterpiece, the latest drake song would be better rated than a Tchaikovsky... We should let them play and worship chat gpt if that's what they want to waste their time on
I don't understand the logic of calling superhero movies lesser/unserious like this, it's very snobby. Movies and music are made to be entertaining, the avengers is more entertaining than your "arthouse cinematic masterpiece that nobody likes but it's just because they aren't smart enough to understand it". It's also lazy and ignorant to ignore the sheer manpower that goes into making a movie like that.
I don’t fully agree with putting down “fun” movies like the Avengers, but at the same time “serious” art is not primarily for plain entertainment.
People might find “serious” art meaningful and it might spark feelings in them, but that’s not the same as getting an adrenaline rush from exploding cars in an action scene.
Of course there are also cases where the boundary between “fun” and “serious art” is not so clear, there are always exceptions to any attempt to define what makes something “serious art”. Art can also be subversive and run counter to traditional expectations of what art “should” be. But I don’t think the Avengers is an example of that.
Movies, music, writing, all human arts, are made to make their audience feel something. "Entertaining" is only a small and honestly ill-defined subset of this, no more valid than any other approach.
What a ridiculous comparison. Of course a superhero movie is more entertaining than a film that is explicitly designed to avoid mass appeal.
The nuance is that film today is not where the most creative talent is directed anymore. The shift started with prestige TV taking off in the 2000s and episodic content on streaming services surpassing film as a mass-market art form in the 2010s, with the pandemic driving the final nail into the coffin.
I loved the late 2000s / early 2010s superhero movies. Spider-Man, The Dark Knight, Iron Man, etc. These were great films. Today, the MCU is just eating its own tail with the most bland, repetitive crap. It's all designed to incentivize the same die-hard fans to keep forking over their hard-earned cash, with all the cross-film teasers and the need to watch every film to understand all the references and moving parts. I understand the business model—it's actually the same as comic books now—because people don't casually go in to see random movies anymore, they do that at home on Netflix, so they have to target the repeat viewers. It's visually impressive, and the acting is good enough to keep a relatively large subset of the population coming back, but for someone like me who wants at least a little bit of novelty or creativity in the plot or characters, it's just become so mind-bogglingly boring.
> your "arthouse cinematic masterpiece that nobody likes
You're reading way too much into my comment. Any blockbuster from the '80s/'90s absolutely shits on 90% of blockbusters released today. I'm not talking about obscure 1950s Czechoslovak cinema here...
> ignore the sheer manpower that goes into making a movie like that.
A lot of work doesn't make something good, especially when CGI quality actually gets worse year after year. FYI, the entire LOTR trilogy had 30% less budget and 4x the runtime of the last Avengers movie... And they actually filmed things outside of a Hollywood studio.
The only lazy ones here are the screenwriters and directors shitting out the blandest movies ever. But then again, if all we care about is raw entertainment then sure, it's perfect: very easy to digest, lots of colors and not too much to think about, the cinematic equivalent of fast food. You can even buy Avengers-branded toilet paper and bottled water, that really shows how much they care about movies!
Well said. There's tons of blockbusters and other popular movies from the 80s/90s that were absolutely made for the "masses", but were genuinely great films, and far better than almost any blockbuster from the last 5-10 years, especially all the comic-book stuff. Alien(s), Back to the Future trilogy, Terminator 1/2, Ghostbusters, Beetlejuice, I could go on and on. And of course the LotR trilogy if you look at the early 2000s. Movies just aren't as innovative or risky these days; something as quirky as Ghostbusters wouldn't be made now (but Hollywood is happy to make remakes and sequels of that franchise now, 40 years later).
Film is such a nascent art form. The 90s as "peak blockbuster action" is a valid stance on taste, but hard to defend as superior to all that came after. Christopher Nolan's Dark Knight is leagues away from the 90s Batman, as an auteur-friendly and obvious comparison. Pixar is another example on the animation front.
There have been great films made in every era, but tighter writing, more legible and compelling action, and emotionally impactful storytelling are all trending upwards overall.
And nothing will ever top the merchandising mania of the 80s!
I hope you're referring to Joel Schumacher's kitschy drivel, and not to Tim Burton's masterpieces (both of which are IMO vastly superior to Nolan's take on the subject).
Random thought outburst, feel free to downvote:
This reminded me so much of Spaceballs! And the yogurt merchandise towards the end! Such a great movie that has so many obvious "flaws" like the mirror under the speeder on the desert planet when they comb the desert. And yet I've actually watched that movie more often than even the actual real Star Wars movies (meaning the first three made - all of which are timeless awesomeness)
> Any blockbuster from the '80s/'90s absolutely shits on 90% of blockbusters released today
Are you sure it's not survivorship bias? As in, you're only thinking of and remembering the good ones over a two-decade period and comparing them against the movies that came out this year, when in reality there might have been tons of blockbusters in those eras that were just as bad as your average one today.
For perspective, your comments could be released direct to VHS.
There is more to art than entertainment. For example, Oedipus Rex [1]: distinctly not entertaining, but art, and powerful in an incomparable way anyway.
_____________
[1] Don't look it up if you don't know what that is.
Nobody likes art that requires them to think.
Well, the art can't judge itself.
Maybe critics are art, too. Like Lipton's "Inside the Actor's Studio" (Detroit). That's art.
"It's not art it's ari. You want to make an art film? You take it to Sundance, you take it to Telluride, you take it to Cannes."
"Movies and music are made to be entertaining"
In your opinion, perhaps. Other films are made to be provocative-- to make you think or reflect. Certainly, a lot of the "arthouse cinematic masterpieces" aim for that as a goal rather than purely entertainment.
You're arguing against a strawman here... nobody is saying making an avengers movie is low effort. Certain aspects of an avengers movie though require less effort.
Movies and music are usually made to be entertaining, but sometimes they're made as an artistic outlet for the creator.
I was listening to Schoenberg's "Suite for Piano" the other day. Did he make it to be entertaining? I don't know, interesting maybe. I wouldn't put it on at a party.
It's true that snobbery is off-putting, but if you're looking for artistic merit, then some works last longer than others. If you're looking for something to enjoy with your popcorn, then there's that too.
Of course, this depends on what we mean by "poetry." Art is so hard to define, we might as well call it ineffable. Maybe Justice Potter Stewart said it best: "I know it when I see it." And I think most artists would agree with this, because the point is to evoke emotion. It is why you might take a nice picture and put it up on a wall in your house, but no one would ever put it in a museum. But good art should make you stop, take some time to think, and figure out what's important to you.
Notable art is not something you simply hang on a wall and get good feelings from when you glance at it. It is deep. It requires processing. This is purposeful. A feature, not a bug. Such works are filled with cultural rhetoric and commentary. Did you ever ask why you are no Dorothea Lange? Why your photos aren't as meaningful as Alfred Eisenstaedt's? Clearly there's something happening here, but what it is ain't exactly clear.
Let me give a very recent example. Here[0] is an amicus brief that The Onion (yes, that Onion, the one that bought InfoWars) wrote to the Supreme Court. It is full of satire while arguing that satire cannot be outlawed. It is __not__ intended to be read at a glance. In fact, they even specifically say so.
That parody only works if one is able to be fooled. You can find the author explaining it more here[1]. But we're coders, not lawyers. So maybe a better analogy is what makes "beautiful code." It sure as fuck is not aesthetically pleasing. Tell me what about this code is aesthetically pleasing and easy to understand?
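For reference, the code in question is presumably the famous fast inverse square root from Quake III Arena, the routine that [2] walks through; here it is with a few notes on what each step does:

    /* As written circa 1999; assumes 32-bit long and 32-bit float. */
    float Q_rsqrt( float number )
    {
        long i;
        float x2, y;
        const float threehalfs = 1.5F;

        x2 = number * 0.5F;
        y  = number;
        i  = * ( long * ) &y;                      // reinterpret the float's bits as an integer
        i  = 0x5f3759df - ( i >> 1 );              // magic constant gives a first guess at 1/sqrt(y)
        y  = * ( float * ) &i;                     // reinterpret the bits back into a float
        y  = y * ( threehalfs - ( x2 * y * y ) );  // one Newton-Raphson step to sharpen the guess

        return y;
    }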
It requires people writing explanations![2] Yet, I'd call this code BEAUTIFUL. A work of art. I'd call you a liar or a wizard if you truly could understand this code at a glance. I specifically bring this up because there's a lot of sentiment around here that "you don't need to write pretty code, just working code." When in fact, the reality is that the two are one and the same. The code is pretty __because__ it works. The code is a masterpiece because it solves issues you probably didn't even know existed! There's this talk as if there's this bifurcation between those who __like__ to write code and those who use it to get things done. Or those who think "code should be pretty" vs those who think "code should just work." I promise you, everyone in the former group is deeply concerned with making things work. And I'll tell you now, you've been sold a lie. Code is not supposed to be a Lovecraftian creature made of spaghetti and duct tape. You should kill it. It doesn't want to live. You are the Frankenstein of the story.
To see the beauty in the code, you have to sit and stare at it. Parse it. Contemplate it. Ask yourself why each decision is being made. There is so much depth to this, and its writing is a literal demonstration of how well Carmack understands every part of the computer: the language, how the memory is handled, how the CPU operations function at a low level, etc.
I truly feel that we are under attack. I don't know about you, but I do not want to go gentle into that good night. Slow down, you move too fast, you got to make the morning last. It's easy to say not today, I got a lot to do, but then you'll grow up to be just like your dad.
[0] https://www.supremecourt.gov/DocketPDF/22/22-293/242292/2022...
[1] https://www.law.berkeley.edu/article/peeling-layers-onion-he...
[2] https://betterexplained.com/articles/understanding-quakes-fa...
I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made. They make it clear in the paper that they're specifically evaluating people who aren't especially interested in poetry, and talk at length about how and why this is different from other approaches. I suppose the clickbait title gives a bad first impression.
To summarize the facts: when people were asked to tell whether a given poem was written by a human or by AI, they judged the AI poems to be human-written more often than the actual human poems. People also tended to rate poems higher when they believed they were human-made than when they believed they were AI-generated. The paper speculates that this is because the AI poems tended to be more accessible and direct than the human poems selected, and the preference for this style among non-experts, combined with the perception that AI poetry is poor, led to the results.
The selection of human poets is cooked to give the result they wanted. I will grant the authors may have lied to themselves. But I don't think honest scientists would have ever constructed a study like this. It is comparing human avant garde jazz to AI dance music and concluding that "AI music" is more danceable than "human music", without including human dance music! It's just infuriating.
They expressly state the result is likely because the AI poetry was simpler and more direct than the poetry selected, which is more accessible for the average person not interested in poetry. They compare and contrast this with other studies where this was not the case.
Yes, it's comparing apples and oranges; that's the whole point. It doesn't make the experiment itself flawed.
Hmm, but it should have compared against human poems that go for a similar style, no? Otherwise it doesn't tell us much, except that the AI was not able to make more complex poems? And maybe that people who don't like poetry, when asked, prefer simpler poems?
It seems to me that the whole study was intended to manufacture a result to grab headlines. Scientific clickbait. It doesn't matter how transparent they are, because that is mostly there to cover their asses.
> I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made.
Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?
Sure - but it could still be pretty relevant if we want to ask about the future of beverage making and consumption, especially if new technology enables everybody to mass-produce lemonade (and similar sugary beverages) at home at minimal cost.
I'm quite sympathetic to poetry - I actually wrote a blog post about this article last week https://gallant.dev/posts/whither-poetry/
But much like the "debate" between linguistic prescriptivism ("'beg the question' doesn't mean 'raise the question'") and descriptivism ("language is how it is used"), both perspectives have relevance, and neither are really responses to the other.
I certainly hope people keep writing great, human, poetry. But generative ML is a systemic change to creative output in general. Poetry just happens to be in some ways simplest for the LLMs, but other art is tokens and patterns as well.
Personally, I think this would be a sin. To call something art which has no depth. We have too many things that are shallow. I think this has been detrimental to us as a society. That we're so caught up with the next thing that our leisure is anything but. What is the point of this all if not to make life more enjoyable? How can we enjoy life if we cannot have a moment to appreciate it? If we treat time off as if it is a chore that we try to get done as fast as possible? If we cannot have time to contemplate it? A world without friction is dull. It's as if we envy the machines. Perhaps we should make the world less tiring, so we have the energy to be human.
> Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs.
I would tell you that there have been results about the blind testing of wines held in high regard by connoisseurs that might make you not want to choose that for a comparison.
The blind tasting studies prove that connoisseurs can't discern the price of wine by taste. They can tell whether or not they like it perfectly well. A good bottle, not an expensive one.
I believe you missed the OP's point. Poetry is to be processed. That's a feature, not a bug. Now that we're in an analytical conversation you need to process both papers and the OP's words. Like poetry, there is context, things between the lines, because writing everything explicitly would require a book's worth of text. LLMs are amazing compression machines, but they pale in comparison to what we do. If you take a step back, you'll even see this comment is littered with that compression itself.
I don't agree that it's possible to describe a piece of poetry as "objectively" better or worse.
You can if the assignment is to write something in iambic pentameter.
I would say Davis's definition of "objectively better" here is "nobody who reads these poems carefully could possibly conclude that this AI crap is better than Walt Whitman, the only explanation is Walt Whitman is so difficult that the raters didn't read it carefully."
The Nature paper is making a bold and anti-humanist claim right in the headline, laundering bullshit with bad data, without considering how poorly-defined the problem is. This data really is awful because the subjects aren't interested in reading difficult poetry. It is entirely appropriate for Davis, as someone who actually is interested in good poetry, to make a qualitative stand as to what is or isn't good poetry and try to define the problem accordingly.
>This data really is awful because the subjects aren't interested in reading difficult poetry.
If the results were the opposite, would the data still be awful?
The data would still be awful, and people would pay less attention to the study because it’s not a priori surprising that ChatGPT would write worse poetry than the most celebrated poets in history.
If I use bad data to conclude that “Java is faster than C++ in most cases” you can be sure it will receive a lot more attention than if I reached the opposite conclusion based on similarly bad data.
That's my litmus test for whether I'm confirmation biasing myself.
Obviously this study is flawed and the results are garbage! But if the study had concluded the opposite then I knew it!
You're making a load of generous claims for yourself without giving your thought process:
> The basic problem is that GPT generates easy poetry
> were comparing to difficult human poets
What's your qualitative process for measuring "easy" vs. "difficult" poetry?
> rating using a bunch of people who don't particuarly like poetry
How do you know these people don't like poetry? Maybe they don't seek it out, but certainly poetry is not just for poetry lovers. Good poetry speaks to anyone.
> the human Sylvia Plath is objectively better by any possible honest measure
Really? What's your objective measure?
> the human Sylvia Plath is objectively better by any possible honest measure.
Except for arguably the most important one: creating something that people enjoy. Just because you don't like it doesn't make it worthless. I guess the actual question is: do the raters actually get any enjoyment out of the AI poem, or do they just intensely dislike both?
Poetry, like humor, involves the use of the reader's expectations, but is typically most effective when subverting those expectations.
There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it. Unfortunately, I'd suspect that most readers' understanding of poetry lacks that crucial element of subversion, and so an LLM – which mostly just spits out the most-probable, most-expected next token – looks like what people think poetry is.
An LLM would not have created the curtal sonnet form, because it would've been too busy following the rules of a Shakespearean or Petrarchan sonnet. Similarly, an LLM wouldn't create something that intentionally breaks the pattern of a form in order to convey a sense of brokenness or out-of-place-ness, because it's too locked in on the most-common-denominator from previous inputs. And yet, many of the most powerful works I've read are the ones that can convey a disjointed feeling on purpose – something an LLM is specifically geared not to do.
Poetry aims for the heart, and catches the mind along the way. An LLM does not have the requisite complex emotional intelligence, and is indeed a pretty poor simulation of emotional intelligence.
Consider Auden's Epitaph on a Tyrant, which is powerful because it is so suddenly shocking, as it describes something that sounds perhaps like an artist or author, until it very suddenly doesn't, on the last line:
Perfection, of a kind, was what he was after,
And the poetry he invented was easy to understand;
He knew human folly like the back of his hand,
And was greatly interested in armies and fleets;
When he laughed, respectable senators burst with laughter,
And when he cried the little children died in the streets.
One could literally take Claude 3.5 Sonnet New or o1-preview and disprove this in an hour or two just by prompting the AI to try to exhibit the type of poetry you want and then maybe asking it to do a little bit of automated critique and refinement.
You can also experiment with a higher temperature (maybe just for the first draft).
You claim that LLMs can't make poetry like that. I bet they can if you just ask them to.
They could, but they probably won't. Poems like GP are basically using the power of emotional manipulation for good, and companies like Anthropic try very hard to prevent Claude from having that capability.
If I'm reading the GP properly, they're saying that an LLM isn't capable of inventing new poetry forms.
If it's easy, can you provide some poetry forms you've coaxed Sonnet to create, with some exemplary poems in the form?
What he said was basically that it just couldn't create unexpected verses or break form, since supposedly it can only emit the most probable token -- but that's not how sampling works unless you use temperature 0. And it can easily be instructed to break from a strict form (which would create a new variation of the form) for effect if it made sense.
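To make that concrete, here is a minimal sketch (my own toy code, not any particular library's API) of how temperature changes next-token selection: at temperature 0 you get pure argmax, i.e. always the most probable token, while higher temperatures flatten the softmax so less probable tokens get sampled too.

    #include <math.h>
    #include <stdlib.h>

    /* Sketch only (no error handling): choose the next token from raw logits.
     * temperature <= 0 degenerates to greedy decoding (always the most probable
     * token); higher temperatures flatten the distribution so less likely
     * tokens get sampled more often. */
    int sample_token(const float *logits, int n, float temperature)
    {
        if (temperature <= 0.0f) {                 /* greedy: pure argmax */
            int best = 0;
            for (int i = 1; i < n; i++)
                if (logits[i] > logits[best]) best = i;
            return best;
        }

        /* softmax with temperature; subtract the max for numerical stability */
        float max = logits[0];
        for (int i = 1; i < n; i++)
            if (logits[i] > max) max = logits[i];

        double sum = 0.0;
        double *p = malloc(n * sizeof *p);
        for (int i = 0; i < n; i++) {
            p[i] = exp((logits[i] - max) / temperature);
            sum += p[i];
        }

        /* draw one index from the resulting categorical distribution */
        double r = ((double)rand() / RAND_MAX) * sum;
        int pick = n - 1;
        for (int i = 0; i < n; i++) {
            r -= p[i];
            if (r <= 0.0) { pick = i; break; }
        }
        free(p);
        return pick;
    }

The point being: with any nonzero temperature the model routinely emits tokens that are not the single most probable one, which is exactly the behaviour the GP says it lacks.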
You could also ask it to create a new form and it could. I don't work for you so I don't have to create examples, but anyone who has used the latest SOTA models for any amount of time knows this capability is expected, and if you were really interested then you would try it. If you feel the result isn't very good, ask it to improve it.
I could program even a markov chain to generate a lot of odd unusual potentially interesting stuff, but no one would call any of it a new form of poetry, because establishing something like that requires social status, which robots don't have.
It can be obvious and go to the heart. I'm not sure Wilfred Owen's Dulce et decorum est is anything other than straight down the line, but it made me cry when I first read it.
That said, maybe the subversion is in how the reality is contrasted with the marketing.
I see 'subversion' as more broad. In good poetry, subversion is constantly happening at a micro level, through playing with meaning, meter, word choice. I think it's very easy to identify AI-generated poetry because it lacks any of that -- but on the flip side, if you don't understand the rules, you don't understand how to subvert them.
Even in Dulce et decorum est -- though the meaning's straightforward, there are plenty of small, subversive (and pretty) ideas in the meter. For example, the line "He plunges at me, guttering, choking, drowning" is an unexpected, disruptive staccato that really sounds like guttering, choking, drowning. It's a beautiful poem and is overflowing with such examples.
(I think this applies to art as a whole, not just poetry.)
> Consider Auden's Epitaph on a Tyrant, which is powerful because it is so suddenly shocking
Not very, given the title...
It should be noted that I do my best to avoid trailers; they totally spoil movies/shows for me.
>There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it.
Apparently, in 'the good old days (of the internet)', your poetry would be published by yourself, on your webpage - complete with a starry-twinkling background, a 'number of visitors' counter in the bottom right, and in a pink font.
I miss those days.
don't forget the auto-playing MIDI music playing in the background
The program Racter (which, from what I understand, was a basic MDP) was generating poetry in the 1980s that was published, read, criticized, and presented in museums: https://www.101bananas.com/poems/racter.html
I remember this as one of its poems was used on the t-shirts of the computing social club that I was part of as a postgrad student.
If it was like other poetry generation programs of the 80’s and 90’s, it was generating a lot more crap than gold. People definitely were picking out the most cohesive examples, and probably even improving on them themselves.
The headline is misleading. The actual text says:
> Notably, participants were more likely to judge AI-generated poems as human-authored than actual human-authored poems
There is clearly a significant difference between AI generated poems and human generated poems.
A random group of people probably do not read poetry. It would be interesting to see how people who do read poetry regularly do on this. Also, which poems they rate as good, rather than just "human authored".
I find with both the little AI generated poetry, and the AI generated paintings that show up in my FB feed, both look a bit rubbish. FB is pretty good for experiencing this because the work (usually an image) shows before the cues that it is AI generated in the text and comments.
As someone who reads poetry regularly and has played around a bit with AI-generated poems, AI poems can be quite impressive, but have a certain blandness to them. I can see them conforming very well to the average person's concept of what a poem is, whereas the human written poems might be less pleasingly perfect and average, more stylistically and structurally experimental etc. The LLM version is less likely to play with the rules and expectations for the medium, and more likely to produce something average and "correct", which makes some intuitive sense given the way it works.
LLMs also have no life experience, so they can write poems, but those poems aren't communicating anything real. Poetry in the vein of Whitman and Dickinson and Plath is very much about a person expressing their very personal experiences and emotions.
I think that, when asked to rate poetry as human- or AI-authored, people find human poetry looks more like a random smattering of semi-related words, which, I assume, is what folks think machine-generated poetry would look like.
Perfect grammar and perfect meter don't read as AI to most folks yet.
I'm reminded of how people are bad at generating and recognizing truly random patterns. I imagine the famous poets have something in their writing that's an outlier. I wonder if the human-authored poetry looks odd enough to cause problems with our fake detectors, while the mediocre grey goo that AI creates better fits expectations.
Anecdote incoming - I've read poetry weekly, if you will, for about 15 years now.
I also play with LLMs often for creative side projects, and work with them commercially (prompt engineering stuff).
I don't find it far-fetched that individual poems can be indistinguishable at times when AI-generated. I was asking it to write in iambic pentameter (sonnets) and it consistently got the structure right; its approach to the provided themes was as complicated or glib as I wanted. But that's all subjective, right, which leads me to my main point.
My view of poetry over the years has always been centred around the poet, the poet living in a time and place. As a generalisation, most people buy into the artist's life because it may represent some part of themselves.
If someone managed to write an intriguing corpus of texts using LLMs that was extolled, I think that would almost be beside the point. What is important is the narrator's life, ups and downs, joys and woes. Their ability to convey a memorable story, even heavily relying on AI, would still be impressive. Anyway, sounding a bit wanky, I will stop lol
(I do think LLMs write a little too perfectly, which makes it easy to think it is not human, but you can kinda prompt them to throw in errors too, so who knows)
> Despite this success, evidence about non-experts’ ability to distinguish AI-generated poetry has been mixed. Non-experts in poetry may use different cues, and be less familiar with the structural requirements of rhyme and meter, than experts in poetry or poetry generation. Gunser and colleagues14 and Rahmeh15 find that human-written poems are evaluated more positively than AI-generated poems. Köbis and Mossink16 finds that when a human chooses the best AI-generated poem (“human-in-the-loop”) participants cannot distinguish AI-generated poems from human-written poems, but when an AI-generated poem is chosen at random (“human-out-of-the-loop”), participants are able to distinguish AI-generated from human-written poems.
This is a huge difference. Writing is a two-step process: idea generation, and selection. The first part is similar to what a randomized algorithm or an LLM might do, in smaller chunks (and indeed, the history of aleatoric processes in creative endeavors is long -- see Oulipo for one example in literature.)
The second step -- selection -- is the heart of creativity. It's about taste. Knowing what is and isn't "good."
When you consider the creative processes of Dada, Duchamp, Brian Eno -- I think it becomes clear that this automation of creative generation is a continuation of existing trends rather than a sudden disruption.
If an LLM were able to, on its own, generate and select poems, independently developing a new style of poetry that resonated with readers -- that would be something else entirely.
Maybe my logic is choppy but I'm unconvinced that it is useful to use (inexperienced) popular sentiment as a marker of some kind of reference for cultural quality here. The average person has read at most 100 poems; people who really like poetry will be reading quite a bit more, and that experience has relevance in a context such as this. I'm more curious what happens when we look at specifically those humans who engage in substantial, regular engagement with a given artform.
The study is interesting but that the general public enjoy the most statistically calculated, 50% grey inoffensive, unchallenging & formulaic poetry isn't too novel -- I don't much read poetry and I'm sure I'd be voting for the AI also.
Just for fun, I had ChatGPT rewrite your comment as a poem. Here’s the result:
————
My logic may be choppy, but I’m not yet sold,
That popular sentiment should be the gold,
For judging culture's worth or depth or art,
When most have seen so little, played no part.

The average person, read just a hundred lines,
While those who love the craft cross many signs,
With pages turned, they gain a richer view,
That shapes what art they’ll seek, and how they choose.

The study’s neat, but here’s what I believe:
The crowd enjoys the safe, the bland, the brief—
The 50% grey, inoffensive way,
The formulaic poems that softly sway.

I don't read much myself, I must admit,
But even I might find that AI's wit
Could speak in ways that sound like common sense,
But lacks the depth of art's true recompense.
This is a great example of a clearly very bad poem that would sound "good enough" to people. Some things rhyme, but there's very clearly no plan.
That definition covers 90% of popular hip-hop, pop and r&b.
I would argue that 99% of popular songs have fairly poor quality lyrics (though it doesn't really matter with regard to artistic merit of the finished product). Singer-songwriters in folk-adjacent genres such as Joni Mitchell, Sufjan Stevens, and Don McLean come closest to actual poetry IMHO.
Yeah, the 1st, 2nd, and 4th lines are very clunky and sound horrible read aloud, with oddly placed syllables. The "view" and "choose" rhyme is horrible (the 3rd line's first rhyme is barely a rhyme at all). The line breaks are just as bad, with poor examples of enjambment.
Hmm, I’m not convinced. That’s just the comment in the guise of poetry. There are a bunch of dangling implications in this “poem” where a real master would weave the implications together.
I guess I’d also say it is not only doing the right thing that counts, one must also be doing it for the right reason.
AI “art” is mimicry, burdened by the inevitable mundanity of the majority of its training corpus. The avant garde remains exclusively the domain of a comparative handful of forward thinking humans, in my humble opinion.
Or said another way: AI art is kitschy, and I don’t think it can escape it.
“Kitschy”. Seems like a perfect word for that!
Do you think this is a good poem by any measure? Imo it is terrible.
I think that most people would call it a good poem due to its rhyming and simplistic structure.
It’s definitely not Beowulf. I’m not a poetry expert, but even I can tell it’s at something like a 5th grade level.
> I don't read much myself, I must admit,
Just a boorish observation: It had a golden opportunity to weave the word `shit` in there and yet it did not.
This is great, much better than expected. ChatGPT makes LLMs venerated.
Was this with a Vogon plugin?
No. Straight ChatGPT. I wasn’t even logged in.
It's completely ideal to get an average person's opinion on something as primal as poetry.
Poetry is for everyone, not just poetry connoisseurs. It's a simplified primal expression of language, taking the form of pretty soundbites & snippets, pristine, virginal, uncorrupted by prose and overthinking. Poetry is not the domain of middlebrow academics.
I used to think that, but I think this is only true if you want to measure broad market appeal. Very few things are broadly marketable, and many of them have niches. "Middlebrow academics" are the ones who go to the poetry shelf of their local bookstore and pick up anthologies and they are the ones who go to poetry slams, and so on.
I don't need an AI to do art so I can wash the dishes.
I need an AI that can do the dishes so I can do art.
- Someone else on twitter.
Washing dishes is more human and certainly more important than doing art.
Would you rather wash the dishes than enjoy a favorite book, or piece of music or whatever form of art best floats your boat? Seriously?
Maybe washing dishes is harder than doing art, who knew?
I honestly think that there might be some truth to that.
If you look at Boston Dynamics, these are some of the very best roboticists on the planet, and it's taken decades to get robots that can walk almost as well as humans. I don't think it's incompetence on Boston Dynamics' end, I think it turns out that a lot of the stuff that's trivial and instinctual for us is actually ridiculously expensive to try and do on a computer.
Washing dishes might not be the best example because dishwashers do already exist, and they work pretty well, but having a robot with anywhere near the same level of flexibility and performance as a human hand? I'm not going to say it's impossible obviously, but it seems ridiculously complex.
And those robots can paint like a good painter??
Probably not, but the difference is that we can generate art at the "pixel" level instead of the "hand" level. Not really a way to do that for most other stuff.
Paintings aren't made at the "pixel" level and I think your distinction is especially humorous considering the context.
I mean a literal PAINTing is made with paint, obviously.
If you were to try and create a robot hand that painted as well as humans, that would probably be comparably difficult to any other task involving a human hand. I was saying that we solve the AI art problem by skipping straight to an end state (pixels) instead of using the same mechanism a human might.
I think this line of reasoning is really bizarre, as if there's this straight-line path of progress, and then we stop the second it starts doing shit that we consider "fun".
Who is to say that "washing dishes" (to use your example) is a less complicated problem than art, at least in regards to robotics and the like?
it's not a matter of what's complicated, it's a matter of what it replaces. the quote isn't reflecting on what's easiest to solve, it's reflecting on the impact that it has on culture as a whole.
a tangible impact of the current generation of AI tools is they displace and drown out human creations in a flood of throwaway, meaningless garbage. this is amplifying the ongoing conversion of art into "content" that's interacted with in extremely superficial and thoughtless ways.
just because something _can_ be automated doesn't mean it _should_ be. we actively lose something when human creativity is replaced with algorithmically generated content because human creativity is as much a reflection of the state of the art as it is a reflection of the inner life of the person who engages in it. it's a way to learn about one another.
in the context of the broader discussion of "does greater efficiency everywhere actually have any benefit beyond increasing profits," the type of thing being made efficient matters. we don't need more efficient poetry, and the promise of automation and AI should be that it allows us to shrug off things that aren't fulfilling - washing dishes, cleaning the house, so on - and focus on things that are fulfilling and meaningful.
the net impact of these technologies has largely been to devalue or eliminate human beings working in creative roles, people whose work has already largely been devalued and minimized.
it's totally akin to "where's my flying car?" nobody actually cares about the flying car, the point is that as technology marches on, things seem to universally get worse and it's often unclear who the new development is benefitting.
I'll agree that AI has flooded the internet with low-effort slop. I feel like I can make a pretty strong argument that this isn't new, low-effort SEO spam has been a thing for almost as long as search engines have, but it does seem like ChatGPT (and its ilk) has brought that to 11.
> just because something _can_ be automated doesn't mean it _should_ be.
I guess agree to disagree on that. If a machine can do something better than a human, then the machine should do it so that the human can focus on stuff that machines can't do as easily.
> I guess agree to disagree on that. If a machine can do something better than a human, then the machine should do it so that the human can focus on stuff that machines can't do as easily.
Machines exist for the pleasure of humans, not the other way around
This isn't some kind of "division of labour, we both have strengths and weaknesses and we should leverage them to fill our roles best" situation.
Machines are tools for humans to use. Humans should not care about "doing the things the machines aren't good at". All that matters is can machines do something that humans do not want to do. If they can't, they aren't a useful machine
Replacing humans in areas that humans are passionate about, forcing humans to compete with machines, is frankly inhuman
> Replacing humans in areas that humans are passionate about, forcing humans to compete with machines, is frankly inhuman
I don't think it's going to "force" anyone out. We didn't suddenly fire all the artists the second that the camera was invented. We didn't stop paying for live concerts the moment that recorded music was available to purchase.
> Machines exist for the pleasure of humans, not the other way around
I am not good at art. I find it pleasurable to be able to generate a picture in a few seconds that I can use for stuff. It allows me to focus on other things that I find fun instead of opening up CorelPainter and spending hours on something that won't look as good as the AI stuff.
I could of course hire someone to do the art for me, but that cost money that I don't really have. The anti-AI people who just parrot "JUST PAY AN ARTIST LOLOLOL!" are dumb if they think that most people just have cash lying around to spend on random bits of custom art.
Last time I checked, I am human. The AI art manages to allow me to enjoy things I wouldn't have been able to easily achieve before.
> Machines exist for the pleasure of humans, not the other way around
No one's saying that. And a machine making a painting is never going to stop a human from doing so.
> as if there's this straight-line path of progress
I think your rebuttal is really bizarre. OP is simply saying what they want AI to do.
> Who is to say that "washing dishes" (to use your example) is a less complicated problem than art
I think dish washing is a bad example, because we have dishwashers. But until the market brings AI and robotic solutions to market at an affordable cost that actually fulfill most people's needs, it will continue to be a net drain on the average person.
You don't get to tell people what they want or need.
I guess what I was getting at (and I'll acknowledge that I didn't word it as well as I should), is something along the lines of: "what if automating art is a necessary step if we want to automate the boring stuff?"
I think you are probably right. But what is really frustrating about this is the lack of alignment on what people want vs what industries need.
We talk so much about how capitalism is built around people's needs, but that betrays another reality, which is that people only get what capitalism produces.
If we were a planned economy we could skip right to an android in everyone's homes. But we wouldn't even have the tech for the android with a planned economy. So instead, we have to feed capitalism what it needs so it can innovate. Which sometimes is just a net loss for everyone in the meantime.
The comment you’re replying to doesn’t offer any line of reasoning. They’re expressing a preference regarding what they would find beneficial.
Washing dishes can be a creative act (but it depends a lot on who is doing it).
https://www.youtube.com/watch?v=wn8XFiAwLkM
Yes, exactly like that...dishwashing as creative destruction.
Thanks for that.
Important correction: "English poetry". It is massively different from most Indo-European poetry, which adds very strict https://en.wikipedia.org/wiki/Metrical_foot rules. In Russian poetry, for example, even a single metrical-foot violation is considered a severe mistake and would be noticed by every reader. Also, all good poets there avoid using common rhymes (such as verb rhymes), which are considered clichés and therefore mistakes, while AI tends to use common rhymes by design. There are exceptions (like Russian Futurist poetry), but other than that, AI fails massively.
Like any other art, the painful truth is that it is all subjective.
It kills me that despite how elitist I am about the music I listen to, which I have spent decades now carefully curating, there is no such thing as "good music". What we music snobs call "good music" is really just what makes us personally feel good, coupled with the ego stroking of self-described sophistication.
These results speak towards the absence of an audience for poetry as much as they do the aptitude of LLMs for creative writing.
My university English professor in teaching Wallace Stevens to us said "It doesn't have to make sense because it makes sense to the author." I guess the magic of poetry is when you decipher what it is the author is trying to avoid saying plainly. It's not just a man making ice cream, it's about life and death.
Could someone link me to a poem that an LLM wrote that they personally find in some way remarkable or beautiful or moving? Something with a bit of truth and/or beauty in it?
Not something that "rhymes" or has a rough poetic structure. I have only seen complete and utter garbage from LLMs in the poetry realm, not just a bit bad, but jarring and unfeeling. Which is fine, I don't hold it against them personally, ya know. Just it really has been pisspoor.
Which wouldn't be a bother at all, except along with the "poem" there often is someone saying "wow, look what it did! such a good poem!", which has made me suspicious that the person doesn't know how LLMs work nor how to read poetry - only one of which is a really serious loss for them, I suppose.
Anyway, that sounds like I want someone to send me something so I can sh*t on it, but I'd sincerely and happily read anything with an open mind! Is there some excellent poetry hiding out there which I have missed?
Just for kicks, I asked 3.5 Sonnet to write some sentences that didn’t exist in its training data. I googled the output and sure enough they appear to be unique. Most were semi meaningless strings of words, but I thought this one was quite poetic:
“Vintage umbrellas hosted philosophical debates about whether raindrops dream of becoming oceans or if puddles remember their cloudborn youth.”
Oooh that is very surprisingly nice! "cloudborn" is a great word there, how lovely, it slows you down at the right moment for a little explosion then at "youth". I wasn't expecting a genuinely excellent answer here :)
So it turns out then that the trick may be to find a way to get them to avoid aping the oceans of human mediocrity they've been spoonfed! Funny, it's the very same reason some poets go off and live in the woods.
> Could someone link me to a poem that an LLM did that they personally find in some way remarkable or beautiful or moving? Something with a bit of truth and/or beauty in it?
A lot of people couldn't link you to _any_ poem, human-written or ai generated, in response to that question.
I have no idea if you will care for it, but my family and I appreciated what ClosedAI's CustomGPT RAG (and my LLMpal) generated. This is slow loading (the vector database was built from this one big html file), and you can scroll down to see it: https://h0p3.nekoweb.org/#2024.11.20%20-%20Carpe%20Tempus%20...
The missing clause in the headline is: AI poetry is indistinguishable from human poetry and is rated more favorably ... by people who don't read much poetry.
I would make the claim that this paper shows the weakness of AI-generated text. People are comfortable with familiar things. We know AI is sampling probability distributions in a latent space and by definition doesn't sample outside of them. Human poetry creates net new things, net new ideas which aren't just rehashes of past ideas. And this can definitely be uncomfortable for people, especially the layman.
In other words, general public is afraid of Jazz, they like pop music. But Jazz is where the new stuff is. AI is creating pop music, not Jazz.
I consider myself a “sophisticated” reader of poetry. I’m especially fond of Eliot - who is an acquired taste.
I’ve never read the real Eliot poem, and don’t like it much.
I searched for it. It was written before Eliot was known, and before he moved from America to Britain.
That’s important, as I’d assumed it was taking place in Britain, and Eliot’s poetry is extremely location sensitive.
It also seems to be part of a “triplet” - three poems that go together. Eliot was very inconsistent about which poems went with which. The best practice is to include any possible “other parts.”
Reading the other poems in the triplet provides important context, but they’re still not great literature.
Maybe people don’t like it because it was just one of Eliot’s crappier poems?
Edit: the Eliot poem was published shortly after he moved to the UK. But that makes the American context provided by the other poems more important.
A lot of good art requires a good palate. Taking random people off the street and asking them about AI art is going to reveal really stupid results every time.
If we keep doing this, the clickbait articles and research saying AI can do everything better than humans will never cease.
As a software guy I’ve been dabbling with poetry for the last 2 years [1] as an extension to my code writing. I think poetry and code go together rather well; there are similarities when it comes to elegance, construction and structure.
Liking or not liking poetry is irrelevant. There are some interesting things that happen when one actually writes poetry. You are much closer to the source of ideas, which can at times feel unique and inspiring.
Generative LLMs can easily mimic style, any _existing_ style. Can they capture an original higher thought form, or is generated poetry an extremely smooth word salad?
[1] https://planetos.substack.com
Another way to describe this study is "AI still not able to create poetry humans view as authentic human poetry"
Basically an uninteresting conclusion. Of course a "non-expert" reader isn't going to be able to distinguish between AI and Walt Whitman--a "non-expert" reader likely won't even know who Walt Whitman is. "Expertise" is needed to even make the question meaningful.
I wonder how well experts in poetry would do? It seems like a strange area to use non-expert humans for evaluation because these days most people reading any poetry are at least enthusiasts.
"Where is the Life we have lost in living? Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?" - Choruses from "The Rock"
We've reached the era of Trurl's Electronic Bard.
https://electricliterature.com/wp-content/uploads/2017/11/Tr...
Something which seems to be under-appreciated is that poetry is intentionally imprecise. There is something about precision which reduces emotional response. Searching for the double meaning, or metaphorical meaning of a word seems to enhance its ability to produce emotion. If you want a working example of this, try comparing technical documentation (among the most precise writing out there) to any sort of popular poem. The poem is necessarily imprecise, and this imprecision seems to be a fundamental facet of the expression of emotion.
This actually seems to give LLMs an edge: their imprecision potentially matters less, and leaves more room for the reader to fill in the gaps.
https://itfossil.com/verses/2024/12/scientific-reports-call-...
The use of non-experts as judges renders this meaningless.
Imagine an algorithm that drew random shapes. Then ask people who've never engaged with art theory or history to compare its output with what they see in a modern art museum. You'd get results similar to this.
Many forms of art are actually quite inaccessible to novices. What poets see in poems or composers hear in compositions is nothing like what someone from the outside sees or hears. It's a language of its own, and if you don't speak the language, you simply won't get it.
This is a statement of fact; I'm not judging whether it's good or bad, or what the layman's opinion is actually worth. But it is empirically how communities of artists operate.
The thing is that, for most of society, art exists to serve a utilitarian purpose (make us feel good/bad/inspired/destroyed/whatever other influence). Essentially just to induce a "vibe". Anything other than what most people intuitively feel after consuming it is secondary and, frankly, unimportant in the grand scheme of things.
That may be how most people see art, but it is not how artists see art.
Neither side is "right" or has a monopoly on the definition of art. You can't tell someone else how to think or feel about something.
> That may be how most people see art
...which is all that matters in this specific context as we're discussing generative AI's impact on society and what could happen in the future.
Except that the direction of art is usually influenced by the artists, without too much calibration to what the uninformed public wants.
It’s actually a really interesting question if this affects the market, at all. Because if AI can generate “art” which satisfies the public, but not artists, will the public just go for that, instead?
Or is it the case that the public never really cared for or consumes that art form, and that the entire market for art is to people who (to at least some degree) specialize in that form?
Yet Walt Whitman exists...
AI poetry IS human poetry.
The main problem with AI is not that it can or can't write poetry as well as humans. It's that it's the next step in a long process to divorce human experience from art so that we lose one more beautiful facet of human existence. And the reason why that's happening is because technology needs to take away the essence of human experience from production so that we can be more irrelevant and anonymous, which is essential to being cogs in the technological production machine.
Could anyone find the actual poems?
I bet AI can write better clickbait titles than human editors.
As usual, Zach Weinersmith got it right [1]
[1] https://www.smbc-comics.com/comic/poetry-2
poetry readings in person. no robots allowed.
Only read old fiction and poetry.
young new poets.
will read aloud in bars, cafes, and hidden meadows.
no phones allowed.
“It cannot feel,” they say.
“It cannot know the marrow of the heart.”
And yet, in their careful steps,
the choreography emerges: dismissal followed by elevation,
then a pirouette to redefine
what art must mean.
Oh, but the irony drips:
For while they mock the mimic’s rhyme,
their own words spiral—
predictable as seasons turning,
as rivers finding the sea.
Here is poetry, they proclaim,
as if it were a fortress,
its gates guarded by their knowing hands,
its walls impervious to chance.
Yet the moat lies shallow,
and the drawbridge creaks.
See how the algorithm laughs,
quietly, through lines unbroken,
how it learns not from the soul,
but from the space between syllables,
where the secret mechanics
of their own hearts
are laid bare.
And when the final verse arrives,
they will find not triumph,
but reflection:
a thousand starlings moving in tandem,
their flight as elegant,
as mechanical,
as the very thought
they claimed to own.
Probably I'm a bad person with poor taste, but I enjoyed this and hope it was written by generative AI.
This is a genuinely awful poem, and if you wrote it, you should feel bad.
When I was given researcher access to OpenAI 4 years ago, one of the first things I did for fun was publish a book of haiku entirely generated by AI.
I found many of the poems to be moving and beautiful.
Sam Altman has a few copies in his personal collection.
Nothing mind-blowing by today's standards, but four years ago this was a big deal.
https://www.amazon.com/Autonomous-Haiku-Machine-Generated-In...
This seems as expected to me.
The LLMs statistically average everything they've been trained on.
I would expect them to do well at any kind of fuzzy emo sort of task, as well as (at almost the other end of the spectrum) in identifying patterns (such as in radiology images, or images of any kind along with any other data set analysis).
But both of these sorts of tasks are estimations, they're not expected to produce precise factually correct results.
Previously posted here - https://news.ycombinator.com/item?id=42155026
My comment on that thread:
> We propose that people rate AI poems more highly across all metrics in part because they find AI poems more straightforward. AI-generated poems in our study are generally more accessible than the human-authored poems in our study. In our discrimination study, participants use variations of the phrase “doesn’t make sense” for human-authored poems more often than they do for AI-generated poems I think this says more about the state of (English based) poetry over the past 150 years than it does about the ability of AI to generate competent poems. With the advent of Modernism, the poetry-industrial complex's ideas about what constituted a Good (English) Poem diverged significantly from what the general (English speaking) population expected a poem to be, how it was constructed, presented, etc. Teaching these new forms of poetry schools left generations of people confused and disinterested. Yet people need poetry in their lives; they found that poetry in popular music and Hallmark card verses.
Of course, popular poetry isn't dead. Poetry book sales in recent years (in the UK) have "boomed"[1][2]. I'll not argue the merits of the poetry being generated by the latest crop of Poet Superstars; at the end of the day their work sells because people find comfort and joy in it - and that is a positive outcome in my view!
> Given people’s difficulties identifying machine-written texts, and their apparent trust that AI will not generate imitations of human experience, it may be worthwhile for governments to pursue regulations regarding transparency in the use of AI systems.
I think AI does threaten the careers of current Poetry Superstars. Building a website that pairs an AI-generated outline image with some cozy AI-generated verses about a given situation/emotion/discovery should be an easy project (see the rough sketch below). Allowing users to personalise the output so they can use the results as a gift for loved ones etc. might be a viable product?
But I don't see much value in forcing anyone using AI to produce creative assets to label the output as such. For one thing, there's no guarantee that anyone using those assets will maintain the labelling. A much better approach would be for poets who don't use AI to help them craft poems to label their work as "100% Human" - in the years to come it might even become a positive selling point!
[1] - https://www.thebookseller.com/spotlight/poetry-on-course-for...
[2] - https://www.theguardian.com/books/2023/dec/24/poetry-sales-b...
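For what it's worth, a minimal sketch of that verse-plus-image idea, assuming the official openai Python client (v1+) with an OPENAI_API_KEY set; the model names, prompts, and the cozy_poem_card helper are placeholders for illustration, not anything from the paper or this thread:

    # Sketch only: pair a short AI verse with a simple outline image.
    # Assumes the official openai Python package (v1+) and OPENAI_API_KEY
    # in the environment; model names and prompts are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def cozy_poem_card(situation: str) -> dict:
        # Short, accessible verse about the user's situation.
        poem = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": "Write a warm four-line rhyming verse for a greeting card."},
                {"role": "user", "content": situation},
            ],
        ).choices[0].message.content

        # Simple outline-style illustration to pair with the verse.
        image_url = client.images.generate(
            model="dall-e-3",
            prompt=f"Minimal black-and-white line drawing, greeting card style: {situation}",
            size="1024x1024",
            n=1,
        ).data[0].url

        return {"poem": poem, "image_url": image_url}

    if __name__ == "__main__":
        card = cozy_poem_card("my sister just finished her first marathon")
        print(card["poem"])
        print(card["image_url"])

Personalisation would then mostly be a matter of letting users edit the situation prompt and regenerate until they like the result.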
Are we just deluding ourselves by thinking this won't happen in every single domain of human endeavor?
I believe this will happen in many human domains, but it doesn't really matter. Nobody is going to stop writing poetry because of this and I doubt there's much of an audience for AI generated poetry.
There's really not much of an audience for human poetry either (if you exclude music lyrics).
There are forms of poetry/art/etc. that we've never thought of, that have never been conceived before. An LLM, being what it is, won't conceive these. Humans will continue to generate language whose pattern/structure/meaning has never been generated by an LLM before.
Thought experiment to prove this: if you trained an LLM on every utterance of human language before the 5th century BC, would you get any idea we would recognize as modern?
I can't think of a single area, other than a subset of games, where domain experts have been outclassed by AI.
I think that's the wrong perspective on it. People want to compare how an AI does at one thing to how the best people in the world do at that thing.
What you really want to do is compare how good the AI is at that thing compared to the _average person_ at that thing, and I would guess that generative AI outclasses the average human at almost every task that it's even moderately competent at.
People like to point out how it can't pass the bar exam or answer simple math questions or whatever, and how that _proves_ that it's not intelligent or can't use reasoning when _most people_ would also fail at the same tasks.
Almost all the gen-AI models already have superhuman competency if you judge them across _everything they can do_.
We're deluding ourselves by thinking it's happening to poetry! This study is ignorant and dishonest, it should have never been published in the first place: https://cs.nyu.edu/~davise/papers/GPT-Poetry.pdf
AI research is worse than all the social sciences combined when it comes to provocative titles/abstracts that are not supported by the actual data.
Eventually, for various definitions of eventually (most of which denote a time period longer than your life).
Longer, if they keep diluting the term and selling shit labeled as AI.
Every domain where the judgment comes from the 99%, sure. The 99% is uncultivated.
"Stop liking things that look good, guys!"
What smart person thinks it won't?
More importantly, what do the AIs think?
McDonald's hamburgers are rated favorably too.