{"id":433,"date":"2024-02-23T13:00:00","date_gmt":"2024-02-23T14:00:00","guid":{"rendered":"https:\/\/reshebniki-online.com\/?p=433"},"modified":"2024-02-29T15:39:47","modified_gmt":"2024-02-29T15:39:47","slug":"what-two-years-of-ai-development-can-tell-us-about-sora","status":"publish","type":"post","link":"https:\/\/reshebniki-online.com\/index.php\/2024\/02\/23\/what-two-years-of-ai-development-can-tell-us-about-sora\/","title":{"rendered":"What two years of AI development can tell us about Sora"},"content":{"rendered":"<br \/>\n<figure>\n      <img alt=\"An AI-generated video from Sora, OpenAI\u2019s new generative video model, shows sea creatures like fish and dolphins with legs, riding bicycles on top of an ocean.\" src=\"data:image\/gif;base64,R0lGODlhAQABAAAAACH5BAEKAAEALAAAAAABAAEAAAICTAEAOw==\" class=\"lazyload\" data-src=\"https:\/\/reshebniki-online.com\/wp-content\/uploads\/2024\/02\/Screen_Shot_2024_02_22_at_2.15.33_PM.0.jpg\"><figcaption>A screenshot of a video generated by Sora, OpenAI\u2019s generative video model. | Sora\/<a class=\"ql-link\" href=\"https:\/\/twitter.com\/sama\/status\/1758220311735181384\" target=\"_blank\" rel=\"noopener\">OpenAI CEO Sam Altman<\/a><\/figcaption><\/figure>\n<p>If you want to know the future of OpenAI\u2019s latest tool, take a look at Midjourney and DALL-E 2.<\/p>\n<p id=\"G2AYPV\">Remember when <a href=\"https:\/\/www.vox.com\/2023\/4\/28\/23702644\/artificial-intelligence-machine-learning-technology\" data-source=\"encore\">AI<\/a> art generators became widely available in 2022 and suddenly the internet was full of uncanny pictures that were very cool but didn\u2019t look quite right on close inspection? Get ready for that to happen again \u2014 but this time for video.<\/p>\n<p id=\"sunOPd\">Last week, OpenAI released Sora, a generative AI model that produces videos based on a simple prompt. It\u2019s not available to the public yet, but CEO Sam Altman showed off its capabilities by taking requests on X, formerly known as <a href=\"https:\/\/www.vox.com\/twitter\" data-source=\"encore\">Twitter<\/a>. Users replied with short prompts: \u201c<a href=\"https:\/\/twitter.com\/sama\/status\/1758249750909096142\">a monkey playing chess in a park<\/a>,\u201d or \u201c<a href=\"https:\/\/twitter.com\/sama\/status\/1758220311735181384\">a bicycle race on ocean with different animals as athletes<\/a>.\u201d It\u2019s uncanny, mesmerizing, weird, beautiful \u2014 and prompting the usual cycle of commentary.<\/p>\n<p id=\"uy1z1E\">Some people are making strong claims about Sora\u2019s <a href=\"https:\/\/www.cbsnews.com\/news\/openai-sora-text-to-video-tool\/\">negative effects<\/a>, expecting a \u201c<a href=\"https:\/\/twitter.com\/Pinsky\/status\/1758612226796048576\">wave of disinformation<\/a>\u201d \u2014 but while I (and experts) think <a href=\"https:\/\/www.vox.com\/the-highlight\/23447596\/artificial-intelligence-agi-openai-gpt3-existential-risk-human-extinction\">future powerful AI systems pose really serious risks<\/a>, claims that a specific model will bring the disinformation wave upon us have not held up so far. <\/p>\n<p id=\"WZN4r7\">Others are pointing at Sora\u2019s many flaws as representing <a href=\"https:\/\/garymarcus.substack.com\/p\/soras-surreal-physics\">fundamental limitations<\/a> of the technology \u2014 which was a mistake when people did it with image generator models and which, I suspect, will be a mistake again. As my colleague A.W. Ohlheiser <a href=\"https:\/\/www.vox.com\/technology\/24079459\/sora-openai-video-tool-world-simulator\">pointed out<\/a>, \u201cjust as DALL-E and ChatGPT improved over time, so could Sora.\u201d<\/p>\n<p>The predictions, both bullish and bearish, may yet pan out \u2014 but the conversation around Sora and generative AI would be more productive if people on all sides took into greater account all the ways in which we\u2019ve been proven wrong these last couple of years.<\/p>\n<h3 id=\"80VQUT\">What DALL-E 2 and Midjourney can teach us about Sora<\/h3>\n<p id=\"GK9GWh\">Two years ago, OpenAI announced <a href=\"https:\/\/openai.com\/dall-e-2\">DALL-E 2<\/a>, a model that could produce still images from a text prompt. The high-resolution fantastical images it produced were quickly all over social media, as were the <a href=\"https:\/\/spectrum.ieee.org\/openai-dall-e-2\">takes<\/a> on what to think of it: Real art? Fake art? A threat to artists? A tool for artists? A disinformation machine? Two years later, it\u2019s worth a bit of a retrospective if we want our takes on Sora to age better. <\/p>\n<p id=\"lqRXdU\">DALL-E 2\u2019s release was only a few months ahead of <a href=\"https:\/\/www.midjourney.com\/home\">Midjourney<\/a> and <a href=\"https:\/\/stability.ai\/news\/stable-diffusion-public-release\">Stable Diffusion<\/a>, two popular competitors. They each had their strengths and weaknesses. DALL-E 2 did more photorealistic pictures and adhered a little better to prompts; Midjourney was \u201cartsier.\u201d Collectively, they made AI art available at the click of a button to millions. <\/p>\n<p id=\"Wj00Da\">Much of the societal impact of generative AI then didn\u2019t come directly from DALL-E 2, but from the wave of image models it led. Likewise, we might expect that the important question about Sora isn\u2019t just what Sora can do, but what its imitators and competitors will be able to do.<\/p>\n<p id=\"VHRjsO\">Many people thought that DALL-E and its competitors heralded a <a href=\"https:\/\/www.rand.org\/pubs\/commentary\/2022\/11\/america-may-not-be-ready-for-the-looming-tsunami.html\">flood of deepfake propaganda and scams<\/a> that\u2019d threaten our democracy. While we may well see an effect like that some day, those calls now seem to have been premature. The effect of deepfakes on our democracy \u201calways seems just around the corner,\u201d analyst Peter Carlyon <a href=\"https:\/\/www.rand.org\/pubs\/commentary\/2023\/12\/deepfakes-arent-the-disinformation-threat-theyre-made.html\">wrote in December<\/a>, noting that most propaganda continues to be of a more boring kind \u2014 for example, taking remarks out of context, or images of one conflict shared and mislabeled as being from another. <\/p>\n<p id=\"aEfZLT\">Presumably at some point this will change, but there should be some humility about claims that Sora will be that change. It doesn\u2019t take deepfakes to lie to people, and they remain an expensive way to do it.<strong> <\/strong>(AI generations are relatively cheap, but if you\u2019re going for something specific and convincing, that\u2019s much pricier. A tsunami of deepfakes implies a scale that spammers mostly can\u2019t afford at the moment.)<\/p>\n<p id=\"mYtICo\">But the place where it seems most crucial to me to remember the last two years of AI history is when I read criticisms of Sora\u2019s images for being clumsy, stilted, inhuman, or obviously flawed. It\u2019s true, they are. Sora \u201cdoes not accurately model the physics of many basic interactions,\u201d OpenAI\u2019s research release acknowledges, <a href=\"https:\/\/openai.com\/sora\">adding<\/a> that it has trouble with cause and effect, mixing up left and right, and following a trajectory. <\/p>\n<aside id=\"IzLqIh\">\n<div data-anthem-component=\"readmore\" data-anthem-component-data=\"{&quot;stories&quot;:[{&quot;title&quot;:&quot;How fake AI images can expand your mind &quot;,&quot;url&quot;:&quot;https:\/\/www.vox.com\/future-perfect\/23661673\/pope-puffer-coat-generative-ai-midjourney-imagination&quot;}]}\"><\/div>\n<\/aside>\n<p id=\"XcyvKI\">Nearly identical criticisms were, of course, made of DALL-E 2 and Midjourney \u2014 at least at first. Early coverage of DALL-E 2 highlighted its incompetencies, from creating horrifying monstrosities whenever you asked for multiple characters in a scene to giving people <a href=\"https:\/\/spectrum.ieee.org\/openai-dall-e-2\">claws instead of hands<\/a>. AI experts argued that the inability of AI to handle \u201ccompositionality\u201d \u2014 or instructions about how to compose the elements of a scene \u2014 <a href=\"https:\/\/twitter.com\/GaryMarcus\/status\/1512650166326628355\">reflected a shortcoming fundamental to the technology<\/a>. <\/p>\n<p id=\"k8lXWN\">In practice, though, models got better at fulfilling highly specific prompts and users got better at prompting, and as a result it\u2019s possible today to create images with complex and detailed scenes. Nearly all of the entertaining deficiencies were corrected in DALL-E 3, released last year, and in the latest updates to Midjourney. Today\u2019s image generators can do hands and crowd scenes fine.<\/p>\n<p id=\"M10YjR\">In the time between DALL-E 2 and Sora, AI image generation has gone from a party trick to a massive industry. Many of the things DALL-E 2 couldn\u2019t do, DALL-E 3 could. And if DALL-E 3 couldn\u2019t, a competitor often could. That\u2019s a perspective that\u2019s crucial to keep in mind when you read prognosticating on Sora \u2014 you\u2019re likely looking at early steps into a major new capability, one that could be used for good or malicious purposes,<strong> <\/strong>and while it\u2019s possible to oversell it, it\u2019s also very easy to sell it short. <\/p>\n<p id=\"hISPg6\">Instead of overcommitting to any particular perspective on what Sora and its successors will or won\u2019t be able to do, it\u2019s worth admitting some uncertainty about where this is headed. It\u2019s much easier to say, \u201cThis technology will keep improving by leaps and bounds\u201d than to guess the specifics of how that will play out. <\/p>\n<p id=\"6avHX2\"><em>A version of this story originally appeared in the <\/em><a href=\"https:\/\/www.vox.com\/future-perfect\"><em><strong>Future Perfect<\/strong><\/em><\/a><em> newsletter. <\/em><a href=\"https:\/\/www.vox.com\/pages\/future-perfect-newsletter-signup\"><em><strong>Sign up here!<\/strong><\/em><\/a><\/p>\n<p id=\"hByBUE\">\n","protected":false},"excerpt":{"rendered":"<p>A screenshot of a video generated by Sora, OpenAI\u2019s generative video model. | Sora\/OpenAI CEO Sam Altman If you want to know the future of OpenAI\u2019s latest tool, take a look at Midjourney and DALL-E 2. Remember when AI art generators became widely available in 2022 and suddenly the internet was full of uncanny pictures [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":435,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[11],"tags":[],"_links":{"self":[{"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/posts\/433"}],"collection":[{"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/comments?post=433"}],"version-history":[{"count":2,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/posts\/433\/revisions"}],"predecessor-version":[{"id":436,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/posts\/433\/revisions\/436"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/media\/435"}],"wp:attachment":[{"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/media?parent=433"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/categories?post=433"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/reshebniki-online.com\/index.php\/wp-json\/wp\/v2\/tags?post=433"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}