[00:00:00] Speaker 04: May it please the court, Jesse Pannuccio, for appellants. [00:00:02] Speaker 04: I would like to reserve four minutes for rebuttal if possible. [00:00:05] Speaker 04: Just try and watch the clock. [00:00:07] Speaker 04: Yes, Your Honor. [00:00:09] Speaker 04: This interlocutory appeal presents the certified question of whether the DMCA imposes an identity requirement despite the statutory text having no such language. [00:00:21] Speaker 04: Defendants argued repeatedly below for a strict identity requirement. [00:00:26] Speaker 04: The district court agreed, but appellees now concede that was error. [00:00:31] Speaker 04: That was a necessary concession because the text, context, and purpose of the statute all foreclose the imposition of a strict identicality requirement. [00:00:41] Speaker 04: Under the plain meaning of the statute, plaintiffs, in this case, state a claim for violations under B1 and B3. [00:00:50] Speaker 04: Given this concession, plaintiffs tried to shift the focus of the appeal to standing. [00:00:55] Speaker 04: But the district court correctly held that plaintiffs have standing in this case. [00:00:59] Speaker 04: Indeed, plaintiffs face classic injury to property rights and rights recognized by the Constitution, and that is all that is required by TransUnion. [00:01:09] Speaker 04: That injury is currently existing and ongoing, not speculative. [00:01:13] Speaker 04: The court should therefore reverse the order dismissing plaintiffs' DMCA claims. [00:01:18] Speaker 03: So on the standing point, even if we suppose for purposes of [00:01:23] Speaker 03: the standing analysis that we accept your understanding of what the statute prohibits. [00:01:29] Speaker 03: You would still need to show that there was a non-speculative, realistic possibility that your content was going to be reproduced without the appropriate attributions.
[00:01:46] Speaker 03: So where in the complaint have you alleged that? Sort of take me through what the logic is of why you think that that's a real possibility here. [00:01:54] Speaker 04: Well, let me say two things about that, Your Honor. I will take you through the allegations, but I do want to just address the setup of the question, which is, let's focus in on the B1 claim, for example. The injury-in-fact standing is different from the merits, so you don't have to prove [00:02:09] Speaker 04: all of the, that you will prevail on the merits. [00:02:12] Speaker 04: You just have to, under TransUnion, have an injury in fact that was recognized in the founding era as something courts could litigate. [00:02:18] Speaker 04: And in these DMCA cases, the injury in fact is the stripping of the CMI. [00:02:25] Speaker 04: It is the pulling of the CMI off of the work that is the injury in fact. [00:02:29] Speaker 03: That is what just- You mean when they trained the model or when the model outputs something? [00:02:35] Speaker 04: Well, wherever they're stripping the CMI, we think they're probably doing it before training and there are allegations to that effect in the complaint. [00:02:42] Speaker 04: But if you take a work and you strip CMI from it, just generally speaking, that is injury in fact. [00:02:48] Speaker 04: That is classic injury to property rights. [00:02:50] Speaker 04: And moreover, it's the type of injury, because it's an injury to copyright, it's the type of injury [00:02:55] Speaker 04: that was recognized in the Constitution, and TransUnion says that is enough for standing. [00:02:59] Speaker 03: Right, but I guess, fair enough, but it has to actually be happening. [00:03:04] Speaker 03: Right? [00:03:05] Speaker 03: And so if we set aside for a moment the training, because I'm not sure you preserved a claim based on that.
[00:03:15] Speaker 03: And if we're looking at what are the outputs of the model, what are the allegations that establish that the outputs of the model will involve stripping CMI? [00:03:26] Speaker 04: I'll take the second part first, but I do want to get to the waiver issue that I think you're referring to, because I do think that's wrong. [00:03:32] Speaker 04: We have very clear allegations from the first complaint all the way to the second amended complaint about stripping at the training phase. [00:03:40] Speaker 04: And I'll walk the court through that. [00:03:41] Speaker 04: But as for distribution, we cite several things. [00:03:47] Speaker 04: We plead several things. [00:03:49] Speaker 04: First of all, with respect to several of the DOES, we go through and show how their code was output in the model. [00:03:59] Speaker 04: That's one. [00:04:00] Speaker 04: Two, we know the other side has an admission that 1% of the time they will regurgitate code. [00:04:06] Speaker 04: And third, they put on a feature. [00:04:10] Speaker 04: They put a feature onto the program saying, would you like to be notified if we are regurgitating code from the public GitHub repository? [00:04:18] Speaker 03: Can I ask you about the 1%? [00:04:23] Speaker 03: Maybe that's enough, maybe that's not. [00:04:25] Speaker 03: The analysis of whether that's enough seems to depend on knowing some things about how much code is on GitHub, how many people are querying Copilot and generating these outputs. [00:04:43] Speaker 03: I don't think any of those numbers are in the complaint. [00:04:48] Speaker 03: So do you think that what we have is enough, and if not, [00:04:52] Speaker 03: I mean, is there more that you could say about that if you were to amend? 
[00:04:58] Speaker 04: Well, I think if what we've put in our complaint is not enough, it's hard to see how the DMCA could ever have life with artificial intelligence because all of these, it is uniquely in the defendant's possession how users are, how many users are prompting every day, what are they prompting it with, and what are the outputs. [00:05:18] Speaker 04: So we were able to test [00:05:20] Speaker 04: for ourselves and see that our own code, our plaintiff's code, is being outputted by the model just on our own tests. [00:05:29] Speaker 04: That should be enough to get us through the door to take discovery, especially when all of the evidence is uniquely in their control. [00:05:36] Speaker 04: If it's not, it's hard to see how anyone would ever say, I know my code is being outputted, because no one knows what all the users of Copilot are prompting every day. [00:05:45] Speaker 04: That's either most likely in the hands of the defendants. [00:05:49] Speaker 04: Your Honor, I do want to get back to, and I think this is very important, I do want to address the idea that stripping at the input was waived or not alleged. [00:05:58] Speaker 04: But if you go back to the original complaint, paragraphs 143 and 145 through 146, we said the following, 143, defendants removed or altered CMI from open source code that is owned by plaintiff and the class after the code was uploaded to a GitHub repository [00:06:18] Speaker 04: by incorporating it into copilot with its CMI removed. [00:06:22] Speaker 04: Now that was 143 of the original complaint. [00:06:24] Speaker 04: It remains in the operative complaint today as paragraph 214. [00:06:30] Speaker 04: We then alleged at paragraphs 145 to 146 of the original complaint, defendants were not licensed by plaintiff to train any artificial intelligence using the licensed materials or to incorporate the licensed materials into copilot. 
[00:06:45] Speaker 04: Those allegations still remain in the complaint at paragraphs 216 and 217. [00:06:49] Speaker 03: But the district court seemed not to think that you had a claim based on training and it said that. [00:06:56] Speaker 03: And it said that at the hearing, it said that in I think footnote seven of the order on the motion to dismiss the first version of the complaint. [00:07:04] Speaker 03: And other than, I guess, repeating the same allegations that it didn't think had stated that claim, [00:07:11] Speaker 03: did you do anything to disabuse the district court of that understanding? [00:07:17] Speaker 04: I think we did. [00:07:18] Speaker 04: I'll take Your Honor through it. [00:07:19] Speaker 04: But I want to be clear. [00:07:19] Speaker 04: Footnote seven [00:07:21] Speaker 04: is in the portion of the opinion about standing. [00:07:24] Speaker 04: But if you move forward to the portion of the opinion that gets into the 12(b)(6), here is how that, after we had that first complaint, here is what we said in the opposition to the first motion to dismiss. [00:07:36] Speaker 04: This was ECF number 67 at page 16. [00:07:40] Speaker 04: We said, plaintiffs allege that defendants intentionally removed or altered CMI from the licensed materials after they were uploaded to one or more GitHub repositories. [00:07:49] Speaker 04: And then at the hearing on the first [00:07:51] Speaker 04: motions to dismiss, [00:07:54] Speaker 04: we said it was the use of plaintiff's code in derogation of the licenses, which is at the core of our claim. [00:08:00] Speaker 04: And the court went on and asked this question. [00:08:03] Speaker 04: Is the act of copying itself into the neural network by itself violative of any of the GitHub licenses? [00:08:09] Speaker 04: And we answered at SER-91, I think the answer is yes. [00:08:15] Speaker 04: So that was at the hearing.
[00:08:16] Speaker 04: And then if you look at the first [00:08:18] Speaker 04: order on the first motions to dismiss. [00:08:20] Speaker 04: This is SER 54-55. [00:08:22] Speaker 04: The court concluded, plaintiffs alleged that their licensed code contains CMI. [00:08:29] Speaker 04: Plaintiffs alleged that defendants removed or altered that CMI from licensed code and distributed copies of the code knowing that the CMI had been removed or altered. [00:08:38] Speaker 04: At page 55, SER 55, plaintiffs alleged that relevant CMI was affixed to their licensed code [00:08:44] Speaker 04: and defendants subsequently trained these programs to ignore or remove the CMI. [00:08:48] Speaker 04: And then the court said, that is enough to state a claim. [00:08:51] Speaker 04: So we think that fairly said that we're stating a claim on input, stripping at the input phase. [00:08:58] Speaker 04: And then at every subsequent phase of the MTDs, of the motion to dismiss practice, we reminded the court of its original holding. [00:09:08] Speaker 04: Now, at that point, defendants started shifting [00:09:11] Speaker 04: and saying we want to talk about this identicality requirement. [00:09:13] Speaker 04: And a lot of the briefing started focusing on that. [00:09:15] Speaker 04: But that original holding of the court that we stated the claim on an input theory was there. [00:09:20] Speaker 04: And in all of our subsequent oppositions, we reminded the court of that holding. [00:09:24] Speaker 04: So I do not think it's fair to say we've waived this in any way, Your Honor. [00:09:27] Speaker 02: But counsel, so would you agree though that with respect to the waiver issue, what it comes down to is our interpretation as to whether you fairly raised that question at the hearing itself when Judge Tigar specifically asked questions as to your theory. [00:09:45] Speaker 02: I understand your point that you did not disclaim reliance on that theory.
[00:09:51] Speaker 02: But if our interpretation of the record is different, [00:09:55] Speaker 02: then you would have waived it, notwithstanding the fact that you raised allegations in the pleading. [00:10:00] Speaker 02: Am I correct? [00:10:02] Speaker 04: I'd say two things in response to that, Your Honor. [00:10:04] Speaker 04: One, I tried to read here that exchange, the colloquy between Judge Tigar and us, where he asked directly the question about input. [00:10:12] Speaker 04: And we said, I think the answer is yes. [00:10:15] Speaker 04: And then what happened subsequently? [00:10:18] Speaker 04: Two, I don't think it's just that we continued to plead it in the complaint. [00:10:22] Speaker 04: It continued to be in our oppositions, our actual written oppositions, to the motions to dismiss, to preserve that theory. [00:10:29] Speaker 04: And just to round that out, we said in our oppositions, plaintiffs allege that defendants coded their programs to ignore [00:10:38] Speaker 04: or remove CMI in order to stop reproducing CMI as output, these allegations are still sufficient. [00:10:45] Speaker 04: And we were pointing back to the court's first order. [00:10:48] Speaker 04: These allegations are still sufficient to plead a DMCA claim under sections 1202(b)(1), which is not the distribution claim, and 1202(b)(3). [00:10:59] Speaker 04: So I think that question was kept alive by us, Your Honor, throughout the proceedings. [00:11:06] Speaker 04: Everyone started to talk more about identicality and that's the question up here. [00:11:09] Speaker 04: I don't think it's fair to say we waived anything. [00:11:13] Speaker 03: We'll make sure you get enough time for rebuttal, but I do want to give you a chance to say a bit about the merits. [00:11:19] Speaker 03: So, and maybe you could start by addressing, and I think I understand your friends on the other side to have conceded that, you know, identicality doesn't really mean strict identicality.
[00:11:31] Speaker 03: So that's one end of the spectrum. [00:11:34] Speaker 03: And at the other end, I understood from your brief that you're acknowledging that a derivative work that might infringe, even if it, for example, quotes part of the original work, that not including attribution in that new work is not necessarily a DMCA violation. [00:11:53] Speaker 03: So where in between those two poles do you think the statute draws the line? [00:11:59] Speaker 04: Well, I think it's important to, one, distinguish between B1 claims and B3 claims so we understand what we're talking about. [00:12:06] Speaker 04: I mean, usually you're going to compare copies if you're dealing with a B3 claim. [00:12:11] Speaker 04: But you don't even really get into comparing copies unless you're in a circumstantial evidence case. [00:12:16] Speaker 04: If you have direct evidence of CMI stripping, then most of these cases that, let me step back, most of the cases that discuss comparing the original work and a copy [00:12:27] Speaker 04: and trying to figure out if CMI was stripped are doing that because that's the only evidence that's there. [00:12:31] Speaker 04: There's no direct evidence that the defendant actually went in and stripped the CMI. [00:12:36] Speaker 04: And so then courts will start to look at it and say, can we infer from this circumstantial evidence that there was CMI stripped? [00:12:42] Speaker 04: But where you have direct evidence of CMI stripping, that's enough under the statute. [00:12:46] Speaker 04: And that's what we allege in this case. [00:12:47] Speaker 04: I mean, they own the repository. [00:12:49] Speaker 04: They downloaded everything from the repository. [00:12:51] Speaker 04: They stripped the CMI out. [00:12:53] Speaker 04: And they train on it. [00:12:54] Speaker 04: And then they have outputs [00:12:55] Speaker 04: of plaintiff's code.
[00:12:57] Speaker 04: So I think it would be important for the court to recognize in its opinion here the distinction between direct evidence and circumstantial evidence. [00:13:05] Speaker 03: And suppose, can you answer the same question without referring to what happened at the training stage? [00:13:14] Speaker 03: Yes. [00:13:15] Speaker 04: So to get back to Your Honor's original question, so if you are comparing copies, [00:13:21] Speaker 04: I think this is very familiar to copyright law. [00:13:24] Speaker 04: It's been going on for 250 years on the right of reproduction, where if somebody copies a work, the question has always been, is it substantial enough copying? [00:13:33] Speaker 04: Does it get to the heart of the expression to be an infringing copy? [00:13:39] Speaker 04: And I think you could have much the same type of inquiry here. [00:13:42] Speaker 04: I don't think we need to reinvent [00:13:44] Speaker 04: This is a familiar inquiry to copyright. [00:13:47] Speaker 03: But doesn't that just collapse? [00:13:50] Speaker 03: Is it your position then that anything that infringes the copyright is a DMCA violation if it doesn't have the attribution information in it? [00:14:02] Speaker 04: Two responses to that. [00:14:03] Speaker 04: The first line answer is no, not everything that is an infringement of another right under the bundle of sticks of copyright will necessarily also infringe the DMCA right. [00:14:14] Speaker 04: And that is because, for example, take the George Harrison case. [00:14:18] Speaker 04: You may remember the song My Sweet Lord. [00:14:20] Speaker 04: He was found to have subconsciously copied. [00:14:24] Speaker 04: If you have subconscious copying, I don't think you would meet the scienter element that is uniquely in the DMCA. [00:14:30] Speaker 04: So not all infringements of the reproduction right would necessarily be a DMCA violation.
[00:14:36] Speaker 04: But at the same token, at the same time, [00:14:40] Speaker 04: It is often true that a copyright infringer is violating multiple, breaking multiple sticks in the bundle of copyrights. [00:14:47] Speaker 04: And Congress has adjusted that for 250 years about what the copyrights, you know, what bundle of sticks you have in copyright. [00:14:55] Speaker 04: But there's nothing earth-shaking about the idea that someone who is infringing, who is breaking the law of copyright, would break it multiple times over. [00:15:03] Speaker 04: And some of these questions are going to be fact bound. [00:15:06] Speaker 04: Let me just give one example from OpenAI's brief. [00:15:11] Speaker 04: Well, let me step back. [00:15:13] Speaker 04: Think, for example, of a book where you actively [00:15:21] Speaker 04: rip the first page off the book, and then you copy that in a photocopier and you begin to sell copies with no CMI. [00:15:27] Speaker 04: I think everybody would say that is clearly a DMCA violation, a B1 violation. [00:15:32] Speaker 04: You've actively removed the CMI and now you're distributing it. [00:15:38] Speaker 04: Now imagine someone had that same book and instead of ripping the cover page off, they just went to the photocopier and they flipped past the first page and they made all of the photocopies. [00:15:47] Speaker 04: They say, I think as I understand their theory, that that [00:15:51] Speaker 04: wouldn't be a violation of the DMCA because it's removing the CMI by omission. [00:15:58] Speaker 04: I don't think that's right. [00:15:59] Speaker 04: I think you'll get into fact-bound questions about what the intent was and how the CMI came to disappear from the copy. [00:16:05] Speaker 02: Counsel, let's say what you have is a book reviewer who's critiquing a book and takes out a very lengthy paragraph that fills a page. [00:16:19] Speaker 02: without CMI, without attribution. [00:16:22] Speaker 04: Is that a violation? 
[00:16:26] Speaker 04: So I think it would depend on whether that's a violation of the right of reproduction. [00:16:30] Speaker 04: Recall, Your Honor, that the clause at the end of B, it's the double scienter requirement. [00:16:37] Speaker 04: So you have to strip the CMI, intentionally, and then knowing that it will induce, enable, facilitate, or conceal an infringement of any right under this title. [00:16:49] Speaker 04: So typically, book reviews are seen as fair use and not infringement. [00:16:53] Speaker 04: And in fact, the fair use section of the statute says fair use of a copyrighted work is not an infringement of copyright. [00:17:00] Speaker 04: So the text weds up perfectly here. [00:17:04] Speaker 04: If you have fair use, which a book review would normally be, then excerpting a piece of the book without the CMI would not be a DMCA violation. [00:17:12] Speaker 02: So any time you have fair use that results in there being no copyright violation, the same is true with respect to the DMCA, no violation? [00:17:22] Speaker 04: Well, it would depend on whether you could imagine, for example, and we don't agree with this, but imagine, and at least one court has said that the training itself is a fair use. [00:17:34] Speaker 04: That doesn't mean the downstream outputs would also be a fair use. [00:17:37] Speaker 04: So again, it's a fact-bound question. [00:17:39] Speaker 04: But Your Honor's hypothetical is a very good one. [00:17:42] Speaker 04: A book review is going to be fair use. [00:17:45] Speaker 04: And it's very unlikely that you're anticipating downstream use of that book review to infringe. [00:17:53] Speaker 03: Thank you. [00:17:54] Speaker 03: We'll give you three minutes. [00:17:55] Speaker 03: Thank you. [00:18:02] Speaker 03: Mr. Cariello? [00:18:10] Speaker 01: Thank you, Your Honor. [00:18:10] Speaker 01: May it please the court, Chris Cariello for Defendants GitHub and Microsoft.
[00:18:14] Speaker 01: And I'm dividing time today with Ms. Blatt, who represents the OpenAI defendants. [00:18:19] Speaker 01: From their initial complaint through two amendments, through their argument a moment ago, plaintiffs have been searching for just the right combination of things to say to try to get a 1202(b) claim past the pleading stage. [00:18:31] Speaker 01: But the defects in that claim are inescapable, because they lie in the very account plaintiffs provide of what this technology is [00:18:39] Speaker 01: and what it does. [00:18:41] Speaker 01: Between Ms. Blatt and I, we're going to focus on two of them. [00:18:43] Speaker 01: We briefed three independent reasons for affirmance here. [00:18:46] Speaker 01: But I'm going to focus on the standing problem, the basic problem that the injury for which plaintiffs seek redress in a 1202(b) claim isn't something that has ever happened to them or will happen to them. [00:18:57] Speaker 01: And then I'll make a start, if I can, on the 1202(b) merits question before handing off the podium to Ms. Blatt. [00:19:03] Speaker 01: So starting first with standing, it is axiomatic that in order to have standing, you yourself [00:19:09] Speaker 01: must be injured. [00:19:10] Speaker 01: You can't sue on the basis that someone else might be injured at some point. [00:19:15] Speaker 01: For damages standing, you need to have been injured. [00:19:18] Speaker 01: For injunctive relief standing, it must be likely that you will be injured in the future. [00:19:23] Speaker 01: Now we don't have to deal with damages anymore because plaintiffs no longer seek them in their complaint.
[00:19:28] Speaker 01: So the only question we're dealing with is whether they've plausibly alleged a likelihood that they will be injured in the manner that they have articulated since the very beginning of this case, which is [00:19:39] Speaker 01: emitting their copyrighted code with CMI removed. [00:19:43] Speaker 01: So let's start with what they do allege and what they don't allege to see how they have failed to plead a likelihood that they will be injured. [00:19:50] Speaker 03: First, their code. [00:19:51] Speaker 03: Before you get into the details of that, why is it appropriate to think of this as a standing issue rather than a failure to state a claim on the merits? [00:20:01] Speaker 01: Sure, Your Honor. [00:20:01] Speaker 01: So it could be both. [00:20:02] Speaker 01: The McGee case says you need to yourself be among the injured. [00:20:07] Speaker 01: I think in a case like this where named plaintiffs are purporting to represent a class, it is important to think of it as a standing problem. [00:20:15] Speaker 01: You've come in, you've articulated the possibility that someone could be injured in a particular way. [00:20:21] Speaker 01: Are you the named plaintiff among the people who could be injured in that particular way? [00:20:27] Speaker 01: And so I agree with you, if this hasn't happened to them, they also have a 12b6 problem. [00:20:32] Speaker 01: But I do think it's a standing problem in the first instance because they need to demonstrate the injuries happened to them. [00:20:37] Speaker 03: But I guess it's oversimplifying. [00:20:42] Speaker 03: Their claim is, you took our stuff without permission. [00:20:46] Speaker 03: And the argument you're making under the rubric of standing is, no, we didn't. [00:20:51] Speaker 03: And that just seems like the, we didn't do anything to you. [00:20:57] Speaker 03: This didn't happen argument. [00:20:59] Speaker 03: It doesn't normally result in a dismissal for lack of standing. 
[00:21:03] Speaker 03: It results in a dismissal on the merits. [00:21:05] Speaker 01: So I understand. [00:21:06] Speaker 01: I think, Your Honor, is onto the fact that it is also a merits problem. [00:21:11] Speaker 01: I don't think that means it's not a standing problem. [00:21:13] Speaker 01: The way, with just a friendly amendment, the way I see their complaint as saying, here is technology that can result in an injury to some people. [00:21:22] Speaker 01: And they're coming in and saying, well, [00:21:24] Speaker 01: You know, we might be injured this way, but they haven't plausibly alleged that that's ever happened or could happen. [00:21:30] Speaker 01: So they're alleging a sort of product they're challenging, a technology they're challenging, saying it could result in injury to a large class. [00:21:38] Speaker 01: But they, the named plaintiffs, can't say that they are among those who would be injured. [00:21:43] Speaker 03: And why haven't, so then on getting to the facts of it, [00:21:50] Speaker 03: They've alleged that 1% of the queries return something that they say infringes rights under the statute. [00:22:03] Speaker 03: You have millions of queries being made every day on this. [00:22:08] Speaker 03: I'm not sure all the numbers necessary to do the math are in the complaint, but it seems like the complaint could easily be amended to add them. [00:22:16] Speaker 01: Your Honor, they've had three opportunities to do this, and the defects I'm about to lay out, we have been laying out since moment one on the first complaint. [00:22:25] Speaker 01: So plaintiffs tell us how this works. [00:22:27] Speaker 01: They say there are billions of lines of code. [00:22:30] Speaker 01: So their code is just among these billions. [00:22:32] Speaker 01: That's paragraph 95. [00:22:33] Speaker 01: And then they tell us what copilot does with these billions. [00:22:38] Speaker 01: It's a pattern recognizer. 
[00:22:40] Speaker 01: So it transforms these billions of lines [00:22:43] Speaker 01: by inferring statistical patterns, and that's paragraph 64, those statistical patterns become embedded. [00:22:50] Speaker 01: And then on the back end, this is paragraph 6591, the way it works is it is predicting from those embedded statistical patterns what you want next. [00:23:00] Speaker 01: So if you're a software programmer, you're sitting there, you're programming something, let's say it's a sorting function, right? [00:23:05] Speaker 01: Take this data, make it alphabetical. [00:23:07] Speaker 01: And copilot, if it's seen something like that a lot across the training set, says, [00:23:13] Speaker 01: looks like you're looking for this sorting function. [00:23:15] Speaker 01: So what don't plaintiffs allege here? [00:23:18] Speaker 01: They don't allege what their code is. [00:23:20] Speaker 01: They don't allege what their code does. [00:23:22] Speaker 01: They don't allege how much of it is in the billions. [00:23:24] Speaker 01: They don't say it appears frequently. [00:23:26] Speaker 01: They don't say it's part of a pattern. [00:23:27] Speaker 01: They don't say it appears more than once. [00:23:29] Speaker 01: They don't. [00:23:30] Speaker 02: Can I ask, though, perhaps I misunderstood Mr. Panuccio's theory, but as I understand it, he's suggesting that when a programmer, [00:23:42] Speaker 02: inputs the code into the algorithm, the CMI is being stripped. [00:23:52] Speaker 02: And if it's being stripped, whether or not it is then being used for output, as I understand it, the injury that's being alleged is the stripping of the CMI directly. [00:24:05] Speaker 01: Your Honor, there has always been and there continues to be a lack of clarity, I think, in plaintiffs' theory about [00:24:12] Speaker 01: when and how precisely CMI is stripped. [00:24:15] Speaker 01: Stripping is an act. [00:24:16] Speaker 01: They have to allege the mechanism by which that happens. 
[00:24:20] Speaker 01: I had understood throughout this entire case that their theory was when Copilot generates from its statistics the predicted code completion, it is not including CMI. [00:24:32] Speaker 01: And just to be clear, [00:24:33] Speaker 01: plaintiffs have over and over again alleged precisely that theory in their complaint, right? [00:24:38] Speaker 01: This not including theory, not a strip theory. [00:24:41] Speaker 02: Putting aside what precisely they're alleging, if that were the allegation that, in fact, once the coding information is captured, if you will, by the program, by the algorithm, it is removing the CMI, [00:25:02] Speaker 02: is there at least standing to claim in that instance that the violation occurred at that point giving rise to standing? [00:25:12] Speaker 01: So in the hypothetical world where a different plaintiff pled something like that, and all they said was, at some point during training, CMI is removed. [00:25:22] Speaker 01: They might be able to plausibly allege, and I see my time is up, but if I may, [00:25:27] Speaker 01: they might be able to plausibly allege that their code was among that that had CMI stripped. [00:25:32] Speaker 01: Now, we would get over this first hurdle. [00:25:34] Speaker 01: Did it happen to you? [00:25:35] Speaker 01: We could then have additional arguments over whether or not that injury satisfies TransUnion. [00:25:41] Speaker 01: Mr. Pannuccio brought up that concept. [00:25:44] Speaker 01: We could have arguments on the merits. [00:25:46] Speaker 01: But they could at least plausibly get over the hurdle if they articulated that injury and then plausibly allege that that injury has happened to them. [00:25:55] Speaker 01: That's what they cannot do here. [00:25:57] Speaker 01: They have articulated an injury which is output of their code without CMI and failed to do anything to plead that that would be likely. [00:26:06] Speaker 01: We don't know what their code is.
[00:26:07] Speaker 01: They've never alleged that it could potentially have utility to anyone, right? [00:26:12] Speaker 01: That a coder might be eliciting it. [00:26:15] Speaker 01: We are shooting in the dark. [00:26:17] Speaker 01: All we know is that their code is somewhere among billions of lines of code in this model. [00:26:24] Speaker 01: And Your Honor, if I may just take a brief moment to address a few other points that they've raised. [00:26:30] Speaker 01: Plaintiffs point to their examples. [00:26:32] Speaker 01: These examples are wildly unrealistic and they don't allege that they are realistic prompts. [00:26:38] Speaker 01: Contrast that with ER 199 to 202 to see what plaintiffs could have, if they were different people, [00:26:46] Speaker 01: plausibly alleged. At 199 to 202 of the excerpts of record, they describe two examples from textbooks, textbook code that would be well traveled across GitHub and that might actually be useful to someone. As far as we know, this is defective code that no one would ever want to prompt and that no prompt would ever elicit in the real world. Thank you very much, Your Honors. [00:27:24] Speaker 03: Ms. Blatt. [00:27:27] Speaker 00: Thank you and may it please the court, Lisa Blatt for the OpenAI defendants. [00:27:32] Speaker 00: Although removal can occur without identicality, this court should hold that removal only occurs if a defendant tampers with existing CMI on existing works and not simply fails to add CMI to the defendant's own products. [00:27:49] Speaker 00: Now, in this case, the complaint does not and cannot allege, at least at the output stage, and we don't think the input stage was either alleged or before the court, anything that comes close to removal, because what we have here is simply an attribution right that is being alleged, which is that Copilot generates modified snippets of plaintiff's code without also generating [00:28:13] Speaker 00: the CMI.
[00:28:14] Speaker 00: And you can look no further than the only relief sought in this case, at 263 of the complaint. [00:28:20] Speaker 00: The only live issue in this case is a request for an injunction that is solely based on attribution. [00:28:27] Speaker 00: The injunction is: make sure any output on Copilot contains CMI. [00:28:33] Speaker 00: That is a request for attribution. [00:28:34] Speaker 00: There's no request for any kind of relief at the input stage. [00:28:39] Speaker 00: Moreover, when the other side sought certification and it talked about damages, which aren't asserted, but when damages were asserted, all the damages are based on output. [00:28:51] Speaker 00: That $9 billion figure is based on failing to output. [00:28:56] Speaker 03: Can I ask you the same question I asked Mr. Pannuccio? [00:29:00] Speaker 03: On the one hand, if you have a work of fan fiction, even if it has quotes from a book, that doesn't trigger the attribution requirement. [00:29:09] Speaker 03: On the other hand, if you take the book and strip off the cover page, I think I understood you to [00:29:14] Speaker 03: effectively concede that even if you've changed a couple words on page 200, that's still a violation because it's close enough to being identical. [00:29:25] Speaker 03: So how would you articulate where the line is in between those two? [00:29:30] Speaker 00: Sure, and you hit it exactly right with the extremes. [00:29:33] Speaker 00: So there are two extremes at issue, and then there's the line we draw, which is based on the statutory, or sorry, the common sense dictionary [00:29:40] Speaker 00: definition of removal. [00:29:42] Speaker 00: The two extremes are the one that you said, which is an identicality requirement, that even if you have a removal, there's this notion, which no one is defending, that that would preclude liability even though there was removal because the two works are not identical and there was one change.
[00:29:57] Speaker 00: At the other extreme is exactly what you said, that any copying that fails to have an attribution violates at least the removal requirement if you have the scienter. [00:30:08] Speaker 00: And that is incredibly important that we're asking this court to address because not only is it the only relief sought, the whole notion of an identicality requirement that has been held by so many district courts is because it has been the dam that has prevented the floodgates of attribution right claims. [00:30:29] Speaker 00: Now, where the line is drawn is with us, and we can talk about, because you hit on all the exact hypotheticals, is we think there has to be an actual [00:30:38] Speaker 00: tampering on CMI. [00:30:40] Speaker 00: So the hypo that's the hard hypo would be what is the difference between stripping off or kind of tearing off the cover of the book and just copying every page but not the cover of the book, which I think is the hard hypo you asked. [00:30:55] Speaker 00: And in our view, only the tearing off is a removal, because in the other example that is highlighted it's simply a failure to attribute. [00:31:05] Speaker 00: Now, it may be a line, but it is a line that has to be drawn, including by the other side, because once they concede in their reply brief that there is no attribution right, some courts and all courts have to grapple with, what do you do when you have a copyright violation or any copying, whether it's fair use or not? [00:31:23] Speaker 00: Because any copyright violation, almost all of them, fails to include the CMI. [00:31:28] Speaker 03: Right. [00:31:28] Speaker 03: But if the line is going to be the difference between removing something and just not reproducing that part, I mean, maybe it makes sense for physical media, but it doesn't seem to make sense at all when the media are all stored digitally. [00:31:44] Speaker 03: So how would you apply that? [00:31:47] Speaker 00: Sure.
[00:31:48] Speaker 00: So you can apply it at least in the [00:31:51] Speaker 00: AI, in the output sense, because, you know, generative AI is a model standing on its own that is not connected to the training data. [00:32:02] Speaker 00: So you are right. [00:32:03] Speaker 00: If you just have, let's just talk about the theory that's not before the court, that is before courts all over the country, about training. [00:32:13] Speaker 00: We just think the district court was very focused and said, I want to make sure training is not part of this case. [00:32:18] Speaker 00: Once the court satisfied itself that training was not part of the case, [00:32:22] Speaker 00: it then went off into the output. [00:32:24] Speaker 00: But in terms of the input, you are correct. [00:32:27] Speaker 00: Courts are grappling with, particularly in the Southern District of New York, what to do when you have these training data. And in kind of the granddaddy of cases, the New York Times case, what Judge Stein has held is that some of them [00:32:42] Speaker 00: adequately state removal at the training stage and some of them don't, depending on what facts are alleged. And there are very detailed allegations of the programs that are used [00:32:53] Speaker 00: to input that stuff in the training stage that, for whatever reason, I don't know why, these plaintiffs just didn't allege and didn't make training part of their case. Now, on the output stage, that is [00:33:05] Speaker 00: just purely a, it's no different than if an art student sits in a museum and tries to replicate a Monet painting and doesn't copy down Monet's signature. [00:33:15] Speaker 00: That is simply what the AI models are doing by their own allegations. [00:33:19] Speaker 00: They never allege, this is absolutely critical, that AI models work like a search engine, which is they don't take an existing repository and copy and paste.
[00:33:29] Speaker 00: In fact, the sources that they cite in paragraph 75 and 102 for the notion that there is some sort of copying, those very sources explain, and it's the Chan study and the GitHub post, and we put all this on page 12 of our brief, [00:33:44] Speaker 00: they say that these models aren't looking up the training data. [00:33:48] Speaker 00: The training data is not part of the model. [00:33:50] Speaker 00: So it's simply based on a statistical pattern. [00:33:52] Speaker 00: It's like I have memorized the stuff that I am saying to you now. [00:33:56] Speaker 00: That is not that I am downloading it for something. [00:34:00] Speaker 00: It's simply I have memorized it and it's Lisa Blatt saying it. [00:34:03] Speaker 00: I'm not removing any CMI from something I've read. [00:34:06] Speaker 00: I'm just not telling you [00:34:07] Speaker 00: what I read to prepare for this argument. [00:34:09] Speaker 03: So I mean, the briefs use the example of reproducing Twilight. [00:34:15] Speaker 03: I mean, if somebody with an exceptionally good memory just has, and perhaps there are some fans sufficiently devoted to have done this, sits down and just types out the entire text of the book, but minus the title by Stephenie Meyer part, [00:34:33] Speaker 03: and then prints it out and starts selling bootleg copies of that. You think that's not a violation because it sort of went through somebody's head along the way to being... [00:34:33] Speaker 00: Yes. So let's be clear when you say violation, because we're talking about that may or may not be a copyright violation, and if what's bothering you, well, that's damages and that's lots of stuff happens, including injunctive relief, and [00:34:55] Speaker 00: if we're talking about a DMCA violation, it requires a removal, which is why the courts have adopted the identicality requirement. [00:35:03] Speaker 00: The statutory penalties are bankrupt-inducing. [00:35:06] Speaker 00: They are not supposed to be about copying.
[00:35:08] Speaker 00: The author doesn't have to register their works. [00:35:11] Speaker 00: It could be a fair use. [00:35:12] Speaker 00: It could be all sorts of things. [00:35:13] Speaker 00: But who needs the copyright laws if they're correct about their definition? [00:35:18] Speaker 00: You would never need to bring a copyright claim. [00:35:20] Speaker 00: You just always bring a DMCA claim, swallowing the copyright laws whole, which is this case. [00:35:26] Speaker 00: There is no allegation of infringement, [00:35:28] Speaker 00: and yet they're claiming up to $25,000 per violation. [00:35:33] Speaker 00: Imagine any teacher in a classroom would be liable. [00:35:36] Speaker 00: We can talk about how fair use would work with that, but it's not clear to me that you could still be liable under the DMCA, even if you have a fair use copying. [00:35:45] Speaker 00: It depends on their definition of scienter. [00:35:48] Speaker 00: And they got by with their definition of scienter here based on an attribution right. [00:35:53] Speaker 00: So however this court answers the certified question, and we don't think there's standing, but if you do answer the certified question, it is absolutely critical that you tell the courts that whatever you think about identicality, you have to make sure there's a removal, and removal has to be done with the dictionary definition of taking out something, not simply, and I think the will example was the best example we gave, [00:36:15] Speaker 00: is that at the end result, if I'm denied, I'm a beneficiary and I don't get my parents' money, I'm unhappy, but I could be either omitted from the will or I could have been removed from the will, and the statute draws that line, and you have to draw the line no matter what. [00:36:30] Speaker 00: Given any hypothetical we could talk about, because even in your Twilight example, you'll start to move off. [00:36:35] Speaker 00: Well, what if they copied everything but one page?
[00:36:37] Speaker 00: That's clearly not a DMCA. [00:36:40] Speaker 00: We could just keep going down to the hypo we gave about the book review. [00:36:43] Speaker 00: In the book review example, the complaint itself, in their brief to you, in the opening brief, says that the failure to attribute is a DMCA violation. 24 times their complaint cites to an attribution right. [00:36:56] Speaker 00: And so again, we think in order to clarify this law, it's the Ninth Circuit. [00:37:00] Speaker 00: There are a lot of cases that come up in California because you and the Second Circuit are the big copyright states, [00:37:07] Speaker 00: copyright circuits. It's absolutely critical that this court at least reset what's going on and not just fix half the problem on an identicality requirement without also telling courts what the other side concedes, which is that there is no attribution right. [00:37:23] Speaker 00: Thank you. [00:37:23] Speaker 00: Thank you. [00:37:29] Speaker 03: Rebuttal. [00:37:31] Speaker 04: Thank you, Your Honor. [00:37:32] Speaker 04: I'd like to make three points, if I may. [00:37:34] Speaker 04: First, the position that OpenAI is asserting, I think, is extraordinary. [00:37:42] Speaker 04: So I gave the example of actively ripping off the cover of a book and then photocopying it and distributing those copies. [00:37:50] Speaker 04: And OpenAI says, yes, that would violate the DMCA. [00:37:53] Speaker 04: But all you have to do, in their view of the law, to avoid that liability is simply take that same book, [00:37:59] Speaker 04: look at the cover page and say, well, instead of tearing it out, I'm just going to flip it over one. [00:38:04] Speaker 04: I'm going to make the exact same copies and engage in the exact same distribution. [00:38:08] Speaker 04: That would make a farce of the DMCA. [00:38:09] Speaker 04: That can't possibly be the law. [00:38:11] Speaker 03: I agree with you that that doesn't seem to make a lot of sense.
[00:38:15] Speaker 03: But what do you do with the fact that the statute says remove or alter? [00:38:19] Speaker 04: Well, remove or alter. [00:38:21] Speaker 04: Let's take their example from page 35 of their brief. [00:38:24] Speaker 04: Look at restaurant menus. [00:38:25] Speaker 04: When the restaurant removes a menu item, that's when you remove something. [00:38:33] Speaker 04: They point to that example. [00:38:34] Speaker 04: But what does a restaurant do when it removes a menu item? [00:38:36] Speaker 04: They don't take their existing menus, grab a scissors, and cut it out, and then hand the patrons a menu with a hole in it. [00:38:44] Speaker 04: They just create a new menu. [00:38:45] Speaker 04: And so the original menu had lasagna on it. [00:38:47] Speaker 04: The new menu does not. [00:38:48] Speaker 04: And a patron would sit down and say, huh, [00:38:50] Speaker 04: they've removed lasagna from the menu. [00:38:52] Speaker 04: Even though there was no active stripping of that menu item, we all in common parlance think that that is removal of the item. [00:38:58] Speaker 04: And the same thing would be true here. [00:38:59] Speaker 04: So I don't think it tortures the regular meaning of remove at all to say that some acts of stripping CMI by omission will count. [00:39:10] Speaker 04: Now, let me go to Your Honor's Twilight example, the fan fiction. [00:39:15] Speaker 04: It may come down to a question of intent. [00:39:18] Speaker 04: Yes, if you memorized it with the intent of retyping it and just leaving the CMI out, I think that could qualify. [00:39:24] Speaker 04: But if you were more toward George Harrison, where you sort of just kind of remember that Twilight's out there and you start writing similar stories and maybe it infringes, maybe it doesn't, you probably won't have the requisite scienter. [00:39:36] Speaker 04: And so this will be a classic fact question, and that's why we say that in our brief.
[00:39:42] Speaker 04: The DMCA is a copyright. [00:39:44] Speaker 04: It is not something separate from copyright. [00:39:46] Speaker 04: It is one of the bundle of sticks, and it is important that it be preserved. [00:39:49] Speaker 04: And my friends on the other side are seeking an interpretation of that statute that would essentially read it out of the law. [00:39:56] Speaker 04: And Ms. [00:39:57] Speaker 04: Blatt said, look at the Second Circuit. [00:39:58] Speaker 04: And I would commend to the Second Circuit Judge Park's opinion in the Mango case. [00:40:02] Speaker 04: That is very clear on saying this is a real right. [00:40:05] Speaker 04: It has teeth. [00:40:06] Speaker 04: And you only need to allege what the statute says. [00:40:09] Speaker 04: Let me make two other points with the 30 seconds I have left. [00:40:12] Speaker 04: One is the injunction we request. [00:40:13] Speaker 04: We don't request an injunction just as to distribution. [00:40:17] Speaker 04: Paragraph 49, we say enjoining defendants from engaging in the unlawful conduct alleged herein, all of it. [00:40:22] Speaker 04: And of course, the contours of an injunction come into play once the case has gone through discovery and has been litigated and a court fashions an injunction to meet what has actually been proved. [00:40:33] Speaker 04: Lastly, I want to say one more time, [00:40:35] Speaker 04: my friends say the injury isn't happening or the input is not part of the complaint. [00:40:41] Speaker 04: The first injury we allege is that stripping of CMI happens after they download from GitHub and input it into their model. [00:40:49] Speaker 04: And this is clear as day in paragraphs 214 and 216 through 217 of the amended complaint. [00:40:55] Speaker 04: That is one injury. [00:40:57] Speaker 04: The second injury happens again when the output works. [00:41:01] Speaker 04: They distribute them either in training or in the output.
[00:41:06] Speaker 04: Without the CMI, but these are two separate injuries, and we very clearly alleged, and I think calling it a training theory is wrong. [00:41:12] Speaker 04: It's an input theory. [00:41:13] Speaker 04: They are stripping the CMI. [00:41:15] Speaker 04: They own GitHub. [00:41:16] Speaker 04: They own the whole repository. [00:41:17] Speaker 04: They download the whole repository. [00:41:19] Speaker 04: Then they strip the CMI to clean the data for training. That act of stripping itself is an injury, and it's an ongoing injury, because the point of the DMCA is not to have the CMI-stripped copies out in the wild. [00:41:31] Speaker 04: Thank you, Your Honor. [00:41:32] Speaker 03: Thank you very much. [00:41:33] Speaker 03: We thank all counsel for their helpful arguments. [00:41:35] Speaker 03: The case is submitted and we are adjourned.