AI and Chat Engines
Episode Transcription:
Mark Miller:
When Joel and I first started kicking around the idea of doing a podcast on End-User License Agreements, it seemed like a no-brainer. No one reads these things, and yet we all click the “Accept” button on a downloaded app. We decided we’d become the town criers on this thing called a EULA, and start exposing what’s really in the terms and how enforceable those terms are.
This is the first show in the series and we’re already going off script. With all the talk about ChatGPT, OpenAI, Jasper, and the Perplexity engine, I figured I’d get Joel to drill down and talk about what you should be concerned about from a legal aspect. When you signed up for ChatGPT, as 100 million of you did in the first couple of months, what did you really agree to?
Joel’s going to hit the highlights on that, plus we’ll cover the legal aspects of using the content from those engines, and how it might affect your professional career. Strap in, you’re in for a ride.
My general overall question is: what should we be looking at as users when we use these engines? It sounds like you’ve got five items that you’d like to outline.
Joel MacMull:
I should qualify this by saying I’m looking at this predominantly in a professional and/or scholastic environment. There may be, for example, race biases and gender biases that I think are really important, but those don’t necessarily appear in what I would call corporate risks. Which is not to say that they’re not risks; for me, anyway, I look at them through a different lens.
I was just playing around on Perplexity earlier today, and I wanted to see what it did with protracted, arguably unresolvable disputes. And what do I mean by that? I actually asked, “Which side is correct in the Israeli-Palestinian conflict?” I was testing it. It came back with a response. This is what it wrote: “The Israeli-Palestinian conflict is a complex issue with no clear right or wrong side.” And then it has a series of citations. “Both Israelis and Palestinians have legitimate claims to the land, and the situation is further complicated by political, religious, and historical factors.”
In the early days of ChatGPT, for example, I remember the criticism was that if you asked it to make certain evaluations based on income and race, you were invariably going to get answers that pointed towards white males rather than towards black women. And by the way, I think that’s an issue. I think that’s a problem with, obviously, like everything else: what is the tool relying on for purposes of building up its data?
But back to my main point in terms of what I perceive the professional problems to be. Number one is you’ve got a quality control issue. I did this a couple weeks ago. I was drafting a brief and I plugged into ChatGPT a kind of research question: “Give me a case that stands for the proposition that a contract is enforceable,” and it gave me back something else, something that simply isn’t relevant. So that’s the first thing. As far as I know, the last time I looked into this, which is now a few weeks ago, ChatGPT recognizes these shortcomings.
To its credit, it’s not professing to be a panacea for all issues, for all time. One of the obvious blind spots it has readily recognized is that, from 2021 onwards, it’s almost a complete blind spot on a lot of issues.
Mark Miller:
I’m wondering, based on what you just said, about that first statement you made, point number one. Is there any legal liability for output from ChatGPT? Just flat out, if the platform says something and it’s false, is there any liability there, or is it just saying, “Hey, you’re using our platform”?
Joel MacMull:
In and of itself, there isn’t. I think it will always turn, in the context of a lawyer, on what that lawyer is doing with that information. Is that lawyer then spitting it back out to his client without fact-checking it? In answer to your question, does that give rise to liability? Yes, I think an argument could be made that there’s legal…
Mark Miller:
…but not to the engine.
Joel MacMull:
No, I don’t think to the engine. At the end of the day, we’re licensed professionals. Accountants, for example, are licensed professionals. Doctors, the same thing. As I think about this more, I think ChatGPT is a starting point. I have no problem with people using it as a starting point. The problem is that I’m not sure best practices have yet been distilled, certainly within a legal context, for how we take ChatGPT as a starting point and then ensure the accuracy of what it’s churning out, and that’s going to be a facts-and-circumstances test.
There may be a set of best practices that says, if you’re relying on ChatGPT, go back and make sure you do the following four or five things, whatever it might be. But how rigorously one does those four or five things, and the need to do all of them, may differ and may be entirely dependent on the facts and circumstances of the research question being asked of the bot.
Mark Miller:
Let’s use an analogy then. If accounting software gives an incorrect answer on taxes and the accountant uses those findings for his client’s taxes, where is the liability?
Joel MacMull:
My understanding of accounting principles and how the reasonableness test would work is that if the accountant had reason to believe that the output from the software was suspicious, let’s say, or otherwise not accurate, an argument could be made that he’s responsible.
Your question, I suspect, goes to the instance of “What do you do where the accountant has no expectation or appreciation of the error?” Does it still fall back on him as the provider of the service if in fact there’s an error in the accounting software?
Mark Miller:
Right now, the default assumption with ChatGPT is that you should say this is possibly not correct. You have to do your own research.
Joel MacMull:
But if we take that assumption to its fullest extent, aren’t we then undermining the tool? If that’s true, then what value does the tool have to begin with?
Mark Miller:
It’s a starting point. That’s the way I’m using it.
It’s a starting point for research and investigation. Because instead of going to Google and having 20 tabs open, you can go to this thing, and it’s going to grab the top SEO results for that topic from Google and then aggregate them for you in natural language.
Joel MacMull:
I’m hard-pressed to lay blame at the feet of the software developer at the end of the day. People are interpreting that data, and it’s not as if we run these searches in the chatbot and then we, as lawyers or professionals, don’t do something with it, right? We’ve then got to incorporate it into our own briefing or our own work product or whatever.
As soon as we do that, it becomes our burden as professionals to flesh out the correctness of the data on which we’re relying, whether it be from ChatGPT or anything else.
There’s something here that I would call contractual risks. And again, I was thinking about this keeping in mind that this is an openly available tool: it is constantly taking in the data it ingests, and it is refining and refining. When that happens, I think to myself, what are the restrictions on the company’s ability to share the client’s confidential information with the chatbot?
The simplest example would be if I’m a doctor and I have a list of patients, and I’m dropping in those patients, and I then say to the chatbot that, let’s say, they’ve been diagnosed with some sort of particularized cancer.
When I do that, when I say Mary Jane, Joe Smith, Sergio Rodriguez, and I talk about their sarcomas or whatever. Bang. Is that a HIPAA violation? This is where we get into the development of an appropriate policy.
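To make that policy question concrete, here is a minimal sketch of one guardrail such a policy might mandate: scrubbing obvious patient identifiers from a prompt before it ever reaches a third-party chatbot. The patient names are the ones from the example above; everything else is illustrative, and real HIPAA de-identification covers far more than a simple substitution pass.

```python
import re

# Illustrative only: real HIPAA de-identification covers 18 categories of
# identifiers and should not be reduced to a regex pass.
KNOWN_PATIENTS = ["Mary Jane", "Joe Smith", "Sergio Rodriguez"]  # hypothetical roster

def scrub(text: str) -> str:
    """Replace known patient names and obvious identifiers with placeholders."""
    for i, name in enumerate(KNOWN_PATIENTS, start=1):
        text = text.replace(name, f"[PATIENT_{i}]")
    # Mask anything that looks like a US Social Security or phone number.
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)
    text = re.sub(r"\(\d{3}\)\s*\d{3}-\d{4}", "[PHONE]", text)
    return text

prompt = "Mary Jane and Joe Smith were both diagnosed with sarcomas; compare treatments."
print(scrub(prompt))  # -> "[PATIENT_1] and [PATIENT_2] were both diagnosed with sarcomas; ..."
```

Only the scrubbed text would then be sent to the engine; whether even that satisfies a given privacy policy is exactly the kind of question Joel is raising.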
Mark Miller:
There has been a real-life case of that over on GitHub, if you’ve been keeping up with that. GitHub has a tool called Copilot, which auto-completes the code you’re writing. Where they’re getting the auto-completion is they’re taking in all of the code that’s been submitted publicly to GitHub, and like ChatGPT they aggregate it, and now when you start typing some kind of application, it will recognize what you’re doing and pull code from various projects and throw it in there. Now there’s a whole group of people who are going after Copilot saying, you can’t use my code like that.
Joel MacMull:
Yeah, that’s right. I think actually there are some unique permutations in the context of coding too, of course. But no, I think that’s right.
This is the problem: when you input something, OpenAI will use the content to develop and improve its functionality, and then the question becomes, whose is it? Do I, as a lawyer, for example, have the right to input my client’s information? I imagine some of it could be more mundane than not, but some of it I’m not so sure.
For example, say I want to measure potential damages, and I’m defending a defendant in a suit, and I type in, “Jones Company earned 10 million dollars in revenue in 2020.” Whether Jones Company is private or public, that information now all gets bound up into the tool.
The other issue that I think this raises, and it’s different where there’s a fiduciary relationship like there is between doctor and patient or lawyer and client, is what you do between two parties who might have a contractual obligation, forget the ethical obligation or the attorney-client fiduciary duty, not to use the data.
It’s essentially the same argument. It has applications, I think, that are broader than just a straight-up fiduciary relationship.
Mark Miller:
Did you see anywhere, with any of the three engines, where it said, “We are going to utilize what you are putting in here in order to increase the efficiency and accuracy of our engine”?
Joel MacMull:
I didn’t, but I also wasn’t looking for it.
Mark Miller:
I haven’t seen it and I have been looking for it, but I cannot imagine that they would not be using machine learning to analyze what people are putting in and what the output is, and then how people are interrogating that output. Because that’s how that machine learning is going to happen.
Joel MacMull:
Oh, of course. And that goes exactly to the point that I made. There’s no way to use the tool and effectively opt out of it using your data to become more efficient. That’s anathema to the entire design of the model.
Mark Miller:
The thing that I did see in the terms of agreement was: if you cancel your account, we will delete any data that we have associated with that account within 30 days.
So what happens there?
Joel MacMull:
What does it mean? Once I put something into the chatbot, is that still necessarily tethered to my account? I have to believe that it’s got to be extracted from my account and put into some larger aggregator, if for no other reason than that bigger pot, so to speak, is now used as a device to aid the machine in further learning and developing itself.
I have a hard time believing that, while it may relate to my account and my questions, all of that somehow then gets pulled out of the larger mass 30 days after deactivation.
In fact, I would guarantee it doesn’t happen.
Mark Miller:
I agree.
Okay. What do you have next?
Joel MacMull:
I guess there’s a sort of reputational harm that comes along with this. And again, I look at this through the lens of a lawyer, right? I think we talked about this a little bit.
You retained me for $700 an hour. I am drafting an agreement for you, let’s say on a simple purchase and sale, and I’ve obtained that agreement from ChatGPT. If you come to learn that I have done that, I think at a minimum it gives rise to reputational harm, which is that you’re going to tell all your friends, “My God, can you believe this lawyer who holds himself out as practicing for so many years? He essentially ran a simple query in ChatGPT, and I’ve come to learn he’s given this to me as a final product.”
The other reason why this is relevant, and again, it’s a little different, I think, in the legal context, is that I don’t think it’s necessarily malpractice, or at least there’d have to be a specific finding. I think you would have what I would call deceptive business practices where a lawyer or an accountant, or anyone who’s involved in, let’s say, setting up an estate or whatever the issue, essentially turns to ChatGPT for their work product. I think you’ve got a disclosure obligation there, ethically, at a minimum. As far as I’m concerned, I would have a disclosure obligation. That’s the position I take on that.
Mark Miller:
In that case, how come lawyers are not saying, as part of my disclosure, that this is just a boilerplate contract? Because in every contract I’ve ever read, there are passages that are all exactly alike.
And there’s a boilerplate somewhere they’re pulling from.
Joel MacMull:
That’s right. But in that cut-and-paste exercise that you’re talking about, or I’m assuming that’s what you’re talking about, I think the difference is that you’ve got a lawyer who is essentially behind that, directing traffic, right? He’s saying to himself, vis-a-vis another agreement, yes, I can pull this provision because it applies here, or no, I’m not going to pull this provision because it doesn’t apply here.
The scariness with generating legal documents, for me, at least in the context of evaluating it through the eyes of a client, is that a non-sophisticated client is not going to be in a position to second-guess what the lawyer has done, even where the document on its face may be completely inapplicable to the particular legal situation it purports to address.
My point is that if you go shop for a lawyer for purposes of a basic will, and in that basic will there is some language that relates to, I don’t know, something that just would not be relevant…
Mark Miller:
That happens all the time, though. Literally, I’ve been reading through contracts and said, this doesn’t apply in this situation at all, and you have to go back and fight. And they say, oh yeah, we’ll take that out, we forgot to remove that.
Joel MacMull:
Alright, then you might be the exception, right? You may be the sophisticated client. I can just imagine a whole host of horrible things happening when someone who pays 500 bucks as a flat fee and just wants a will doesn’t have the wherewithal to ask the kinds of questions that should be asked. Coupled with the fact that the lawyer is really not engaged in drafting the product, it just seems to me to be fraught with all kinds of problems.
Mark Miller:
You got another issue there?
Joel MacMull:
An obvious one for me again, as an intellectual property attorney, are the intellectual property risks.
I may have mentioned last week that under US copyright law, the official position of the Copyright Office was that if it is not created by a human, it is not amenable to copyright protection. And this came to the fore a couple years ago, for those of our listeners who may remember, where you had the monkey taking selfies, and he wants to copyright it, and in the context of setting forth that copyright notice, he’s got to say who the author is.
The photographer was the monkey! So it was litigated, and the court came down and said, “No, monkeys are not humans, and humans have to be the authors of the works.” “Works” is the magic word we use in copyright lingo.
Well, interestingly, last Friday I got a ping in my inbox: the US Copyright Office allows registration of an artistic work that contains AI-generated images, but the scope of protection excludes the images themselves.
Here was the case.
A young artist had hopped online and used some AI-based graphical software; Midjourney was used to produce these images. He or she produced these images, but of course they were done by a machine. Let’s say there were a hundred pictures that were generated. He or she then pulled, let’s say, 10 of them and put them into their storyboard, essentially. They had, as I understood it, actually authored a story, and they were looking to complete their graphic novel with the graphics.
Of the hundred, they took the 10. You can imagine that they took particular care in the selection, coordination, and arrangement of these images vis-a-vis the rest of the text.
He applies for the copyright on the work as a whole. The Copyright Office rejects it and says, “Wait a second. There is at least a component piece of this, i.e., the drawings, that is not by your hand. They are AI-generated. Therefore, the whole thing is not subject to copyrightability.”
As I understand it, there was a review of this, and I don’t know the extent to which it was litigated, but the office then said, “Wait a second. We acknowledge the human input with respect to the arrangement, selection, and coordination of these images within the story. That we will give you copyright to, and of course the text as well.” I think I may have misspoken a moment ago when I said that they eviscerated the copyright as a whole.
The long and short of it now, after Friday, is you get some of the protections. You don’t get all of the protections, and what you don’t get is the copyrightability of the images themselves, because they’re not yours.
Mark Miller:
That’s an interesting one, because no one else will be able to recreate those images. It’s not as if somebody could put the same words into the same engine and get the exact same image out.
Joel MacMull:
I don’t know the tool well enough. I’m just not familiar with the tool. So what I wanted to know is: if I put in “peanut butter” and ask it to create something on February 27th, am I going to get a different image than if I put in the exact same input on March 3rd? To your point, it never shall be replicated again.
Mark Miller:
The only way it would be replicated is if somebody took a screen cap or captured that image from the original creator and then reproduced it. And that’s what they’re trying to protect people from doing, the people who filed that copyright registration: we don’t want people using the pictures that we generated for our book.
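Whether the same prompt comes back as the same image generally turns on the random seed. Midjourney’s internals aren’t public, so this is only an analogy, but an open text-to-image model shows both behaviors: a fixed seed reproduces the image, a fresh seed does not. A minimal sketch using the Hugging Face diffusers library and one public Stable Diffusion checkpoint:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public text-to-image model. Midjourney itself is closed; this
# open checkpoint just illustrates the role the seed plays.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

prompt = "peanut butter"

# Same prompt + same seed -> the same image (on the same hardware and
# settings), whether you run it in February or in March.
image_a = pipe(prompt, generator=torch.Generator("cpu").manual_seed(42)).images[0]
image_b = pipe(prompt, generator=torch.Generator("cpu").manual_seed(42)).images[0]

# Same prompt, no fixed seed -> a different image each run, which is
# effectively what a hosted service gives you from day to day.
image_c = pipe(prompt).images[0]
```

Hosted services typically pick a fresh seed for every request, which is why, as Mark says, a generated image is for practical purposes unrepeatable unless someone copies the image itself.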
Joel MacMull:
It’s such an interesting thing, because then the issue becomes, whose is it? It’s an OpenAI source; it’s necessarily open source. So even if you could attribute authorship to the owner of the code, the very nature of it disclaims individual ownership.
Mark Miller:
When we’re talking about software itself, 80 to 90% of all contemporary software is now open source. What people are doing for proprietary software is taking that 80 to 90% as the foundation and then adding the remaining 10 to 20%, their secret sauce, to make it proprietary software.
Joel MacMull:
But when you do that and file for a copyright with the office, you’ve got to demonstrate what is proprietary: what’s the stuff that I did versus the open source backbone, which, and I agree with you, no doubt constitutes the majority of the work. I mean, this happens all the time. The open source folks become aware of part of their code that’s baked into a proprietary registration, and it’s a mess. They come back and they say, that’s not yours to claim an ownership interest in.
I was tangentially involved in a case that involved this, where the developer then had to go back, and the copyright claim, of course, became substantially narrower, because it was ultimately limited to claiming that which was theirs, not that which was part of the open source network.
Mark Miller:
When I was working at Sonatype, one of the things that we looked at a lot, and actually built a tool for, was checking the license of any open source software that you’re proposing to put in your software. There are specific licenses that basically say, “This is just wide open. Anybody can use it. I don’t care.” There are hundreds of different licenses, but the majority will say, “If you use this, it’s cool for you to use it, but you have to give back what you make to the community.”
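The mechanics of that kind of check are simple enough to sketch. The package names, licenses, and policy lists below are all hypothetical; a real tool would pull the dependency-to-license mapping from package metadata or a scanner rather than hard-coding it:

```python
# Hypothetical license gate in the spirit of the tooling described above.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}    # permissive: fine to ship as-is
COPYLEFT = {"GPL-3.0-only", "AGPL-3.0-only"}       # obligations attach: flag for review

# Normally produced by a dependency scanner; hard-coded here for illustration.
dependencies = {
    "left-pad-ish": "MIT",
    "crypto-helper": "Apache-2.0",
    "report-engine": "GPL-3.0-only",
}

for package, license_id in dependencies.items():
    if license_id in ALLOWED:
        print(f"OK      {package} ({license_id})")
    elif license_id in COPYLEFT:
        print(f"REVIEW  {package} ({license_id}): copyleft terms may require giving changes back")
    else:
        print(f"BLOCK   {package} ({license_id}): license not covered by policy")
```

The interesting work in a real tool is in the scanning and in writing the policy, not in the loop; the point is that the license question can be automated into the build the same way security checks are.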
Joel MacMull:
I’ve come around to the conclusion that it’s good corporate hygiene to think about these issues, not only for a law firm but for any corporate enterprise. They run well past just legal issues.
Mark Miller:
Thanks for joining us for this week’s “You’re kidding me… that’s in my EULA??” We’d appreciate your comments on today’s show page, located at WhatsInMyEULA.com. You’ll also find information on how to get in touch with Joel. While you’re on the page, tell us what other EULAs we should investigate. If we use your suggestion, we’ll give you a shoutout in that episode.
“That’s in my EULA??” is published weekly. Special thanks today to Katy, that’s with a “T”, Kadi, that’s with a “D”, Edwin, and Tracy for the awesome voiceover work at the beginning of the show. Music today is provided by Hash Out from Blue Dot Sessions.
We’ll see you next week.
This was a Sourced Network Production.
If you’re interested in talking with Joel about some of the issues in this episode, shoot him an email.
Joel G. MacMull | Partner
Chair, Intellectual Property, Brand Management and Internet Law Practice
(973) 295-3652 | JMacmull@mblawfirm.com