The Journal of Business hosted Yanek Kondryszyn, intellectual property attorney and partner at Lee & Hayes PLLC, for its most recent Elevating The Conversation podcast.
The Elevating The Conversation podcast is available on Apple Podcasts, Amazon Music, Spotify, and elsewhere. Search for it on any of those platforms or the Journal's website to hear the entire conversation, but for now, here are five takeaways from the episode, which runs just under 40 minutes.
1. Frequently, the term artificial intelligence is used too narrowly. AI is a very large umbrella term to describe more of a philosophical idea. When I hear people talking about AI tools and specifically applying that term to describe large language models, which are this category of natural language processing that we're seeing boom right now, I'm a little revolted.
The reason why that's such an issue, I suppose, is because machine learning is just one way of achieving AI. The big hoopla that we're seeing is really around the generative models at this point, a particular set of machine-learned models that are performing very well for text generation, text to speech, and speech to speech.
But the problem is, someday, we'll probably have forms of AI that aren't machine learning models. So whether that's using biological circuitry in some way, or biological logic, chemical logic, or quantum computing, it starts to become unhelpful if we've all started to associate AI with large language models and diffusion models specifically.
As a further reason not to reserve the term for generative tasks, AI could certainly be used for nongenerative tasks in the future. For example, robotics. Generally speaking, that's just a different class of use.
2. Be mindful of the client data you put into AI or machine learning tools. If you're a business that receives any sort of personally identifiable information or confidential information, you need to be cognizant of what is provided as input to any machine learning models you may be using.
For example, if you wanted to generate some image or process a synopsis of information and put that into a model, you may be in real trouble if you haven't already researched the terms of use for that model.
Oftentimes, with the free versions of many of these models, the trade-off for it being free is that the technology can be trained using your data.
And what does that mean exactly? I don't know what the proper name of one of those toys is, but picture the game where you put in a coin and it filters down through a whole bunch of nails and lands at some potential prize. Essentially, your training data is moving the location of those nails, figuratively speaking. Consequently, it's possible that someone else could have an input and end up with an output that's close to the output generated for you. That could be a problem if, all of a sudden, you have a customer whose private information is released because the nails got moved in that particular way.
I'd say that's a particular risk right now, especially because a lot of these providers are competing to increase their benchmarks in a way that could edge them over big, big competitors. It isn't the "hi, how are you?" inputs that they're struggling with. It's the weird ones, the outliers. And guess what? That outlier data might include the small amount of information you put in.
3. Be equally mindful of how you use the finished product of a machine learning model. We've talked a decent amount about the inputs to a model. I think another thing you have to be aware of is the outputs from the model and how you intend on using them.
The Copyright Office in the U.S. has been rather clear on the fact that copyright attaches only to creative works created by humans. As we know from Photoshop examples, digital tools can be used to assist a human in creating. However, copyright doesn't attach unless a human is highly involved in creation.
If you wanted to create a jingle for your business or advertising materials, I wouldn't use a machine learning model to generate that in a pure sense. I wouldn't merely give the model a prompt and use whatever it gives you. Instead, I would use it for revision. For example, I'd take the picture and do some revision of it using a tool. I'd say that lands closer to what the Copyright Office seems to be leaning toward thinking is copyright eligible.
Think about a situation where you might have a jingle in an ad. If you purely generated that using a machine learning model, there's a risk that a competitor could just rip off your work, and there's nothing you could do about it. Granted, that doesn't get into the trademark issues, but I think you still need to be cognizant of how the output will be used and how copyright could be applied.
4. A company's customer data might be trade secrets in this day and age. For business owners, you have to realize you're also potentially sitting on your own trove of good human data.
If you have had systems to track interactions in documents that humans have written, that's all valuable training data for machine learning models now. In my view, the real, new oil in this economy is human-generated data, because we're going to see such an influx of AI-generated data.
It's going to be tough to sift through the noise and find good training data to train these models. And many businesses, even here in Spokane, may be sitting on some of those troves of data and have to realize that a lot of their documents may have become trade secrets overnight with the advent of these generative models.
That might mean talking to an attorney and figuring out a trade secret strategy to protect your documentation. Things that formerly may not have qualified for trade secret protection all of a sudden might, because they're valuable as training data for a model for your industry.
5. Take a lifelong-learner approach when it comes to AI. It's foolish to not be a learner, right? You absolutely should be using these tools in ways that don't expose private data and sensitive information. I would highly suggest playing around with the free versions of these tools and familiarizing yourself with their capabilities.
But then, don't leave it there. I see so many posts on LinkedIn about prompting and how to get good outputs, but that's really just the equivalent of asking, how do you do a Boolean search? I see us in the innovation cycle as being somewhere around the Boolean-search phase, in the sense that you have to do a good bit of prompt engineering to get the output you desire.
You might say, please give me an image of a business with bikes out front, and it's just not quite how you wanted it to look. So, you actually need a somewhat sophisticated string of prompts, specifying things such as the kind of aperture or the kind of setting. We'll see innovation in the future that reduces the friction to interface with these tools.
Thankfully, we get to return to business fundamentals, the things you should be considering for your business. You need to return to those fundamentals of investigating what returns this technology will have for your business, what the actual advancement is, and whether that's necessary.
As a business owner, ask: What are the biggest points of friction for our company? And are there any models out there right now that actually help with those points of friction? To be able to answer those questions, you need to be paying attention to your business, of course, but you also need to have a good idea of what the tools actually do.