Users have tried to upload sensitive company information and PII, personally identifiable information, into ChatGPT. Those who are successful at getting the data in have now made that data free to all. Will people’s misuse of these generative AI programs be our greatest downfall to security and privacy?
Check out this post for the discussion that is the basis of our conversation on this week’s episode co-hosted by me, David Spark (@dspark), the producer of CISO Series, and Geoff Belknap (@geoffbelknap), CISO, LinkedIn. Joining us is our special guest Suha Can, CISO, Grammarly.
Here also is Grammarly’s trust/security landing page.
Got feedback? Join the conversation on LinkedIn.
Huge thanks to our sponsor, Opal

Full Transcript
Intro
0:00.000
[David Spark] Users have tried to upload sensitive company information and PII, personally identifiable information, into ChatGPT. Those who are successful at getting the data in have now made that data free to all. Will people’s misuse of these generative AI programs be our greatest downfall to security and privacy?
[Voiceover] You’re listening to Defense in Depth.
[David Spark] Welcome to Defense in Depth. My name is David Spark. I am the producer of the CISO Series. And joining me for this very episode is Geoff Belknap. He’s the CISO of LinkedIn. Geoff, say hello to our nice audience.
[Geoff Belknap] Hello, audience. I am Geoff Belknap, a large language model known as a meat-space generative AI, and I can’t wait to have this discussion.
[David Spark] Awesome. Well, before we get into that, which I’m very excited about getting into, I do want to mention our sponsor, Opal. Opal – secure the identity perimeter. You can actually find them at opal.dev. But we’re going to be talking about that very subject a little bit later in the show.
Hang tight. But first, let’s get to today’s main topic, and that is ChatGPT. It is the talk of the town here in Securityville. Rachael Greaves of Castlepoint Systems brought up an issue, and that is our users entering sensitive data into a very public system from which researchers have proven they can recall it through data extraction attacks.
Heck, I don’t know if we can actually call that an attack if its own users unknowingly or possibly knowingly enter data into a public database. I think it’s more of a clever recall than an attack. So, Geoff, is the problem ChatGPT, or our need to add yet another course to our security awareness training?
[Geoff Belknap] I think like all things, the problem is not ChatGPT. The problem is when we find magical new technologies that are advanced and indistinguishable from magic that sometimes we forget that it’s still just another SaaS application, and we need to be careful about what we put into it. So, probably not a world ending new security threat but an opportunity for learning for sure.
[David Spark] Well, Jen Easterly, as of the day of this recording – we’re recording this in May – thinks that ChatGPT is, in fact, a great threat. And so we’re going to discuss the concerns and issues, and somebody who is actually dealing with generative AI right now and has a very good viewpoint on this very topic is our guest today.
Very excited to have him on board. It is the CISO of Grammarly, a product that I use plenty, Suha Can. Suha, thank you so much for joining us.
[Suha Can] Thanks, David. It’s great to be here.
What’s our visibility into this problem?
2:44.348
[David Spark] Theo Nassiokas of Lets Go Cyber said, “AI and the opaque fashion that ChatGPT operates isn’t the problem. Poor human judgement is the problem. Nothing has changed.” Shank Mohan of VMware said, “There needs to be ramifications for the staff who divulge corporate data.” And the author of this post, Rachael Greaves of Castlepoint Systems, brings up a really good point.
“Breaches aren’t just happening in enterprises. A doctor in a small practice using it to draft an insurance letter for a patient doesn’t have anyone to train them. There is no gatekeeper stopping most consumers using this tool like there is for other enterprise software.” So, let’s start with that last quote.
People say just don’t enter the bad data, but a lot of people don’t know how ChatGPT operates. So what do we do for the people who don’t have security awareness training, like the doctor Rachael Greaves points to? Geoff?
[Geoff Belknap] I think this is a great way to think about the problem. This is not a new problem. But what’s really interesting about this case, especially with ChatGPT is unlike a situation where maybe a doctor or someone like yourself clicks on a link that they shouldn’t have, sometimes you don’t find out right away that you shouldn’t have.
And sometimes it’s just a transaction between you and the bad guy. In this case when you’re interacting with a large language model like ChatGPT, you’re putting information into a place that everybody else theoretically might have access to that information, that the AI is learning on top of and can tease it out later.
That’s pretty different. Nobody later is going to immediately be able to tell you that you clicked on a link inappropriately that installed malware except the malware guys. In this case, I can go query the AI and try to tease out that information that you put in. So, I think there’s a faster cycle to exploitation or theft of that information in this case, and so it’s a great opportunity to remind people you have to be thoughtful about what websites you put sensitive data into regardless of how cool it is that they can accurately create a document or a summary of information for you.
[David Spark] That very last line is a good point, and I’m going to throw it to you, Suha. I don’t think most people recognize that when they enter what may seem like benign data, it becomes sort of available to “all,” but in a teased-out fashion. What’s your take in terms of everyone’s awareness of that?
[Suha Can] I think awareness of this issue is low. And it is different than other cases that we have seen over the years in that when you upload your information to these tools, it’s not just within your account. These tools will retain the information you uploaded to them. So, there is definitely a new risk and a new level of awareness that needs to be gained to be able to use these tools securely.
[David Spark] For the situation that Rachael Greaves describes…the doctor who’s never going to get enterprise-level security awareness training… How is this information going to seep into them? Because putting this kind of information in is a major security and privacy issue, since they’ll conceivably be putting in private health information.
[Suha Can] There is definitely an opportunity for the tools to provide some level of default privacy protections like anonymization and deidentification. But also, if a sensitive piece of data has been submitted to them, like PII, they should warn the user. So, I think more so than relying on training and awareness, I would look to the tools to provide some level of support and help to the users – like in the doctor’s office example – to use these tools safely and securely. And we don’t have that today.
[David Spark] Yeah, that’s a good point. Because ChatGPT, Geoff, they actually do have some kind of security mechanisms. Like for example, I can’t ask ChatGPT, “What do you think of Geoff Belknap?” It won’t give me an opinion of you. It could do the same thing if it sees credit cards, social security numbers, or what looks like PHI or PII.
It could say, “Are you sure you want to upload this?” Or maybe just flat out block it. What do you think?
[Geoff Belknap] I think that’s exactly right. What I will say is ChatGPT and OpenAI and the team, they are doing a phenomenal job of trying to stay ahead of this thing, and they’re trying very hard. It is a unique problem in this specific space, but it’s not unique in the sense that here’s a new technology.
People are finding it wildly useful. It has definitely found its moment, at least at the time that we’re talking about it. Hopefully by the time this airs, it’s not all over with. But here we are looking at a website that is very useful to people. People are putting all kinds of information into it.
They probably didn’t predict that it would be happening just like this. Sometimes when you get that product market fit, and it just fits and runs, it’s not exactly what you expected. But you’re excited that people are using your product. So, now what needs to happen is they are spending lots of time trying to make sure that prompts that people use to tease information out of it are much harder to abuse.
And that any information that goes into the model is harder to tease out or can be obfuscated. But if I was going to guess, the folks at OpenAI and the folks building products like ChatGPT are all sitting down, going, “Okay, how do we solve this brand new security space of making sure that information that goes in can be compartmentalized and can’t be extracted wholesale the way that it goes in? And how do we make sure people are aware that, hey, anything you put into this can be retrieved by somebody else – without scaring people off of using it for very useful purposes?” And that’s a hard line to balance.
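For readers who want a concrete picture of the kind of pre-submission check David floats above, here is a minimal sketch of a client-side PII screen. It is purely illustrative – the regex patterns and function names are assumptions, not anything ChatGPT or OpenAI actually implements – and real PII/PHI detection needs far more than a few regular expressions.

```python
import re

# Illustrative patterns only; real PII/PHI detection needs much more than regexes.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def find_pii(prompt: str) -> list[str]:
    """Return the categories of possible PII found in the prompt."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)]

def confirm_or_block(prompt: str) -> bool:
    """Warn the user before a suspicious prompt leaves their machine."""
    hits = find_pii(prompt)
    if not hits:
        return True  # nothing suspicious; send it
    print(f"Possible sensitive data detected: {', '.join(hits)}")
    return input("Are you sure you want to upload this? [y/N] ").strip().lower() == "y"

if __name__ == "__main__":
    sample = "Patient SSN is 123-45-6789; please draft the insurance appeal letter."
    print("Allowed to send:", confirm_or_block(sample))
```

A tool could run a check like this before a prompt ever leaves the user’s machine and either warn, redact, or block outright.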
How did we get here?
8:38.977
[David Spark] Louis Thomas of ADP said, “This comes at the same time as all these free-to-use, web-based time optimization apps. An unsuspecting employee would be encouraged to upload data to an app that will customize/generate a presentation/dashboard report, etc. for free in half the time it would take for them to make it themselves.
With the pressures of short deadlines and fast performance, it’s easy to see why those who may not be as security conscious are opting for these solutions. Very difficult to mitigate even with training in place.” And Rachael Greaves of Castlepoint Systems, again the author of this post, also said, “Do you remember all those PDF converters we used to have?
There are so many productivity tools that suck up our data, and staff will find them and use them if we can’t provide secure alternatives.” So, let me start with you, Suha. Suha, you have a tool for which people are entering all kinds of information in there, and you are learning from that. But if I understand how it’s used, there is no way for me to tease, heck, any information out from your tool.
[Suha Can] No, there is no way to tease the information out of Grammarly.
[David Spark] So, what is it you’re learning from the information that’s going in?
[Suha Can] To our earlier conversation, we have a lot of investments in anonymization and deidentification when you use Grammarly. So, as you are using Grammarly, we immediately forget who you are. And at that point in time, it will be impossible for either us or you or any other user to tease out that information that you have sent us.
[David Spark] But you’re using the information as some kind of a learning model I would assume, yes?
[Suha Can] Yes. We definitely are, but we do this in an aggregate and deidentified way. So, we will not be learning about David and what you have written, but we will see that users write in a certain way, and [Inaudible 00:10:42] whether they are more likely to accept it or not.
[Geoff Belknap] Yeah, I think this is an important point. Something like Grammarly is like… Look, the rules of grammar are known.
[David Spark] Yeah, it’s not based on what Geoff writes.
[Geoff Belknap] 100%. So, you can improve on that. But what’s so different… And I think this is why we’re having a bit of a moment in the security community about this, is that things like generative AI models are being built on top of all of the data that people put into them, and they’re treating it like a new wiki or Encyclopedia Britannica, for those of you that are old like me.
And in this case, people want the same information out that somebody else put in, and I think we’re running into an issue today where all the work that people like Grammarly have done, like anonymizing it, and aggregating data, and deidentifying it, we’re not quite there yet with public generative AI models and LLMs.
I am very comfortable that we’re going to get there in not a short amount of time, but that’s just one example of a number of security hurdles that we haven’t solved yet in the space.
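As a rough illustration of the anonymized, aggregate learning Suha describes – forgetting who the user is and keeping only counts of which suggestions get accepted – here is a minimal sketch. The scrubbing rules and data structures are hypothetical and greatly simplified; this is not Grammarly’s actual pipeline.

```python
import re
from collections import Counter

# Hypothetical, greatly simplified sketch: scrub obvious identifiers from text,
# then learn only aggregate acceptance counts. Not Grammarly's actual pipeline.

def deidentify(text: str) -> str:
    """Replace obvious identifiers with placeholders before any further processing."""
    text = re.sub(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b", "<EMAIL>", text)  # email addresses
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "<ID>", text)           # SSN-shaped numbers
    text = re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", "<NAME>", text)   # crude two-word names
    return text

# Aggregate-only "learning": count how often each suggestion type is accepted,
# with no record of which user produced the text.
acceptance_counts: Counter = Counter()

def record_outcome(suggestion_type: str, accepted: bool) -> None:
    acceptance_counts[(suggestion_type, accepted)] += 1

if __name__ == "__main__":
    print(deidentify("David Spark wrote to suha@example.com about ID 123-45-6789."))
    record_outcome("comma_splice_fix", accepted=True)
    record_outcome("comma_splice_fix", accepted=False)
    print(acceptance_counts)
```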
Sponsor – Opal
11:44.750
[David Spark] Before I go on any further, I do want to tell you about our sponsor for today’s episode. Really thrilled to have them on board because they are talking about a very, very difficult problem, and they’re doing some impressive stuff on trying to conquer this really difficult problem. So, we all know that access can be hard to calibrate.
The who you are, what you do, and why you need it is a complex set of relationships when framed against the reality of work. It’s not just about implementing the best practices we know but how to integrate them with the culture and habits of a particular organization. That’s really complex. Now, for the teams responsible for nailing this balance, this can be a daunting task.
The policies involved can be complex.
And in sensitive systems, the stakes are high. Too much access and you give a bad actor an opening, at worst resulting in a company-ending breach. Too little, and you put roadblocks between people and their work, thereby slowing the business down. Neither is good. We’re looking for that Goldilocks moment.
So, Opal is designed to give teams the building blocks for their IAM strategies and seamlessly apply intelligent policies that are built to grow with your organization. Whether that’s setting good rules for day one access or helping to clean up the rats’ nest of long-lived access in the cloud with time controls.
Opal is used by best in class security teams today such as Figma, Databricks, Blend, Marqeta, Scale AI, and more. There is no one size fits all when it comes to access, but they’re here, that’s Opal, to provide the guiderails every step of the way.
Go check them out at opal.dev.
What are the best practices?
13:44.748
[David Spark] Nicholas S. of PCG Cyber said, “If someone has voluntarily uploaded sensitive information, ChatGPT was misused as a tool, not a cause. Anyone thinking on this problem carefully can see this is an insider threat issue.” And Matthew Newman of TechInnocens said, “Provide an immediate position to your organization on what is allowed and what isn’t.
Encourage experimentation and learning with an envelope of acceptable use. It doesn’t need to be perfect but set some guardrails until you can stand up a policy.” So, let’s focus on that. I really like this last comment – how can we all wonderfully, safely use this fantastic tool? I’ll start with you, Suha. When people see a new technology, they don’t listen when the security department says, “Don’t use it.” They want to use it.
But the security department can be there to guide them as to use it safely. So, let’s get into some specifics. How can we use generative AI and ChatGPT safely? What’s your advice?
[Suha Can] I think first of all, separate experimentation and using it with nonconfidential data from using it with confidential company data. And for each of them, definitely there’s a demand, so you have to solve for both. On the confidential data part, what we did here was very quickly assess and establish a working business relationship with the most developed providers out there and establish a business account with certain data protection agreements in place.
What are those? Well, we want to establish an agreement about the fact that the data we use with a particular provider will not be used to train their models. We also got an agreement with them about data retention. Because typically these tools and these providers, for abuse monitoring, will make a copy of your data for a limited amount of time.
But that’s already too long. It’s like two or three days. You don’t have visibility over that data, so you don’t know what happens to it.
It’s the typical third-party security risk. We are not very comfortable with opening the gates for everybody to use this data. So, what we did was quickly assess the largest providers, enter into business agreements with each of them on our own terms, and make these business accounts available to everybody internally, who can then use them with confidential data in a safe and secure way.
On top of that, on the experimentation side, we made it possible for employees to use a larger number of these tools. Again, we vet and allow these for them – some of these you can use but not others. And, again, we assess them carefully. There are a lot of players out there, but I don’t think they are significantly differentiated from each other.
So, we pick a small number, and we say, “Hey, you can go experiment with these ones. Just don’t use company confidential data.” And it’s pretty easy to do.
[David Spark] So, this actually seems like a pretty darn good model that you’ve done. What I’m hearing from you, you’ve seen this issue, and you got ahead of it for your company. Am I right in that assumption?
[Suha Can] Yeah, definitely. We very quickly arrived at a set of guardrails, which we then wrote down as more of an acceptable use policy, and our employees are abiding by that as far as we can tell.
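As a sketch of how an acceptable-use policy like the one Suha describes might be encoded – an allowlist of providers, with confidential data permitted only where business accounts and data-protection terms are in place – consider the following. The provider names and policy fields are made up for illustration; this is not Grammarly’s actual policy.

```python
# Hypothetical sketch of the allowlist an acceptable-use policy might encode.
# Provider names, fields, and the policy itself are illustrative.

APPROVED_TOOLS = {
    "provider-a-enterprise": {
        # Business account with no-training and limited-retention terms in place.
        "confidential_data_ok": True,
        "agreements": ["no_training_on_our_data", "limited_retention"],
    },
    "provider-b-free": {
        # Approved for experimentation only.
        "confidential_data_ok": False,
        "agreements": [],
    },
}

def is_use_allowed(tool: str, contains_confidential_data: bool) -> bool:
    policy = APPROVED_TOOLS.get(tool)
    if policy is None:
        return False  # not on the allowlist at all
    if contains_confidential_data and not policy["confidential_data_ok"]:
        return False  # experiment-only tools never get company-confidential data
    return True

if __name__ == "__main__":
    print(is_use_allowed("provider-b-free", contains_confidential_data=True))        # False
    print(is_use_allowed("provider-a-enterprise", contains_confidential_data=True))  # True
    print(is_use_allowed("random-new-tool", contains_confidential_data=False))       # False
```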
[David Spark] Geoff, what’s happened at LinkedIn?
[Geoff Belknap] Well, I don’t want to comment specifically about what we’re doing, but I think Suha makes some great points here. The number one thing you can do when there’s this transformative technology out there that everybody wants to play with – because it might be great for your product, or it might be something you need to experiment with to get ideas about what you want to build – is to build a paved path.
You want to give people a safe playground to experiment with these things. There are, today, options available from Azure and others where you can build yourself a private playground for OpenAI and use models similar to ChatGPT’s in a place where your corporate identity is required to use it, where the data is within your enclave or otherwise protected, or where, just in general, you have control over what data goes into it and what data comes out of it.
What we have to remember is that public sites, although they are very interesting and inspirational like ChatGPT, are not the place for anyone to be doing business. Unless you’re buying a business account or an enterprise account, and you have those basic controls over how, just as Suha said…how is that data being used, how is it being trained, how can I control whether that data comes in or out of that platform.
It’s just basic vendor management. And I think the easiest thing that I will talk about that we do is just to remind people, “Hey, this stuff is not approved for business use. If you want to experiment with AI for business use, A, B, C – here are the internal options to do that.” Or just tell people, “You can’t until there’s something internal to do it with.”
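For a concrete flavor of the “private playground” Geoff mentions, here is a minimal sketch using an Azure OpenAI deployment reached with corporate (Azure AD) identity rather than the public ChatGPT site. It assumes the openai and azure-identity Python packages; the endpoint, deployment name, and API version below are placeholders, not real values.

```python
# Minimal sketch of a "private playground": an Azure OpenAI deployment reached with
# corporate (Azure AD) identity instead of the public ChatGPT site. Endpoint,
# deployment name, and API version are placeholders.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Corporate identity: no shared API keys floating around.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default",
)

client = AzureOpenAI(
    azure_endpoint="https://your-company.openai.azure.com",  # your private resource
    azure_ad_token_provider=token_provider,
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="your-gpt-deployment",  # the deployment name inside your own enclave
    messages=[{"role": "user", "content": "Summarize this quarter's draft OKRs."}],
)
print(response.choices[0].message.content)
```

The point is the controls, not the vendor: prompts stay inside a resource your organization owns, access is tied to corporate identity, and the data-use terms are governed by your agreement rather than a public site’s defaults.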
[David Spark] Is this kind of like the next stage of Shadow IT or consumerization of IT as they call it? Because all of a sudden people could do an end around of IT and they could load up their favorite SaaS app and start entering whatever the heck data they want into it. In a sense, it’s kind of the same problem, yes, Geoff?
[Geoff Belknap] Absolutely. But I want to be really clear for our listeners – this discussion really centers around ChatGPT, the publicly facing internet website that’s open to everybody. It’s not every product that has AI built into it. Not every product. And the thing to keep in mind here is OpenAI’s ChatGPT, the public version, is like telling a six-year-old a secret and saying, “Don’t tell your friends.”
[David Spark] [Laughs]
[Geoff Belknap] And then all of your friends are showing up and going, “Is the candy in the closet?” He goes, “No.” “Is the candy in the fridge?” He goes, “No.” You say, “Is the candy in the candy drawer?” And he goes, “Well, I’m not supposed to tell you.” And you’re like, “Ah-ah. I think I have some idea of what questions to ask next to figure out where the candy is.” That’s the level of security we’re at now.
You really need something more than that if something is going to be used for your professional business data.
[David Spark] You’re nodding your head, Suha. Close this out.
[Suha Can] I think the paved path idea is the one to keep in mind here. And also be proactive, realizing that your employees are definitely playing with this thing. I think it falls on us, the CISOs and security teams, to provide them clarity and a safe way to do so.
What are we going to do now?
20:00.549
[David Spark] Matthew Smith of Lumify Group said, “Amen. The only way to even remotely secure information is to simply not release it. As soon as anyone else knows, the risk is there.” And I think they’re saying just don’t put the data in – not that you shouldn’t play with the tool, but just don’t put the data into the system.
And Reema Jagannath of IBM said, “Free tools never come for free. We pay them with data, which is vulnerable.” And I think I heard a line… I don’t even know who said this, but if you’re not paying for the product, you are the product. And that, I think, is kind of the case here with ChatGPT. I’ll start with you, Geoff, on this.
Are we the product of ChatGPT? I think we are.
[Geoff Belknap] We are certainly ChatGPT’s muse. It is learning from all of us. And it’s learning from me that I don’t know how to write thank you notes. But it’s given me lots of advice. Look, I think Reema and Matthew have the right idea here. Data minimization is the right way to go. I think the easy start here is just to set some boundaries.
Remind people that you’re not supposed to be putting confidential information in public tools. And as we just talked about in the last segment, provide them some path to experimentation that is safe. Because I’ll tell you what, what little I have experienced so far of this new wave of LLMs is it will absolutely transform your business if you use it the right way.
But to use it the right way, you need to have some experimentation, and you need people to be inspired by how they’re going to build this into your business and use it day to day – and to think about how to do that in a very safe, secure, managed way.
[David Spark] Suha, earlier in the show, you went into great detail about the fact that you guys have been analyzing the different tools out there, saying, “Here are the safe tools to use. And also here’s the right behavior.” But the thing that I find phenomenal about ChatGPT… And let me point out, we’re recording this in May, and this is coming out in August.
There may be a very different story come August. We don’t know yet. But have you had to keep amending your policies as you’re seeing things change over time? Because just in the past few months there have been great changes. What have you had to do?
[Suha Can] Definitely. Just a great example of this is when we first analyzed the large providers out there, we found that there were significant differences in their security posture.
[David Spark] Can you give us a detail? Like what you saw in the differences in security postures.
[Suha Can] There were definitely those that are more enterprise grade in terms of the compliance credentials that the organization has, but also their policies on training, and even whether or not they have an external bug bounty. These are the types of things we look at when we evaluate a vendor.
And in the beginning, just a few months ago, there was maybe one such player that met all these requirements. And very quickly – maybe because we are also talking to all of these folks at the same time, so they are hearing our concerns and reacting to them – I have seen the providers move from training by default to not training by default, from not having an enterprise account solution to having one, and from not having a public bug bounty to having one.
And so what I expect will happen is, in the near term, they will all have the same enterprise security posture. Some might be coming from a research background and have evolved into a provider at scale. Maybe they were surprised, like Geoff mentioned earlier on. But I think the trust posture will look very similar across them.
One way I’ve been thinking about this generally is, when you’re evaluating a new third-party partner, ask how much of a net new dependency that is for you. And I think some of the larger cloud providers are already on everybody’s map. You probably have an account with them already. So, using an additional service from them is a much easier jump for a company like Grammarly, which cares a lot about enterprise-grade security.
So, that’s the path we are taking for our longer-term partnerships, but we also recognize that most players will reach a baseline security posture and a set of configurations that will speak to all of the CISOs that are out there.
[David Spark] Let me just throw out one last thing… I think it’s okay that there’s sort of this panoply of tools – ones that are kind of very insecure through to more secure, enterprise-grade ones – because they allow these tools to sort of play a little differently and to learn and understand. Are you okay with the panoply of tools that are out there, Geoff?
[Geoff Belknap] I’m okay with it. I just think we have to be responsible about how we use those things.
[David Spark] Right.
[Geoff Belknap] And I think every business today has rules about how to use these things responsibly, and we just need to abide by them. It’s very tempting to push it further and put your work into it and have it write your doc for you because it will.
[David Spark] And it literally looks like magic when it comes out. Like, “Oh my God, look at that.”
[Geoff Belknap] It is something to behold. And I think that’s why the onus is a little bit on people like Suha and I to remind people of their responsibilities and the policies we need. And frankly, it’s on people like OpenAI and ChatGPT to make sure that those products are safe and that, even though I shouldn’t put data into it, it should be very hard to get that data out of a large language model like that.
Closing
25:30.392
[David Spark] All right. Well, we have come to the portion of the show where I ask both of you which quote was your favorite, and why. I’m going to start with you, Suha, here. Of all these wonderful quotes here, which quote was your favorite? And explain why.
[Suha Can] I like Matthew Newman’s quote about providing an immediate position to your organization on what is allowed.
[David Spark] Which is clearly what you have done over at Grammarly.
[Suha Can] That’s right. And I think in the spirit of growth mindset, I fundamentally do not wish to blame the humans for using really useful tools. That’s kind of how I see this. So, the next best thing I can do is reinforce how you need to put in guardrails around the use of generative AI among your employees, and that these enterprise-grade security practices are very critical for your deployment.
But it is also to provide clarity on those paths and highlight the thinking behind them, versus saying, “Oh, no, humans again.” It’s like the phishing problem that we have and all those things – and I think it’s not. I think humans are doing the right thing. We have to keep up with it, essentially, in this case.
[David Spark] All right, good one. Geoff, your favorite quote and why.
[Geoff Belknap] I’m going to go with Reema from IBM, who said, “Free tools never come for free. We pay for them with data, which is vulnerable.” And I think in this case, it’s just good to remind yourself that if you’re an employee working for a tech company and you find one of these cool tools, keep in mind that you don’t control what happens to the data you put into it.
And as much as it would be helpful for your job to put that data in and for it to transform into some way that it’s useful for you, just take a step back and realize that you have responsibility to protect that information. And not every free website on the internet has the same responsibility.
[David Spark] I like that. Good closing comment right there. All right. Well, we’ve come to the very end of the show. Geoff, by the way, there’s a site called LinkedIn. If you’re looking for a job, it’s a great place to do it. And in fact, we just put out a job listing for a writer/producer. And we put it on LinkedIn.
What do you know?
[Geoff Belknap] What a great idea.
[David Spark] Suha, by the way, are you hiring over at Grammarly?
[Suha Can] Yeah, totally. We are hiring for Grammarly across all sorts of roles. Go to Grammarly.com/jobs. And if you want to find out about our security team, please check out Grammarly.com/jobs.
[David Spark] Aw, very, very good. Grammarly.com/jobs. Well, thank you very much. You can check out Suha Can over on LinkedIn as well. We’ll have a link to his profile page on the actual blog post for this very episode. I want to thank our sponsor, opal.dev – secure the identity perimeter. Thank you very much, Opal, for sponsoring and being a brand new sponsor of the CISO Series.
We love that. Thank you so much. And thank you to our audience. We greatly appreciate your contributions, as always, and listening to Defense in Depth.
[Voiceover] We’ve reached the end of Defense in Depth. Make sure to subscribe so you don’t miss yet another hot topic in cyber security. This show thrives on your contributions. Please, write a review. Leave a comment on LinkedIn or on our site, cisoseries.com, where you’ll also see plenty of ways to participate including recording a question or a comment for the show.
If you’re interested in sponsoring the podcast, contact David Spark directly at david@cisoseries.com. Thank you for listening to Defense in Depth.