Creating Synthetic Users For LDAP Using ChatGPT
Image generated using Tome.app


Following my previous article on Generating Realistic Synthetic Data with ChatGPT, I decided to take the idea further and see how it could be applied to my own work.

In my work, it is often useful to show the customer some features of our platform before it is deployed in the customer's environment. One common need is to connect our data privacy platform to an LDAP server, to show that users can be authenticated from a centralised user directory. The typical way to do this is to set up a simple LDAP service like ApacheDS, and then import an LDIF file containing some sample users. Creating this set of users from scratch, especially when we want to customise it to the customer's organisation, is often time-consuming, error-prone and tedious.

This can be made much simpler with ChatGPT. To do this, I started with a simple prompt:

Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
For the email field, it should be of the format 
firstname.lastname@interesting.com
Please provide the output in LDAP ldif format.         

This generated a list of users, and in a relatively clean LDIF format:

First attempt in generating some users in LDIF format.

I had specified the format of the email address because, in demos and POCs, the users typically come from the same organisation. However, ChatGPT did not carry the domain over into the DN, which was still set to dc=example,dc=com. I also didn't like that the uid was a number, and the displayName attribute, which I needed for my application, was missing. On the other hand, I was quite surprised that it managed to derive the objectClasses properly, though a few that I needed were missing.

So, I tried to be more prescriptive with a modified prompt:

Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password

Please provide the output in LDAP ldif format.

Please use the following as part of the DN: "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format 
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The following objectClasses need to be included: 
tlsKeyInfo, person, organizationalPerson         

This gave the following output:

Second attempt to fix the uid, dc, displayName and objectClasses.

This was definitely much better, with the fields added correctly. In my prompt I deliberately left out the inetOrgPerson objectClass, to see whether ChatGPT would include it anyway or strictly use what I had provided. It did the latter, but I could always add it to the list later.
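To make the target concrete, here is a minimal Python sketch of the kind of entry the prompt is asking for. It is my own illustration, not ChatGPT's output: the function name `user_entry`, the choice of `uid` as the RDN, and the placeholder password are all assumptions. I include inetOrgPerson here because, in the standard schema (RFC 2798), that is the class that allows uid, mail and displayName; person only requires cn and sn.

```python
def user_entry(first, last, password, parent_dn="dc=interesting,dc=com"):
    """Return one LDIF user entry that follows the conventions in the prompt.

    Assumes plain names with no characters that need DN escaping.
    """
    uid = f"{first}.{last}".lower()
    lines = [
        f"dn: uid={uid},{parent_dn}",   # using uid as the RDN is an assumption
        "objectClass: top",
        "objectClass: person",
        "objectClass: organizationalPerson",
        "objectClass: inetOrgPerson",   # allows uid, mail, displayName
        "objectClass: tlsKeyInfo",      # requested in the prompt
        f"uid: {uid}",
        f"cn: {first} {last}",          # cn and sn are required by person
        f"sn: {last}",
        f"givenName: {first}",
        f"displayName: {first} {last}",
        f"mail: {uid}@interesting.com",
        f"userPassword: {password}",    # placeholder, replace before any real use
    ]
    return "\n".join(lines) + "\n"
```

Calling `user_entry("Jane", "Doe", "replace-me")` returns an entry that can be compared attribute by attribute against what ChatGPT produced.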

The next thing I needed was some user groups. A demo is more convincing when people from different teams in the organisation appear in our application, because we can then show the features that are available to different groups of users. To do this, I modified the prompt to have ChatGPT generate groups and randomly assign people to them.

Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password

Please provide the output in LDAP ldif format.

Please use the following as part of the DN "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format 
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The users should belong to the OU: "ou=Users"
The following objectClasses need to be added: 
tlsKeyInfo, person, organizationalPerson

The following groups will also need to be created under the 
OU "ou=Groups": Sales, Marketing, Services, Engineering
Randomly assign the above users into the groups as members.         

I liked the way ChatGPT explained it at the top:

Getting ChatGPT to generate user groups.

And the groups appeared in their own section with the list of members taken from the generated list of users:

Observing that for each group, we could see the users as members of that group.
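Since the member lists are generated text, it is worth checking that every member DN actually points at a generated user. The following is a rough sketch of such a check, under stated assumptions: the helper names `parse_records` and `dangling_members` are hypothetical, the parser is deliberately naive (no base64 values, no folded lines), and DNs are compared case-insensitively as plain strings rather than being properly normalised.

```python
def parse_records(ldif_text):
    """Split LDIF text into records, each a list of (attribute, value) pairs.

    Naive on purpose: skips comments, ignores base64 and line folding.
    """
    records, current = [], []
    for line in ldif_text.splitlines():
        if not line.strip():            # a blank line ends the current record
            if current:
                records.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue
        attr, _, value = line.partition(":")
        current.append((attr.strip().lower(), value.strip()))
    if current:
        records.append(current)
    return records


def dangling_members(ldif_text):
    """Return (group_dn, member_dn) pairs whose member DN has no entry."""
    records = parse_records(ldif_text)
    known = {v.lower() for rec in records for a, v in rec if a == "dn"}
    missing = []
    for rec in records:
        dn = next((v for a, v in rec if a == "dn"), None)
        for attr, value in rec:
            if attr in ("member", "uniquemember") and value.lower() not in known:
                missing.append((dn, value))
    return missing
```

An empty list from `dangling_members(text)` means every `member` or `uniqueMember` value matched a `dn:` line somewhere in the file.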

While the new prompt generated the groups and their members, one thing that bothered me was that the objectClass groupOfUniqueNames, with its uniqueMember attribute, might not work in some customer environments. I tried telling ChatGPT that the LDIF file should be compatible with ApacheDS:

Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password

Please provide the output in LDAP ldif format that can be 
imported into ApacheDS.

...        

Interestingly, the membership information was now attached to the users (using the memberOf attribute) instead of being held at the group level:

Checking if it changed anything if we mentioned ApacheDS.

The issue with using memberOf is that it is harder to determine who the members of a given group are. Some customers also prefer to use the member attribute rather than memberOf, so that membership is determined at the group level (only one of the two can be chosen). To handle this, I updated the prompt by adding this clause:

These groups will use the objectClass groupOfNames        

Hence, the final prompt:

Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password

Please provide the output in LDAP ldif format.

Please use the following as part of the DN "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format 
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The users should belong to the OU: "ou=Users"
The following objectClasses need to be added: 
tlsKeyInfo, person, organizationalPerson,inetOrgPerson

The following groups will also need to be created under the 
OU "ou=Groups": Sales, Marketing, Services, Engineering
These groups will use the objectClass groupOfNames
Randomly assign the above users into the groups as members.         

Now the membership information was tagged at the group level instead of the user:

Updating the prompt to use groupOfNames generated membership information within the group.

ChatGPT had also switched the membership attribute accordingly, from memberOf on the users to member on the groups.
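The difference between the two layouts can be sketched in a few lines of Python: start from user-level memberOf data and invert it into group-level member lists. This is only an illustration under my own assumptions; the function names `groups_from_member_of` and `group_entry` are hypothetical, and the group's cn is read naively from the first RDN of its DN. One detail worth knowing is that groupOfNames requires at least one member (RFC 4519), so a random assignment that leaves a group empty would not import cleanly.

```python
from collections import defaultdict


def groups_from_member_of(member_of):
    """Invert {user_dn: [group_dn, ...]} into {group_dn: [user_dn, ...]}."""
    members = defaultdict(list)
    for user_dn, group_dns in member_of.items():
        for group_dn in group_dns:
            members[group_dn].append(user_dn)
    return {group_dn: sorted(users) for group_dn, users in members.items()}


def group_entry(group_dn, member_dns):
    """Return a groupOfNames LDIF entry listing the given member DNs."""
    if not member_dns:
        # groupOfNames has 'member' as a mandatory attribute
        raise ValueError("groupOfNames requires at least one member")
    # Naive: take the value of the first RDN, e.g. "cn=Sales" -> "Sales"
    cn = group_dn.split(",", 1)[0].split("=", 1)[1]
    lines = [
        f"dn: {group_dn}",
        "objectClass: top",
        "objectClass: groupOfNames",
        f"cn: {cn}",
    ]
    lines += [f"member: {dn}" for dn in member_dns]
    return "\n".join(lines) + "\n"
```

With the group-level form, listing the members of Sales is a single lookup on the group entry, rather than a scan over every user.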

Of course there was still some work left to clean up the output and replace the passwords, but getting ChatGPT to help certainly made a difference!
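For the password part of that clean-up, a small script is enough. The sketch below is one possible approach, not a definitive tool: the function name `replace_passwords` is hypothetical, it matches both plain (`userPassword:`) and base64 (`userPassword::`) lines, and it drops any folded continuation lines that belonged to the old value. The new values are written in clear text, which I assume is acceptable only for a throwaway demo directory.

```python
import secrets


def replace_passwords(ldif_text, nbytes=12):
    """Replace every userPassword value in the LDIF with a fresh random one."""
    out = []
    skipping = False  # True while dropping folded lines of an old password
    for line in ldif_text.splitlines():
        if skipping and line.startswith(" "):
            continue  # continuation of the password value just replaced
        skipping = False
        if line.lower().startswith("userpassword:"):
            # token_urlsafe only emits letters, digits, '-' and '_',
            # so the value is safe to write as a plain LDIF string
            line = f"userPassword: {secrets.token_urlsafe(nbytes)}"
            skipping = True
        out.append(line)
    return "\n".join(out) + "\n"
```

Everything other than the password lines is passed through unchanged, so the script can be run on the generated file before importing it.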

Parting Shot

I was also curious how it would differ if I asked ChatGPT to generate the LDIF to be compatible with Microsoft Active Directory. I updated the following line in the prompt:

Please provide the output in LDAP ldif format that is compatible 
with Microsoft Active Directory.        

This generated users with additional attributes, which I recognised as ones used by Microsoft Active Directory (e.g. sAMAccountName, userPrincipalName, givenName).

Specifying the output to be compatible with Microsoft Active Directory
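As a rough illustration of how these attributes relate to the naming convention used in the prompts, here is a short Python sketch. The function name `ad_attributes` and the `upn_suffix` parameter are my own assumptions, and the truncation reflects the commonly cited 20-character limit on sAMAccountName for user accounts; check this against your own directory before relying on it.

```python
def ad_attributes(first, last, upn_suffix="interesting.com"):
    """Derive Active Directory style attributes from a first and last name."""
    uid = f"{first}.{last}".lower()
    return {
        "givenName": first,
        "sn": last,
        # Truncated to 20 characters (assumed limit for user accounts)
        "sAMAccountName": uid[:20],
        "userPrincipalName": f"{uid}@{upn_suffix}",
    }
```

For a short name such as Jane Doe, the sAMAccountName is simply `jane.doe` and the userPrincipalName matches the email address format used earlier.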

I hope this helps some of you to make your work easier!


#genai #chatgpt #data

More articles by Gerald Yong
