Creating Synthetic Users For LDAP Using ChatGPT
Following my previous article on Generating Realistic Synthetic Data with ChatGPT, I decided to take the idea further and see how it could be applied to my own work.
In my work, it is often useful to show the customer some features of our platform before it is deployed in the customer environment. One common need is to connect our data privacy platform to an LDAP server to show that users can be authenticated from a centralised user directory. The typical way to do this is to set up a simple LDAP service like ApacheDS, and then import an LDIF file containing some sample users. Creating this set of users from scratch, especially when we want to customise it to the customer's organisation, is often time-consuming, error-prone and tedious.
This can be made much simpler with ChatGPT. To do this, I started with a simple prompt:
Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
For the email field, it should be of the format
firstname.lastname@interesting.com
Please provide the output in LDAP ldif format.
This generated a list of users, and in a relatively clean LDIF format:
I had specified the format of the email address because, in demos and POCs, the users typically come from the same organisation. However, ChatGPT did not carry that domain over to the DN, which was still set to dc=example,dc=com. I also didn't like that the uid was a number, and the displayName attribute, which I needed for my application, was missing. I was quite surprised that it managed to derive the objectClasses properly, though a few that I needed were missing.
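The three issues above (wrong DC, numeric uid, missing displayName) are easy to check for mechanically. Below is a minimal sketch, assuming a simplified LDIF reader that ignores line folding and base64 values. The function names and the sample entry are hypothetical; the sample is hand-written to mimic the problems described and is not the actual ChatGPT output.

```python
def parse_ldif(text):
    """Split LDIF text into entries, each a list of (attribute, value) pairs.

    Simplified on purpose: no line folding, no base64 ("attr:: value") decoding.
    """
    entries = []
    for block in text.strip().split("\n\n"):
        pairs = []
        for line in block.splitlines():
            if not line.strip() or line.startswith("#") or ":" not in line:
                continue
            name, _, value = line.partition(":")
            pairs.append((name.strip(), value.strip()))
        # Keep only blocks that describe an entry (this skips a "version: 1" header).
        if any(name.lower() == "dn" for name, _ in pairs):
            entries.append(pairs)
    return entries


def check_user_entry(pairs, base_dn, required_attrs=("displayName",)):
    """Return a list of human-readable problems found in one user entry."""
    attrs = {}
    for name, value in pairs:
        attrs.setdefault(name.lower(), []).append(value)

    problems = []
    dn = attrs.get("dn", [""])[0]
    if not dn.lower().endswith(base_dn.lower()):
        problems.append(f"DN does not end with {base_dn}")
    for uid in attrs.get("uid", []):
        if uid.isdigit():
            problems.append(f"uid is numeric: {uid}")
    for name in required_attrs:
        if name.lower() not in attrs:
            problems.append(f"missing attribute: {name}")
    return problems


# Hypothetical stand-in for a generated entry.
SAMPLE = """\
dn: uid=1001,ou=users,dc=example,dc=com
objectClass: inetOrgPerson
uid: 1001
cn: Jane Doe
sn: Doe
mail: jane.doe@interesting.com
"""

problems = check_user_entry(parse_ldif(SAMPLE)[0], "dc=interesting,dc=com")
for problem in problems:
    print(problem)
```

Running a check like this on each regenerated file makes it quicker to tell whether a prompt change actually fixed what it was meant to fix.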
So, I tried to be more prescriptive with a modified prompt:
Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
Please provide the output in LDAP ldif format.
Please use the following as part of the DN: "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The following objectClasses need to be included:
tlsKeyInfo, person, organizationalPerson
This gave the following output:
This was definitely much better, with the fields added in correctly. In my prompt I had deliberately left out the inetOrgPerson objectClass, to see whether ChatGPT would include it anyway or strictly use what I had provided. It did the latter, but I could always add it to the list later.
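To make the target shape concrete, here is a minimal sketch of a user entry that follows the rules in the prompt: uid and mail built from the names, displayName from first and last name, and a configurable objectClass list. The function name, the sample user and the placeholder password are hypothetical. The example call includes inetOrgPerson because, in the standard schema, that is the class that permits mail, displayName and givenName; cn and sn are added because person requires them. The sketch assumes simple names with no characters that need DN escaping.

```python
def user_entry(first, last, base_dn, mail_domain, object_classes, password="changeme"):
    """Render one user as an LDIF entry (hypothetical helper, not ChatGPT output)."""
    uid = f"{first}.{last}".lower()
    lines = [f"dn: uid={uid},{base_dn}"]
    lines += [f"objectClass: {oc}" for oc in object_classes]
    lines += [
        f"uid: {uid}",
        f"cn: {first} {last}",           # required by the person objectClass
        f"sn: {last}",                   # required by the person objectClass
        f"givenName: {first}",
        f"displayName: {first} {last}",
        f"mail: {uid}@{mail_domain}",
        f"userPassword: {password}",     # placeholder, to be replaced before use
    ]
    return "\n".join(lines) + "\n"


entry = user_entry(
    "Jane",
    "Doe",
    base_dn="dc=interesting,dc=com",
    mail_domain="interesting.com",
    object_classes=["person", "organizationalPerson", "inetOrgPerson"],
)
print(entry)
```

Comparing ChatGPT's output against a hand-built entry like this is a quick way to spot attributes that the prompt failed to pin down.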
The next thing I needed was to have some user groups, as it was often nicer to have people from different teams in the organisation appearing in our application to show application features that were available to different groups of users. To do this, I modified the prompt to have ChatGPT generate groups, and randomly assign people into those groups.
Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
Please provide the output in LDAP ldif format.
Please use the following as part of the DN "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The users should belong to the OU: "ou=Users"
The following objectClasses need to be added:
tlsKeyInfo, person, organizationalPerson
The following groups will also need to be created under the
OU "ou=Groups": Sales, Marketing, Services, Engineering
Randomly assign the above users into the groups as members.
I liked the way ChatGPT explained it at the top:
And the groups appeared in their own section with the list of members taken from the generated list of users:
While the new prompt generated the groups and the members, one thing that bothered me was that using the objectClass groupOfUniqueNames and setting uniqueMember fields may not work in some customer environments. I tried to specify to ChatGPT that the LDIF file should be compatible with ApacheDS:
Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
Please provide the output in LDAP ldif format that can be
imported into ApacheDS.
...
Interestingly, the membership information was now attached to the users (using the memberOf attribute) instead of being recorded at the group level:
The issue with using memberOf was that it made it harder to determine who the members of a group are. Some customers prefer to use the member attribute instead of memberOf, so that membership is determined at the group level (only one of the two can be chosen). To do this, I updated the prompt by adding this clause:
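The difference is easy to see in code: with memberOf, listing a group's members means scanning every user and inverting the relationship. A minimal sketch, assuming the user entries have already been read into a dictionary; the function name and the sample DNs are hypothetical.

```python
from collections import defaultdict


def members_by_group(member_of):
    """Invert a user -> [group DNs] mapping into group -> [user DNs]."""
    groups = defaultdict(list)
    for user_dn, group_dns in member_of.items():
        for group_dn in group_dns:
            groups[group_dn].append(user_dn)
    return dict(groups)


# Hypothetical memberOf values as they would appear on the user entries.
member_of = {
    "uid=jane.doe,ou=Users,dc=interesting,dc=com": [
        "cn=Sales,ou=Groups,dc=interesting,dc=com",
    ],
    "uid=john.tan,ou=Users,dc=interesting,dc=com": [
        "cn=Sales,ou=Groups,dc=interesting,dc=com",
        "cn=Engineering,ou=Groups,dc=interesting,dc=com",
    ],
}

for group_dn, user_dns in members_by_group(member_of).items():
    print(group_dn, "->", user_dns)
```

With the member attribute on the group entry, the same answer comes from reading a single entry.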
These groups will use the objectClass groupOfNames
Hence, the final prompt:
Please generate a sample synthetic dataset with 10 rows,
consisting of the following fields:
1. First Name
2. Last Name
3. Email Address
4. Password
Please provide the output in LDAP ldif format.
Please use the following as part of the DN "dc=interesting,dc=com"
For the uid attribute, it should be of the format firstname.lastname
For the email field, it should be of the format
firstname.lastname@interesting.com
For the displayName attribute, use the first name and last name.
The users should belong to the OU: "ou=Users"
The following objectClasses need to be added:
tlsKeyInfo, person, organizationalPerson,inetOrgPerson
The following groups will also need to be created under the
OU "ou=Groups": Sales, Marketing, Services, Engineering
These groups will use the objectClass groupOfNames
Randomly assign the above users into the groups as members.
Now the membership information was tagged at the group level instead of the user:
ChatGPT had also switched the membership attribute accordingly, using member on the group entries instead of memberOf on the users.
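For comparison, the group section can also be produced deterministically. The following is a minimal sketch, assuming each user belongs to exactly one group and that there are at least as many users as groups. Every group is seeded with one user first, because groupOfNames requires at least one member value. The function names and the placeholder user DNs are hypothetical.

```python
import random


def assign_groups(user_dns, group_names, seed=None):
    """Randomly place each user in one group, leaving no group empty."""
    if len(user_dns) < len(group_names):
        raise ValueError("need at least one user per group")
    rng = random.Random(seed)
    shuffled = list(user_dns)
    rng.shuffle(shuffled)
    membership = {name: [] for name in group_names}
    for i, dn in enumerate(shuffled):
        # The first len(group_names) users seed one group each; the rest are random.
        target = group_names[i] if i < len(group_names) else rng.choice(group_names)
        membership[target].append(dn)
    return membership


def group_entry(name, member_dns, base_dn):
    """Render one groupOfNames entry under ou=Groups as LDIF."""
    lines = [
        f"dn: cn={name},ou=Groups,{base_dn}",
        "objectClass: top",
        "objectClass: groupOfNames",
        f"cn: {name}",
    ]
    lines += [f"member: {dn}" for dn in member_dns]
    return "\n".join(lines) + "\n"


base = "dc=interesting,dc=com"
# Placeholder user DNs; in practice these come from the generated user entries.
users = [f"uid=user{i}.example,ou=Users,{base}" for i in range(1, 11)]
groups = ["Sales", "Marketing", "Services", "Engineering"]

membership = assign_groups(users, groups, seed=42)
ldif = "\n".join(group_entry(g, m, base) for g, m in membership.items())
print(ldif)
```

Passing a seed keeps the assignment reproducible between demo runs.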
Of course there was still some work left to clean up the output and replace the passwords, but getting ChatGPT to help certainly made a difference!
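The password clean-up in particular is easy to script. Here is a minimal sketch that rewrites every userPassword line with a freshly generated value, assuming unfolded lines and plain-text passwords in the demo directory. The function name and the sample entries are hypothetical.

```python
import re
import secrets

# Matches "userPassword: value" and the base64 form "userPassword:: value".
PASSWORD_LINE = re.compile(r"^userPassword::?.*$", re.IGNORECASE | re.MULTILINE)


def replace_passwords(ldif_text, make_password=lambda: secrets.token_urlsafe(12)):
    """Return the LDIF text with every userPassword value replaced."""
    return PASSWORD_LINE.sub(
        lambda _match: f"userPassword: {make_password()}", ldif_text
    )


# Hypothetical generated entries with weak sample passwords.
generated = """\
dn: uid=jane.doe,ou=Users,dc=interesting,dc=com
uid: jane.doe
userPassword: password123

dn: uid=john.tan,ou=Users,dc=interesting,dc=com
uid: john.tan
userPassword: hunter2
"""

print(replace_passwords(generated))
```

The make_password parameter is there so a fixed or hashed value can be supplied instead of a random one.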
Parting Shot
I was also curious how it would differ if I asked ChatGPT to generate the LDIF to be compatible with Microsoft Active Directory. I updated the following line in the prompt:
Please provide the output in LDAP ldif format that is compatible
with Microsoft Active Directory.
This generated users with additional attributes, which I recognised as ones used by Microsoft Active Directory (e.g. sAMAccountName, userPrincipalName, givenName).
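These attributes can all be derived from the same first and last names used earlier. A minimal sketch follows; the function name and sample user are hypothetical, and the sAMAccountName is truncated to 20 characters on the assumption that the usual Active Directory limit for user accounts applies.

```python
def ad_attributes(first, last, upn_suffix):
    """Derive Active Directory style attributes from a user's name."""
    uid = f"{first}.{last}".lower()
    return {
        "givenName": first,
        "sn": last,
        "displayName": f"{first} {last}",
        # Assumption: 20 characters is the limit for user accounts.
        "sAMAccountName": uid[:20],
        "userPrincipalName": f"{uid}@{upn_suffix}",
    }


attrs = ad_attributes("Jane", "Doe", "interesting.com")
for name, value in attrs.items():
    print(f"{name}: {value}")
```

Truncation can produce duplicate account names for long or similar names, so the result is worth checking for collisions before import.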
I hope this helps some of you to make your work easier!