Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization for multi-valued LDAP attributes such as groups uniqueMember #255

Closed
FlagonC opened this issue Dec 8, 2023 · 12 comments · Fixed by #285
Closed

Optimization for multi-valued LDAP attributes such as groups uniqueMember #255

FlagonC opened this issue Dec 8, 2023 · 12 comments · Fixed by #285
Milestone

Comments

@FlagonC
Copy link

FlagonC commented Dec 8, 2023

When configuring LSC synchronization, it is currently not possible to optimize modifications to be able to make add and delete modifications for multivalued attributes containing many entries, such as groups with uniqueMember.
This can lead to slowness when used with attributes containing thousands of entries.

For exemple, i have a test group contenning multiples users:

dn: cn=test,ou=group,dc=test
uniqueMember: uid=test01,ou=people,dc=test
uniqueMember: uid=test02,ou=people,dc=test
uniqueMember: uid=test03,ou=people,dc=test

If i add the user test04, LSC will make the modification like this:

dn: cn=test,ou=group,dc=test
changetype: modify
replace: uniqueMember
uniqueMember: uid=test01,ou=people,dc=test
uniqueMember: uid=test02,ou=people,dc=test
uniqueMember: uid=test03,ou=people,dc=test
uniqueMember: uid=test04,ou=people,dc=test

The most optimized solution would be to do the following for large groups

dn: cn=test,ou=group,dc=test
changetype: modify
add: uniqueMember
uniqueMember: uid=test04,ou=people,dc=test

https://github.com/lsc-project/lsc/issues/70
We can do add by following this function, but not for delete

@LiquidFenrir
Copy link
Contributor

Hello @coudot, I have fixes/improvements for this implemented but not uploaded/PR'd yet, I'd like to know which one would be preferred as there are two, each with pros and cons.

  1. "optimizing" the FORCE policyType in the case of a REPLACE_VALUES operation in BeanComparator.java to use smaller ADD and DELETE modifications instead of a complete REPLACE.
    Leverages SetUtils.findMissingNeedles in the opposite direction of MERGE to find the "excess" elements in the destination compared to the source.

  2. Involves adding two policyTypes in the XSD schema, DIFFERENCE and KEEP_SOURCE (names unimportant, but tries to make the meaning obvious), also in case of a REPLACE_VALUES operation:

    • DIFFERENCE is the set difference operation: output = destination - source. Applying the DELETE modification to this output element set, makes this the opposite of the MERGE policyType: it removes excess elements in the destination (whereas MERGE adds the missing elements in the destination).
    • KEEP_SOURCE is the opposite of KEEP (which keeps the destination untouched, as I understand it): it makes the destination a synchronization of the source, working like the optimized FORCE (or like a combination of MERGE and DIFFERENCE).

Both are tested to work the same, at least for the uniqueMember example provided.

I like the first one less because it changes the current expected behaviour, and may break some people's usage, but it's a smaller code change.
The second one seems better, it adds features that might be useful and can't risk affecting users' workflows.

I can publish both branches to compare/see the changes, and PR the accepted one. Or I can wait until the community/maintainers decide one idea is better than the other and publish only that branch.
Up to you, thanks in advance!

@coudot
Copy link
Member

coudot commented Dec 18, 2023

I've no strong opinion on this but we should avoid to have a too complex configuration. Maybe there is no need to keep the old behavior (replace all values) and just use the new way (delete/add). Or keep the replace mode only it there is only one value in source and destination.

Maybe @davidcoutadeur @soisik @rouazana can comment on this.

@LiquidFenrir
Copy link
Contributor

Thanks for the quick reply, I've pushed the changes for the optimized FORCE policy at https://github.com/INTM-Group/lsc/tree/opti-force-policy, you can review them and make a PR directly from our branch, afaik. Otherwise, let me know and it can be done on our side easily, or maybe we should wait for the other maintainers' comments.
And it is a good idea to keep the replace mode when it is better than a ton of deletes and adds, which resulted in the second commit. Thanks for the insight.

@rouazana
Copy link
Contributor

Hello,

This looks great, thank you for your contribution!

Two remarks:

I opened the MR for more discussion here: #259

@SchaffnerMi
Copy link

Hello,

Thank you for your work. Would it be possible to test this as well? Is it possible to obtain a package for installation under ubuntu?

Thanks,
Michel

@SchaffnerMi
Copy link

Hello,

We tried to package the application and tested it on our test platform. Unfortunately this doesn't seem to work. The contents of the member "attribute" is completely replaced as before. We set the policy to FORCE for the attribute, do we need to do something more?

Thanks for your feedback,
Michel

@davidcoutadeur
Copy link
Contributor

Thanks for the quick reply, I've pushed the changes for the optimized FORCE policy at https://github.com/INTM-Group/lsc/tree/opti-force-policy, you can review them and make a PR directly from our branch, afaik. Otherwise, let me know and it can be done on our side easily, or maybe we should wait for the other maintainers' comments. And it is a good idea to keep the replace mode when it is better than a ton of deletes and adds, which resulted in the second commit. Thanks for the insight.

Hello,

Thanks again for your contribution!

I have read the PR "1. optimizing" the FORCE policyType". This optimization seems really interesting. I have no particular remark on the code itself.
I have no such use case currently, but maybe there might be some use cases where we want to force replacement. For example if we want to force a memberOf or member integrity process by our directory?
If we want to have this option, this requires to have an option to force the (missingValues.size() + extraValues.size()) >= toSetAttrValues.size() condition.

I didn't understand the way the second solution is intended to work ("2. Involves adding two policyTypes in the XSD schema, DIFFERENCE and KEEP_SOURCE"). Could you show the source code or explain how the DIFFERENCE and KEEP_SOURCE are used?

@davidcoutadeur
Copy link
Contributor

Hello,

We tried to package the application and tested it on our test platform. Unfortunately this doesn't seem to work. The contents of the member "attribute" is completely replaced as before. We set the policy to FORCE for the attribute, do we need to do something more?

Thanks for your feedback, Michel

Did you check you had less modifications than the number of values in the source attribute?

@SchaffnerMi
Copy link

In fact we tested on a group of more than 100,000 members where there are less than 10 changes.
Maybe we have a problem with our version of LSC which we compiled with some difficulty.

@LiquidFenrir
Copy link
Contributor

LiquidFenrir commented Jan 11, 2024

Hello,

I didn't understand the way the second solution is intended to work ("2. Involves adding two policyTypes in the XSD schema, DIFFERENCE and KEEP_SOURCE"). Could you show the source code or explain how the DIFFERENCE and KEEP_SOURCE are used?

The second solution was to add a policyType DIFFERENCE that does delete operations on elements in the destination but not the source (like the opposite of MERGE), and a policyType KEEP_SOURCE would be the equivalent of a DIFFERENCE and a MERGE in one operation (acting just like the optimized FORCE of the first commit of the PR).
I still have the patch locally and can upload the diff file if you wish to see the code, or even make a separate branch.

They would be used for granularity/more control, but I'm not sure it's that big of a use.

I have no such use case currently, but maybe there might be some use cases where we want to force replacement. For example if we want to force a memberOf or member integrity process by our directory?

I have to admit I'm not well versed in LDAP/AD workings at all, I kind of just made the changes and tested them on a few minimal examples (with a main groupOfUniqueNames object and uniqueMembers getting add/delete/replace'd to be as close to the issue as possible). It's highly probable that there are edge cases where this isn't advisable, or even break compatibility, I'll resort to your (the main contributors') advice/knowledge on this.

@SchaffnerMi
Copy link

SchaffnerMi commented Jan 11, 2024

Hello,

According to the logs the problem seems to come from that the new value and the old value are the same:

Jan 11 11:00:22 - DEBUG - In object "CN=GG_ARC,ou=structures,ou=demo,ou=groups,dc=ad-dev,dc=demo,dc=org": Replacing attribute "member": source values are [uid=toto,ou=demo,ou=people,o=annuaire, uid=alice,ou=demo,ou=people,o=annuaire], old values were [CN=alice,OU=employee,OU=demo,OU=people,DC=ad-dev,DC=demo,DC=org, CN=toto,OU=employee,OU=demo,OU=people,DC=ad-dev,DC=demo,DC=org], new values are [CN=toto,OU=employee,ou=demo,ou=people,dc=ad-dev,dc=demo,dc=org, CN=alice,OU=employee,ou=demo,ou=people,dc=ad-dev,dc=demo,dc=org]

We use code from this page to update members: https://lsc-project.org/documentation/latest/synchronizegroups.html

when we delete a value in member in the LDAP and we do a new sync all value are rewrite into de AD with the exception of the deleted entry like:

old values were (42 entries)
a.castagna
adelaunay
alexandraweber
amandineperret
amarchierjamet
...

new values are (41 entries)
adelaunay
alexandraweber
amandineperret
amarchierjamet
...

also the behavior of not returning everything when there is only one modification on a list of 42 entries does not seem to work

The structure of the source values (LDAP) are not the same with the AD structure, is this why it doesn't work?
source value are uid=XXXX,,ou=demo,ou=people,o=annuaire
and
New value are= CN=XXXX,OU=employee,ou=demo,ou=people,dc=ad-dev,dc=demo,dc=org

@davidcoutadeur
Copy link
Contributor

This PR is replaced by #285

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment