I have to run various checks on a dataset, so I created a Multicast. After performing the checks (one check per copy), I merge my 7 multicast outputs using a Union All transformation. The problem I'm having is the duplicate rows created by merging the multicast copies.
How do I get rid of the duplicates? Is the Sort transformation the solution, with the option "Remove rows with duplicate sort values" set to True? I have a unique key by which I can discard the duplicates correctly. Are there any other ways (at the Union All level)? Is there something like the distinction between UNION and UNION ALL in SQL?
I'm working on my first Integration Services project, and it seems that the more I work, the more questions I have. Shouldn't it be the opposite? Thank you for the help.
See if the Aggregate transformation can help you...
This paper also has some suggestions:
http://technet.microsoft.com/en-us/library/aa964137.aspx
Can you avoid creating duplicates in the first place? Perhaps the Conditional Split could be used instead of the Multicast?
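To illustrate why the choice matters, here is a rough Python sketch (not SSIS code) of the row counts involved. A Multicast copies every row to every output, while a Conditional Split sends each row to the first output whose condition matches, with unmatched rows going to a default output. The function names, the sample rows, and the two conditions are all made up for the example, and the split only fits if each row needs just one of the checks.

```python
def multicast(rows, n_outputs):
    """Copy every row to each of n_outputs outputs, as a Multicast does."""
    return [[dict(r) for r in rows] for _ in range(n_outputs)]


def conditional_split(rows, conditions):
    """Route each row to the first output whose condition is true.

    Rows matching no condition land in a default output (the last list).
    """
    outputs = [[] for _ in range(len(conditions) + 1)]
    for row in rows:
        for i, cond in enumerate(conditions):
            if cond(row):
                outputs[i].append(row)
                break
        else:
            outputs[-1].append(row)
    return outputs


def union_all(outputs):
    """Concatenate all outputs without removing anything, like Union All."""
    return [row for out in outputs for row in out]


# Hypothetical sample data and checks, for illustration only.
rows = [{"id": 1, "amount": 10}, {"id": 2, "amount": -5}, {"id": 3, "amount": 0}]

copies = union_all(multicast(rows, 7))
split = union_all(
    conditional_split(
        rows,
        [lambda r: r["amount"] < 0, lambda r: r["amount"] == 0],
    )
)
```

With 3 input rows, the Multicast path hands 21 rows to the union, while the Conditional Split path hands it the original 3, so there is nothing to de-duplicate afterwards.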
I have found Sort to be the best de-duplication option, generally faster than the aggregate, but test it with your data if performance is an issue.
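For anyone unsure what sorting on the unique key and dropping duplicate sort values amounts to, here is a small Python sketch of the idea (again, not SSIS code). The helper name `sort_and_dedupe` and the sample rows are hypothetical. The sketch keeps the first row it sees for each key; it does not try to reproduce which of the duplicate rows the Sort transformation itself keeps.

```python
def sort_and_dedupe(rows, key):
    """Sort rows by `key`, then keep one row per distinct key value."""
    # sorted() is stable, so rows with equal keys keep their input order.
    ordered = sorted(rows, key=lambda r: r[key])
    deduped = []
    for row in ordered:
        # After sorting, duplicates are adjacent: compare with the last kept row.
        if not deduped or deduped[-1][key] != row[key]:
            deduped.append(row)
    return deduped


# Hypothetical Union All output: the same two ids arriving from two copies.
merged = [
    {"id": 2, "check": "a"},
    {"id": 1, "check": "a"},
    {"id": 2, "check": "b"},
    {"id": 1, "check": "b"},
]

unique_rows = sort_and_dedupe(merged, "id")
```

Note that if each copy carries a different check result, keeping a single row per key also throws away the results from the other copies, so this only suits the case where the duplicates are truly interchangeable.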