×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Contact US

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Convert data structure

Convert data structure

Convert data structure

(OP)
Hello C# legends,

I have raw data with a family structure vertical as below

CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID
1      A        XX1          0              A1
2      A        XX2          1              A12
3      A        XX3          2              A345
4      A        XX4          2              A456
5      A        XX5          3              A6788
6      A        XX6          2              A346
7      A        XX7          3              A6789
8      B        XX8          0              B1
9      B        XX9          1              B11 
10     B        X10          2              B345
11     B        X11          2              B567 
12     B        X12          3              B6789 


We want to build parent-children relationships from the above structure to horizontal using ClientID as keys as per below.
The logic is the parent is the closest one level higher hierarchy order by ROW_N.

Hence expected output is below

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456       XX4          2
4      A        A456       XX4        A6788      XX5          3
5      A        A12        XX2        A346       XX6          2
6      A        A346       XX6       A6789       XX7          3
7      B        B1         XX8        B11        XX9          1
8      B        B11        XX9        B345       X10          2
9      B        B11        XX9        B567       X11          2
10     B        B567        X11       B6789      X12          3 

SQL gurus say this is too complex for SQL hence I was wondering whether you guys could help the logic to build in C#?

RE: Convert data structure

Hi peac3,

I don't have MS SQL server, so I used MS Access.
First I created a MS Access database file named peac3.accdb with one table named Table_inp which contains your data.

Then I created C# Console Application which reads the MS Access table and writes out results on the console. The next step would be to write the result into other table.

The logic is similar as in in the awk script posted here:
https://www.tek-tips.com/viewthread.cfm?qid=182923...

Program.cs

CODE

using System;
using System.Collections.Generic;
using System.Data.OleDb;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace db_example
{

    public class Parent
    {
        public string FamilyID;
        public string ParentID;
        public string ParentName;
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            string conn_string = "PROVIDER=Microsoft.ACE.OLEDB.12.0" +
                  "; DATA SOURCE=peac3.accdb" +
                  "; Persist Security Info=False;";
            OleDbConnection dbcon = new OleDbConnection(conn_string);

            string query = "select * from Table_inp";
            OleDbCommand cmd = new OleDbCommand(query);
            cmd.Connection = dbcon;

            dbcon.Open();

            // declare parent dictionary
            var parent_dict = new Dictionary<String, Parent>();

            var reader = cmd.ExecuteReader();
            int count = 0;
            while (reader.Read())
            {
                var HierarchyLevel = (int)reader["HierarchyLevel"];
                var FamilyID = reader["FamilyID"].ToString();
                var ClientID = reader["ClientID"].ToString();
                var CompanyName = reader["CompanyName"].ToString();
                var current_key = FamilyID + "#" + HierarchyLevel.ToString();
                var parent_key = FamilyID + "#" + (HierarchyLevel - 1).ToString();

                if (parent_dict.ContainsKey(parent_key))
                {
                    count++;
                    if (count == 1) {
                        Console.WriteLine(
                            "ROW\tFamID\tParID\tParNam\tChldID\tChldNam\tHier");
                    }
                    var parent_data =
                        parent_dict[parent_key].FamilyID + "\t" +
                        parent_dict[parent_key].ParentID + "\t" +
                        parent_dict[parent_key].ParentName;

                    var out_line =
                        count.ToString("0#") + "\t" +
                        parent_data + "\t" +
                        ClientID + "\t" +
                        CompanyName + "\t" +
                        HierarchyLevel.ToString();

                    Console.WriteLine(out_line);
                }

                // create new parent object from current record
                // to store it into the parrent dictionary
                Parent parent = new Parent
                {
                    FamilyID = FamilyID,
                    ParentID = ClientID,
                    ParentName = CompanyName
                };

                // store current record into parent dictionary
                parent_dict[current_key] = parent;  
            }
            dbcon.Close();
        }
    }
} 
The output is this:


RE: Convert data structure

(OP)
Excellent, you are such a legend.

Quick question, why are you not sorting it by ROW_N or is it already by default?

RE: Convert data structure

Good question smile
Yes, it didn't occur to me because it is default sorted by ROW_N in my MS Access table


So change the query to

CODE

var query = "select * from Table_inp order by ROW_N"; 

RE: Convert data structure

(OP)
Hi Mikrom,

I just realised the CLIENTID can be duplicated as the company name could be similar, we need to show the duplication in the child.
But when we choose the parent, we need to dedup it by the highest match rank.

For example below

CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID MatchRank
1      A        XX1          0              A1
2      A        XX2          1              A12
3      A        XX3          2              A345        
4      A        XX4ABC       2              A456A        2
5      A        XX4ABC       2              A456B        1
6      A        XX4ABC       2              A456C        3
7      A        XX5          3              A6788
8      A        XX6          2              A346
9      A        XX7          3              A6789
10     B        XX8          0              B1
11     B        XX9          1              B11 
12     B        X10          2              B345
13     B        X11          2              B567 
14     B        X12          3              B6789 

expected output would be

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456A       XX4ABC       2
4      A        A12        XX2        A456B       XX4ABC       2
5      A        A12        XX2        A456C       XX4ABC       2
6      A        A456B      XX4ABC     A6788      XX5          3
7      A        A12        XX2        A346       XX6          2
8      A        A346       XX6       A6789       XX7          3
9      B        B1         XX8        B11        XX9          1
10      B        B11        XX9        B345       X10          2
11      B        B11        XX9        B567       X11          2
12     B        B567        X11       B6789      X12          3 

Any idea how to filter by this exception?

RE: Convert data structure

Hi peac3,
I don't understand what you want to achieve.

Where did you get ParentID = A456BBC? We don't have such a name in the input table.

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
...
4      A        A456BBC    XX4        A6788      XX5          3
... 

Why shouldn't A6788 have A456 as its parent when in the input table A456 is its predecessor one level lower?

CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID MatchRank
...
6      A        XX4          2              A456        
7      A        XX5          3              A6788
... 

What does mean match rank and which match rank is lower in this case, 1 or 2 ?

RE: Convert data structure

(OP)
Sorry mikrom,
it was like Sunday after midnight when I typed this, so it has some typos.

We just received a new data set to handle duplicate company names.
In that case, they rank it for us to identify the client who has the biggest deal amount and flag it in MatchRank.
With match rank, then we should identify which client ID should be the parent.

The vanilla rule is still picked up from the higher row with additional exceptions below.
if there are duplicate companies, list them all as children then pick up the highest rank as parent.

I have rectified the sample accordingly

Quote:


Where did you get ParentID = A456BBC? We don't have such a name in the input table.

sorry typo, the parentID should be A456B because this one is the highest match rank (number 1), and the parent name should be XX4ABC -> as per row 6 (I have corrected this)

Quote:


Why shouldn't A6788 have A456 as its parent when in the input table A456 is its predecessor one level lower?

because A456A, A456B and A456C have the same company name hence we need to pick up the highest match rank.

Quote:


What does mean match rank and which match rank is lower in this case, 1 or 2?

lower number is higher hence 1 is the highest...

I have rectified the sample as below


CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID MatchRank
1      A        XX1          0              A1
2      A        XX2          1              A12
3      A        XX3          2              A345        
4      A        XX4ABC       2              A456A        2
5      A        XX4ABC       2              A456B        1
6      A        XX4ABC       2              A456C        3
7      A        XX5          3              A6788
8      A        XX6          2              A346
9      A        XX7          3              A6789
10     B        XX8          0              B1
11     B        XX9          1              B11 
12     B        X10          2              B345
13     B        X11          2              B567 
14     B        X12          3              B6789 

expected output would be

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456A       XX4ABC       2
4      A        A12        XX2        A456B       XX4ABC       2
5      A        A12        XX2        A456C       XX4ABC       2
6      A        A456B      XX4ABC     A6788      XX5          3
7      A        A12        XX2        A346       XX6          2
8      A        A346       XX6       A6789       XX7          3
9      B        B1         XX8        B11        XX9          1
10      B        B11        XX9        B345       X10          2
11      B        B11        XX9        B567       X11          2
12     B        B567        X11       B6789      X12          3 

RE: Convert data structure

So, when saving into the parrent dictionary you also have to take MatchRank into account.

RE: Convert data structure

(OP)

Quote:


So, when saving into the parrent dictionary you also have to take MatchRank into account.

Correct, but I am not quite sure whether I understand your logic codingly,
pseudo-code or example would be nice :)

RE: Convert data structure

I tried this
Input:


Output:


Program Code:

CODE

using System;
using System.Collections.Generic;
using System.Data.OleDb;
using System.Linq;
using System.Runtime.InteropServices.ComTypes;
using System.Security.Policy;
using System.Text;
using System.Threading.Tasks;

namespace db_read_example
{
    public class Parent
    {
        public string FamilyID;
        public string ParentID;
        public string ParentName;
        public string MatchRank;
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            var db_file = "c:\\00_mikrom\\Work\\C#\\peac3.accdb";
            var conn_string = "PROVIDER=Microsoft.ACE.OLEDB.12.0" +
                  "; DATA SOURCE=" + db_file +
                  "; Persist Security Info=False;";
            OleDbConnection dbcon = new OleDbConnection(conn_string);

            var query = "select * from Table_inp order by ROW_N";
            OleDbCommand cmd = new OleDbCommand(query);
            cmd.Connection = dbcon;

            dbcon.Open();

            // declare parent dictionary
            var parent_dict = new Dictionary<String, Parent>();

            var reader = cmd.ExecuteReader();
            int count = 0;
            while (reader.Read())
            {
                var HierarchyLevel = (int)reader["HierarchyLevel"];
                var FamilyID = reader["FamilyID"].ToString();
                var ClientID = reader["ClientID"].ToString();
                var CompanyName = reader["CompanyName"].ToString();
                var current_key = FamilyID + "#" + HierarchyLevel.ToString();
                var parent_key = FamilyID + "#" + (HierarchyLevel - 1).ToString();
                var MatchRank = reader["MatchRank"].ToString();

                // check current match rank
                bool current_matchrank_numeric = false;
                int current_matchrank = 0;
                if (!string.IsNullOrWhiteSpace(MatchRank))
                {
                    current_matchrank_numeric =
                    int.TryParse(MatchRank, out current_matchrank);
                }

                if (parent_dict.ContainsKey(parent_key))
                {
                    count++;
                    if (count == 1)
                    {
                        // write header
                        Console.WriteLine("ROW\tFamID\tParID\tParNam\t" +
                            "ChldID\tChldNam\tHier");
                    }

                    var parent_data =
                        parent_dict[parent_key].FamilyID + "\t" +
                        parent_dict[parent_key].ParentID + "\t" +
                        parent_dict[parent_key].ParentName;

                    var out_line =
                        count.ToString("0#") + "\t" +
                        parent_data + "\t" +
                        ClientID + "\t" +
                        CompanyName + "\t" +
                        HierarchyLevel.ToString();

                    Console.WriteLine(out_line);
                }

                // create new parent object from current record
                // to store it into the parrent dictionary
                Parent parent = new Parent
                {
                    FamilyID = FamilyID,
                    ParentID = ClientID,
                    ParentName = CompanyName,
                    MatchRank = MatchRank
                };

                // store current record into parent dictionary
                if (parent_dict.ContainsKey(current_key))
                {
                    if (!string.IsNullOrWhiteSpace(parent_dict[current_key].MatchRank))
                    {
                        bool previous_matchrank_numeric = false;
                        int previous_matchrank = 0;
                        previous_matchrank_numeric =
                             int.TryParse(parent_dict[current_key].MatchRank,
                                          out previous_matchrank);
                        if (current_matchrank <= previous_matchrank)
                        {
                            parent_dict[current_key] = parent;
                        }
                    }
                    else
                    {
                        parent_dict[current_key] = parent;
                    }
                }
                else
                {
                    parent_dict[current_key] = parent;
                }
            }
            dbcon.Close();
        }
    }
} 

RE: Convert data structure

(OP)
Thanks Mikrom,

You are such a legend!

After discussion with our stakeholders, if the client names are duplicated, they do not want to pick based on the biggest deal only (match rank), there are other considerations.
Hence solution-wise, I will build a web front end for them to cherry-pick manually and it will be captured into our reference table.

Yes, they changed their mind :(

So I have built the reference table as simple as below.

CODE

FamilyID ClientName ClientID
A          XX4ABC    A456B 

Hence for exception cases as in duplicate names, this reference table above will determine uniquely which parent should be (replace the match rank logic).
So when saving into the parent dictionary we no longer need to Match Rank into account but we take it from the reference table instead.

How is the best way to incorporate it into your c# code above?

Thank you so much in advance,

RE: Convert data structure

If you have changed the basic data structure, you will not be able to use the program I posted above. You then have to solve it in your own way.

RE: Convert data structure

(OP)
Sorry mikrom, I was throwing a solution after I was talking to my stakeholders and yesterday tek-tips was down.

No, the basic data structure does not change. I just need to merge my reference table into the main table.
Hence to handle duplicate names, I just need to replace the column MatchRank with UserManualFlag as below.


CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID UserManualFlag
1      A        XX1          0              A1
2      A        XX2          1              A12
3      A        XX3          2              A345        
4      A        XX4ABC       2              A456A        
5      A        XX4ABC       2              A456B        Y
6      A        XX4ABC       2              A456C        
7      A        XX5          3              A6788
8      A        XX6          2              A346
9      A        XX7          3              A6789
10     B        XX8          0              B1
11     B        XX9          1              B11 
12     B        X10          2              B345
13     B        X11          2              B567 
14     B        X12          3              B6789 


expected output would be the same as before

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456A       XX4ABC       2
4      A        A12        XX2        A456B       XX4ABC       2
5      A        A12        XX2        A456C       XX4ABC       2
6      A        A456B      XX4ABC     A6788      XX5          3
7      A        A12        XX2        A346       XX6          2
8      A        A346       XX6       A6789       XX7          3
9      B        B1         XX8        B11        XX9          1
10      B        B11        XX9        B345       X10          2
11      B        B11        XX9        B567       X11          2
12     B        B567        X11       B6789      X12          3 

So, when saving into the parent dictionary, how the best way to incorporate UserManualFlag='Y' into account

Thank you Sir

RE: Convert data structure

With the UserManualFlag it seems to be a modification of MatchRank.

RE: Convert data structure

This seems to work.

Program Code:

CODE

using System;
using System.Collections.Generic;
using System.Data.OleDb;

namespace db_read_example
{
    public class Parent
    {
        public string FamilyID;
        public string ParentID;
        public string ParentName;
        public string UserManualFlag;
    }

    internal class Program
    {
        static void Main(string[] args)
        {
            var db_file = "c:\\00_mikrom\\Work\\C#\\peac3.accdb";
            var conn_string = "PROVIDER=Microsoft.ACE.OLEDB.12.0" +
                  "; DATA SOURCE=" + db_file +
                  "; Persist Security Info=False;";
            OleDbConnection dbcon = new OleDbConnection(conn_string);

            var query = "select * from Table_inp order by ROW_N";
            OleDbCommand cmd = new OleDbCommand(query);
            cmd.Connection = dbcon;

            dbcon.Open();

            // declare parent dictionary
            var parent_dict = new Dictionary<String, Parent>();

            var reader = cmd.ExecuteReader();
            int count = 0;
            var previous_key = "";
            while (reader.Read())
            {
                var HierarchyLevel = (int)reader["HierarchyLevel"];
                var FamilyID = reader["FamilyID"].ToString();
                var ClientID = reader["ClientID"].ToString();
                var CompanyName = reader["CompanyName"].ToString();
                var current_key = FamilyID + "#" + HierarchyLevel.ToString();
                var parent_key = FamilyID + "#" + (HierarchyLevel - 1).ToString();
                var UserManualFlag = reader["UserManualFlag"].ToString();

                if (parent_dict.ContainsKey(parent_key))
                {
                    count++;
                    if (count == 1)
                    {
                        // write header
                        Console.WriteLine("ROW\tFamID\tParID\tParNam\t" +
                            "ChldID\tChldNam\tHier");
                    }

                    var parent_data =
                        parent_dict[parent_key].FamilyID + "\t" +
                        parent_dict[parent_key].ParentID + "\t" +
                        parent_dict[parent_key].ParentName;

                    var out_line =
                        count.ToString("0#") + "\t" +
                        parent_data + "\t" +
                        ClientID + "\t" +
                        CompanyName + "\t" +
                        HierarchyLevel.ToString();

                    Console.WriteLine(out_line);
                }

                // create new parent object from current record
                // to store it into the parrent dictionary
                Parent parent = new Parent
                {
                    FamilyID = FamilyID,
                    ParentID = ClientID,
                    ParentName = CompanyName,
                    UserManualFlag = UserManualFlag
                };

                // store current record into parent dictionary
                if (parent_dict.ContainsKey(current_key) && (previous_key == current_key))
                {
                    if (parent_dict[current_key].UserManualFlag.Trim().ToUpper() != "Y")
                    {
                        parent_dict[current_key] = parent;
                    }
                }
                else
                {
                    parent_dict[current_key] = parent;
                }
                previous_key = current_key; 
            }
            dbcon.Close();
        }
    }
} 

Input:



Output:

RE: Convert data structure

(OP)
Fantastic, you are such a legend.
I tried for a few companies, it works... let me try it with hundreds thousands of companies

RE: Convert data structure

(OP)
Hi Mikrom,

We need some little enhancement in your codes.
if there was some not duplicated value after the duplicated, it should be the parents the one right above it, not the one with Manual Flag = 'y'

As per below example

Data like below

CODE

ROW_N FamilyID CompanyName HierarchyLevel ClientID UserManualFlag
1      A        XX1          0              A1
2      A        XX2          1              A12
3      A        XX3          2              A345        
4      A        XX4ABC       2              A456A        
5      A        XX4ABC       2              A456B        Y
6      A        XX4ABC       2              A456C        
7      A        XX4-1       2              A456D        
8      A        XX5          3              A6788
9      A        XX6          2              A346
10      A        XX7          3              A6789
11     B        XX8          0              B1
12     B        XX9          1              B11 
13     B        X10          2              B345
14     B        X11          2              B567 
15     B        X12          3              B6789 



your code will produce below, to dedup the duplicated, however, your codes won't detect after duplication there is a singular value which is (A456D)

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456A       XX4ABC       2
4      A        A12        XX2        A456B       XX4ABC       2
5      A        A12        XX2        A456C       XX4ABC       2
6      A        A12        XX2        A456D       XX4-1       2
7      A        A456B      XX4ABC     A6788      XX5          3
8      A        A12        XX2        A346       XX6          2
9      A        A346       XX6       A6789       XX7          3
10      B        B1         XX8        B11        XX9          1
11      B        B11        XX9        B345       X10          2
12      B        B11        XX9        B567       X11          2
13     B        B567        X11       B6789      X12          3 


The expected output would be below

CODE

ROW_N FamilyID ParentID ParentName ChildrenID ChildrenName HierarchyLevel 
1      A        A1         XX1        A12        XX2          1
2      A        A12        XX2        A345       XX3          2
3      A        A12        XX2        A456A       XX4ABC       2
4      A        A12        XX2        A456B       XX4ABC       2
5      A        A12        XX2        A456C       XX4ABC       2
6      A        A12        XX2        A456D       XX4-1       2
7      A        A456D      XX4-1      A6788      XX5          3
8      A        A12        XX2        A346       XX6          2
9      A        A346       XX6       A6789       XX7          3
10      B        B1         XX8        B11        XX9          1
11      B        B11        XX9        B345       X10          2
12      B        B11        XX9        B567       X11          2
13     B        B567        X11       B6789      X12          3 

Any idea on how to enhance this?

Thanks in advance

RE: Convert data structure

Hi peac3,

If you remove the Y flag from the 5th rows, then the result will be ok.




RE: Convert data structure

(OP)

Quote:


If you remove the Y flag from the 5th rows, then the result will be ok.


I know, but I need to keep that flag because the company name may be the parents in different families and that flag is to dedup which one is the right ClientID.

I think I have found a solution by comparing it with the previous name...

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login


Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close