From Data Wrangling to Business Computing: July 2019

Monday, July 29, 2019

CSV file with multiples tables, how to get the middle table only

【Source】
https://stackoverflow.com/questions/57095143/python3-7-csv-file-with-multiples-tables-how-to-get-the-middle-table-only

【Problem】
I have a CSV file with multiples headers and tables which was created by our system. Number of rows are dynamics but the table title name is always the same. Between each table, there's a blank row. I'm using python3.7.3 and want to get the middle table(Device table) then upload to our database.

How can I do to get the middle table only? Can regex work with CSV file in this case?

Original file:

    Report title:ABC
    Created Date:Jul-15-2019

    Model
    Model Name,Number
    abc,1
    abc,2

    Device
    Device Name,Number
    efg,1
    efg,2
    efg,3

    Missing Device
    Device Name,Number
    xyz,3
    xyz,4

The table I want to have(without table name):

    Device Name,Number
    efg,1
    efg,2
    efg,3

【Answer】

	A	B	C
1	=file("multiples-tables.csv").read@n()
2	=create(DeviceName,Number)
3	for A1.group@i(~=="")
4		if A3.m(2)=="Device"
5			=A3.to(4,)
6			>C5.run(A2.insert(0,string(C5.~.array().m(1)),C5.~.array().m(2)))

A1：Read in sequence by line
A2：Create sequence table
A3：Grouping in empty rows
B4：Determine whether the second item of the intra-group sequence is “Device”
C5：Interception of required data items
C6：Insert data item to order table

【Summary】
CSV files often contain a variety of table structures. How to quickly extract certain types of data, esProc gives a general and simple method. For more examples of processing, please refer to Qian College - esProc - text computing.

Accumulate in order according to the parent ID

【Problem】
https://forums.opentext.com/forums/discussion/183667/how-to-use-one-data-set-output-to-create-another-data-set-dynamically

I have one data set as below

final result should be like below. each deal should get the accumulated Debit and credit based on it’s parent deal id.

【Answer】
Idea: The questioner does not explain whether the original table records are inserted in chronological order or not. From the sample data, it seems that they are inserted in chronological order. Therefore, it is only necessary to traverse the sample data in order to construct a cumulative table, insert the non-parent DealID directly into the cumulative table, and insert the parent DealID after accumulating with the previous cumulative record.

	A	B	C
1	=file("C:/Users/debit.txt").import@t()
2	=create(DealID,Credit,Debit)
3	for A1
4		if A2.select(DealID==A3.OriginalDeal).count(DealID)==0
5			=A2.insert(0,A3.DealID,A3.Accounted,A3.AccountedDebit)
6		esle
7			=A2.select@1(DealID==A3.OriginalDeal)
8			=A2.insert(0,A3.DealID,A3.Accounted+C7.Credit,A3.AccountedDebit+C7.Debit)

After running:
A1:

A2:

Monday, July 29, 2019

CSV file with multiples tables, how to get the middle table only

Accumulate in order according to the parent ID

Blog Archive