Set operations of data between Excel columns (intersection, union, and difference)

Judith-Excel-Sharing - Mar 6 - - Dev Community

Sometimes in work, we may encounter situations where two or more sets of data intersect or union, but Excel itself does not support direct set operations. For set operations, a series of functions need to be used to implement them, which is cumbersome and not easy to understand. Here is a good method for doing set operations. By using an Excel plugin SPL, you can directly perform set operations.

1. Set operations of two columns of data

For two columns of data, operations can be directly done with operators (intersection ^, union &, difference ) in SPL, which is simple and clear,

For example:

There is a list of products with the top ten sales figures for January and February:

Sales Table

(1)Intersect

Identify the products which enter top ten in both January and February.

The operation is very simple. Enter the following code in a blank cell of Excel:
=spl("=?1^?2",B2:B11,C2:C11)
As shown in the figure:

Sales Table with code inserted

Then ctrl-enter returns the result directly:

Result Table of intersect coding

Code explanation: =spl() indicates calling the SPL plugin, ?1 and ?2 represents the data parameters involved in the calculation, which refer to B2:B11 and C2:C11 respectively, is the intersection operator, and ?1?2 is to find the intersection of two sets of data.

Similarly, you can also calculate union and difference.

(2)Union

Identify the list of products that have entered the top ten once or more:
=spl("=?1&?2",B2:B11,C2:C11)

Result Table of Union coding

(3)Difference

Identify the list of products that made it to the top ten in January but did not make it to the top ten in February:
=spl("=?1\\?2",B2:B11,C2:C11) <code> </code>

Result Table One of difference coding

Identify the list of products that made it to the top ten in February but did not make it to the top ten in January:
=spl("=?2\\?1",B2:B11,C2:C11)

Result Table Two of difference coding

2. Set operations on multi column data

The set operations of multi column data can also use the writing method of two column data, that is, ?1?2?3……, but when there are many columns, it is still a bit cumbersome to write, and each column needs to pass data once.

Conveniently, SPL also provides functions isect(), union(), and diff() that can directly calculate the intersection, union, and difference of multiple sets of data.

For example:

There is a list of products with sales ranking in the top ten in the first few months of this year (the number of months will increase over time):

Sales Table

(1)Intersection of multiple columns of data

Identify the list of products that have entered top ten in each of the several months.
=spl("=transpose(?1).isect()",B2:E11)
Then ctrl-enter returns the calculation result, and it can be seen that two products have achieved sales in top ten in each of the first four months.

Result table

Code explanation: =spl()indicates calling the SPL plugin, ?1 represents the data parameter involved in the calculation, which refers to B2:E11. After reading in SPL, it will be in the form of a multidimensional matrix [[“Sasquatch Ale”, “Northwoods Cranberry Sauce”, “Aniseed Syrup”,“Chef Anton’s Gumbo Mix”],[“Steeleye Stout”,“Original Frankfurter grüne So?e”,“Chef Anton’s Cajun Seasoning”, “Sasquatch Ale”],……,[“Louisiana Hot Spiced Okra”, “Flotemysost”,“Gnocchi di nonna Alice”, “Escargots de Bourgogne”]] to participate in the calculation.

isect()calculates the intersection of multiple sets of data, as it needs to calculate the intersection of each column of data, therefore, it is necessary to first transpose it with the transpose() function.

This writing method can pass in all data at once, and there is no limit to the number of columns.

Similarly, it can also calculate the union and difference of multiple column data.

(2)Multiple column data union

Identify the products that have made it to the top ten once or more:
=spl("=transpose(?1).union()",B2:E11)
The calculation result is relatively long and has been omitted here.

(3)Multi column data difference

Identify the list of products that made it to the top ten in January but did not make it to the top ten in other months:
=spl("=transpose(?1).diff()",B2:E11)

Result table

Operations not supported in Excel can be easily solved with a plugin!!!

In fact, the functions of SPL go far beyond that. It provides many flexible data processing functions, which can handle various complex Excel operations. The syntax is also simple, and using it to assist Excel calculations can double the work efficiency in seconds.

And SPL also comes with rich reference cases Desktop and Excel Data Processing Cases, where 90% of Excel problems in the workplace can be solved. The code in the book can be basically copied and then used with slight modifications.

SPL download address: esProc Desktop Download

Plugin Installation Method: SPL XLL Installation and Configuration

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .