How to subtract two sets of data with unequal spacings?

zqhu · June 10, 2020, 11:30pm

Hi everybody,

I have a data-processing problem in Python. Suppose we have two sets of data (x1, y1 and x2, y2) with different sample points and spacing. For example:

x1=[4, 3, 2.5, 2, 1, -1, -1.5, -2, -4]
y1=[1, 0.8, 0.7, 0.5, 0.3, 0, -0.3, -0.6, -0.9]
x2=[5, 3, 2, 1, -1, -1.5, -5]
y2=[2, 1.8, 1, 0, 0.2, -0.5, -1.5, 2.5]

How do I subtract y1 from y2?

y3=y2-y1

So far I have tried interpolation to first get y1_new, which represents y1 evaluated at the x2 points:

y1_new = np.interp(x2, x1, y1)
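One detail worth checking with this approach: np.interp expects the sample x-coordinates (its second argument) to be increasing, while the x values in the question are listed in descending order. A minimal sketch of interpolating a baseline onto the other grid and subtracting, using small made-up arrays (not the poster's data) and sorting first:

```python
import numpy as np

# Illustrative data only (an assumption, not the data from the question).
# The x values are deliberately in descending order, as in the question.
x1 = np.array([4.0, 2.0, 0.0, -2.0, -4.0])
y1 = np.array([1.0, 0.5, 0.0, -0.5, -1.0])   # baseline
x2 = np.array([3.0, 1.0, -1.0, -3.0])
y2 = np.array([2.0, 1.0, 0.0, -1.0])

# np.interp needs increasing sample points, so sort (x1, y1) together first.
order = np.argsort(x1)
y1_on_x2 = np.interp(x2, x1[order], y1[order])

# Both arrays now live on the x2 grid, so element-wise subtraction is defined.
y3 = y2 - y1_on_x2
```

Here every x2 point lies inside the range of x1, so no extrapolation is involved.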

Thanks in advance.

JeremyLT · June 10, 2020, 11:32pm

What do you want y1 - y2 to mean in this context? Subtraction between two vectors of different lengths is not, in general, well defined.

zqhu · June 10, 2020, 11:41pm

Hi Jeremy,

Actually y1 is considered as baseline, which I want to subtract from y2.

JeremyLT · June 10, 2020, 11:43pm

Sure, but what does it mean when you have different numbers of values in each array? Those arrays don’t appear to be linearly spaced, so it’s hard to see a clean way to do this.

Are you trying to find the difference between two lines/curves in some sense?

zqhu · June 10, 2020, 11:49pm

The data could be presented in a better way:
y1 = f(x1)
y2 = f(x2)

Exactly, I want to find the difference between two lines.

JeremyLT · June 11, 2020, 12:00am

The problem here is that you have an extrapolation problem, which is inherently messy because you’re basically making wild guesses about the sampled function outside of the domain.

The only way I can think of to do this cleanly would be to use np.interp() as you have above, mapping the values from the larger domain (x2) onto the smaller domain (x1).

y2_smaller_domain = np.interp(x1, x2, y2)
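As a rough sketch of that idea, the helper below (the name difference_on_common_domain and its arguments are hypothetical, not from any library) interpolates one data set onto the other's x points, keeps only the points that fall inside the interpolated set's range, and subtracts. It assumes 1-D inputs and sorts the interpolated set because np.interp expects increasing sample points:

```python
import numpy as np

def difference_on_common_domain(x_a, y_a, x_b, y_b):
    """Return (x, y_b - y_a) at the points of x_a that lie inside x_b's range.

    Hypothetical helper for illustration: (x_b, y_b) is linearly
    interpolated onto x_a, and points of x_a outside [min(x_b), max(x_b)]
    are dropped so that no extrapolation takes place.
    """
    x_a, y_a = np.asarray(x_a, float), np.asarray(y_a, float)
    x_b, y_b = np.asarray(x_b, float), np.asarray(y_b, float)

    # Sort the set being interpolated; np.interp needs increasing x.
    order = np.argsort(x_b)
    x_b, y_b = x_b[order], y_b[order]

    # Keep only the x_a points covered by x_b's domain.
    inside = (x_a >= x_b[0]) & (x_a <= x_b[-1])
    y_b_on_a = np.interp(x_a[inside], x_b, y_b)
    return x_a[inside], y_b_on_a - y_a[inside]
```

With the names from the thread, a call would look like x, y3 = difference_on_common_domain(x1, y1, x2, y2), provided x2 and y2 have the same length.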

There are also some more advanced interpolation functions (which interpolant is best depends on the context of the data).

You could do the reverse, mapping the values from the smaller domain (x1) onto the larger domain (x2), but you would have to decide how you want to ‘guess’ that your function behaves outside of its domain, which is very difficult. Extrapolation always has more error than interpolation.

JeremyLT · June 11, 2020, 12:07am

Edit: It looks like scipy.interpolate.interp1d has an extrapolation capability: https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.interp1d.html

The same caveats from above about how error-prone extrapolation is still apply. There is a reason ‘extrapolate wildly’ is used as a term to criticize.
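To make the caveat concrete, here is a small sketch with made-up numbers (not the data from the question) that evaluates interp1d outside the sampled range with two different kinds. The choice of kind is an assumption you make about the function beyond its data, and the two results disagree at the out-of-range points:

```python
import numpy as np
from scipy.interpolate import interp1d

# Illustrative samples only; x2 reaches beyond the range of x1 on both sides.
x1 = np.array([-4.0, 0.0, 4.0])
y1 = np.array([-1.0, 0.0, 1.0])
x2 = np.array([-5.0, -3.0, 3.0, 5.0])

# fill_value="extrapolate" lets the interpolant be evaluated outside [-4, 4].
linear = interp1d(x1, y1, kind="linear", fill_value="extrapolate")
nearest = interp1d(x1, y1, kind="nearest", fill_value="extrapolate")

y_lin = linear(x2)    # continues the end segments as straight lines
y_near = nearest(x2)  # holds the value of the closest sample

# Size of the disagreement between the two guesses at each x2 point.
disagreement = np.abs(y_lin - y_near)
```

Neither answer is "right" outside the sampled range; the data alone cannot tell you which behaviour is closer to the truth.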

zqhu · June 11, 2020, 2:42pm

Thanks for your explanations, Jeremy! My data set is large (around 5000 points) and the two sets of x values differ only slightly, so I think extrapolation/interpolation is reasonable.
I have now figured it out using the following code:

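import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate

# Nearest-neighbour interpolant of the baseline (x1, y1); extrapolates outside its range
f = interpolate.interp1d(x1, y1, kind='nearest', fill_value="extrapolate")
y_ = f(x2)  # baseline evaluated at the x2 points
y = np.asarray(y2) - y_  # baseline-subtracted data on the x2 grid
plt.plot(x2, y)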

zqhu · June 11, 2020, 2:45pm

It’s the answer, thanks so much!

JeremyLT · June 11, 2020, 2:56pm

As long as the bit of error is acceptable in your application, that’s what matters in the end. I’m glad I could help!
