Java Buzz Forum - Polynomial Trendlines

Articles |
News |
Weblogs |
Books |
Forums

Artima Forums | Articles | Weblogs | Java Answers | News

Sponsored Link •

Java Buzz Forum
Polynomial Trendlines

0 replies on 1 page.

Welcome Guest
Sign In

Back to Topic List

Reply to this Topic

Search Forum

Threaded View


Previous Topic		Next Topic

Flat View: This topic has 0 replies on 1 page

Tim Vernum

Posts: 58
Nickname: tpv
Registered: Dec, 2002

Tim Vernum is passionate about designing secure systems

Polynomial Trendlines

Posted: Mar 28, 2011 6:41 PM

This post originated from an RSS feed registered with Java Buzz by Tim Vernum.
Original Post: Polynomial Trendlines Feed Title: A word used to describe something Feed URL: http://blog.adjective.org/feed/category/Development/rss2 Feed Description: Tim's random musings on software development, java, jetty and other such frivolities.	Latest Java Buzz Posts Latest Java Buzz Posts by Tim Vernum Latest Posts From A word used to describe something

I recently had to generate a polynomial trendline for some data I had. Microsoft Excel can do it (as can some other applications) but I needed to do it "by hand" (actually I did it in Java...)

Partly for my own benefit, and partly for the benefit of the ~~google~~ Duck Duck Go, I'm going to write down how I did it.

So, let's start at the beginning.

Why a trendline

I had plotted a line graph of the data I had. What I needed was a curve that approximated the line. There are 2 purposes for that

To demonstrate what the overall pattern of the data is - is the line going up or down? Does it cycle or is it moving in 1 consistent direction?
To make (loose) predictions about what will happen in the future. If the line is going down, when does it hit zero? If it's going up, at what rate is it moving?

What I needed

What we need then, is a formula f(x) that produces a y value. So that, for any given x (horizontal) value, we can generate a y value. Where the x value is in the range of the data we already have, we want the f(x) to produce a y value that is close to the data we have (i.e. we minimises the error).

A polynomial trendline

For a polynomial trendline, we want to find a formula in the form: a{0} + a{1}x + a{2}x^2 + ... + a{n-1}x^n-1 + a{n}x^n = y (Where a{n} is the nth coefficient and x^n is x raised to the power n)

So, a polynomial of order 2 would be: a + (b * x) + (c * x * x) = y
We need to find out the "best" values for a, b, and c, so that we plug in a value for x and determine an approximation for y.

Finding the coefficients

For a polynomial of order n, we have n+1 coefficients to work out.
Because we have n+1 unknowns, we need to use n+1 different points from our data in order to find a formula.

So, to find the formula for an order 2 polynomial, we need 3 points from our data to help us find a, b, and c. Let's do a simple example by hand, and then we can look at how to solve the general case.

Let's assume our three points are (0,0), (1,2), (2,1). We use those values for x and y to produce 3 simultaneous equations:

a + b * 0 + c * 0 = 0
a + b * 1 + c * 1 = 2
a + b * 2 + c * 4 = 1

Simplifying, we get

a = 0
a + b + c = 2
a + 2b + 4c = 1

Which means

a = 0
b = 2 - c
b = (1 - 4c)/2

Thus

2 - c = (1-4c)/2
4 - 2c = 1 - 4c
2c = -3
c = -3/2

And

b = 2 - -3/2
b = 7/2

a=0
b=7/2
c=-3/2

And our final formula is (7x)/2 - (3x^2)/2 = y

So, we can predict that if x = 3, then y = (7*3)/2 - (3*3*3)/2 = 21/2 - 27/2 = -3

The general case

The general case for an order n polynomial, with n+1 unknowns, and n+1 sample points, is a series of n+1 simultaneous equations with n+1 coefficients.
We can solve that using Cramer's Rule.

In Java, I used JAMA to implement Cramer's rule, which is actually quite easy.
For a set of data points (x{0}, y{0}) , (x{1}, y{1}) , (x{2}, y{2}) , ... , (x{n}, y{n}), the coefficient matrix is

| x{0}^0 x{0}^1 x{0}^2 ... x{0}^n |
| x{1}^0 x{1}^1 x{1}^2 ... x{1}^n |
| x{2}^0 x{2}^1 x{2}^2 ... x{2}^n |
| ...
| x{n}^0 x{n}^1 x{n}^2 ... x{n}^n |

and the column matrix is

| y{0} |
| y{1} |
| y{2} |
| ...
| y{n} |

Solving with Cramer's rule gives us

| a{0} |
| a{1} |
| a{2} |
| ...
| a{n} |

And for any given x, we can find y by calculating a{0}x^0 + a{1}x^1 + a{2}x^2 + ... + a{n}x^n

Which data points to use

We can pick any set of data points, and that will give us an equation, but we want the best fit. So how do we calculate that?

In my case, I simply picked random points.
I repeated that 25000 times, and then chose the equation that had the lowest error (using the sum of squares).

What order polynomial

In theory the higher the order the better the fit - that is, the error will be smaller. (This is true, because if a lower order equation would fit better, then we will simply find that a{n} is 0 when we solve our simultaneous equations).

But we're not simply looking for a better fit - we're also looking for something that visually demonstrates the trend of the data and makes helpful predictions about future data. That means that a lower order equation (which actually has a higher error) is more useful to us.

In my case, after plotting curves up to order 5, I determined that the equation for order 2 better suited my needs - so I could probably have saved myself some trouble with Cramer's rule and implemented the simpler case.

Read: Polynomial Trendlines

Previous Topic

Next Topic


	Web Artima.com