The Artima Developer Community
Sponsored Link

Java Buzz Forum
Polynomial Trendlines

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Tim Vernum

Posts: 58
Nickname: tpv
Registered: Dec, 2002

Tim Vernum is passionate about designing secure systems
Polynomial Trendlines Posted: Mar 28, 2011 6:41 PM
Reply to this message Reply

This post originated from an RSS feed registered with Java Buzz by Tim Vernum.
Original Post: Polynomial Trendlines
Feed Title: A word used to describe something
Feed URL: http://blog.adjective.org/feed/category/Development/rss2
Feed Description: Tim's random musings on software development, java, jetty and other such frivolities.
Latest Java Buzz Posts
Latest Java Buzz Posts by Tim Vernum
Latest Posts From A word used to describe something

Advertisement

I recently had to generate a polynomial trendline for some data I had. Microsoft Excel can do it (as can some other applications) but I needed to do it "by hand" (actually I did it in Java...)

Partly for my own benefit, and partly for the benefit of the google Duck Duck Go, I'm going to write down how I did it.

So, let's start at the beginning.

Why a trendline

I had plotted a line graph of the data I had. What I needed was a curve that approximated the line. There are 2 purposes for that

  • To demonstrate what the overall pattern of the data is - is the line going up or down? Does it cycle or is it moving in 1 consistent direction?
  • To make (loose) predictions about what will happen in the future. If the line is going down, when does it hit zero? If it's going up, at what rate is it moving?

What I needed

What we need then, is a formula f(x) that produces a y value. So that, for any given x (horizontal) value, we can generate a y value. Where the x value is in the range of the data we already have, we want the f(x) to produce a y value that is close to the data we have (i.e. we minimises the error).

A polynomial trendline

For a polynomial trendline, we want to find a formula in the form: a{0} + a{1}x + a{2}x^2 + ... + a{n-1}x^n-1 + a{n}x^n = y (Where a{n} is the nth coefficient and x^n is x raised to the power n)

So, a polynomial of order 2 would be: a + (b * x) + (c * x * x) = y
We need to find out the "best" values for a, b, and c, so that we plug in a value for x and determine an approximation for y.

Finding the coefficients

For a polynomial of order n, we have n+1 coefficients to work out.
Because we have n+1 unknowns, we need to use n+1 different points from our data in order to find a formula.

So, to find the formula for an order 2 polynomial, we need 3 points from our data to help us find a, b, and c. Let's do a simple example by hand, and then we can look at how to solve the general case.

Let's assume our three points are (0,0), (1,2), (2,1). We use those values for x and y to produce 3 simultaneous equations:

  • a + b * 0 + c * 0 = 0
  • a + b * 1 + c * 1 = 2
  • a + b * 2 + c * 4 = 1

Simplifying, we get

  • a = 0
  • a + b + c = 2
  • a + 2b + 4c = 1

Which means

  • a = 0
  • b = 2 - c
  • b = (1 - 4c)/2

Thus

  • 2 - c = (1-4c)/2
  • 4 - 2c = 1 - 4c
  • 2c = -3
  • c = -3/2

And

  • b = 2 - -3/2
  • b = 7/2

So

  • a=0
  • b=7/2
  • c=-3/2

And our final formula is (7x)/2 - (3x^2)/2 = y

So, we can predict that if x = 3, then y = (7*3)/2 - (3*3*3)/2 = 21/2 - 27/2 = -3

The general case

The general case for an order n polynomial, with n+1 unknowns, and n+1 sample points, is a series of n+1 simultaneous equations with n+1 coefficients.
We can solve that using Cramer's Rule.

In Java, I used JAMA to implement Cramer's rule, which is actually quite easy.
For a set of data points (x{0}, y{0}) , (x{1}, y{1}) , (x{2}, y{2}) , ... , (x{n}, y{n}), the coefficient matrix is

| x{0}^0 x{0}^1 x{0}^2 ... x{0}^n |
| x{1}^0 x{1}^1 x{1}^2 ... x{1}^n |
| x{2}^0 x{2}^1 x{2}^2 ... x{2}^n |
| ...
| x{n}^0 x{n}^1 x{n}^2 ... x{n}^n |

and the column matrix is

| y{0} |
| y{1} |
| y{2} |
| ...
| y{n} |

Solving with Cramer's rule gives us

| a{0} |
| a{1} |
| a{2} |
| ...
| a{n} |

And for any given x, we can find y by calculating a{0}x^0 + a{1}x^1 + a{2}x^2 + ... + a{n}x^n

Which data points to use

We can pick any set of data points, and that will give us an equation, but we want the best fit. So how do we calculate that?

In my case, I simply picked random points.
I repeated that 25000 times, and then chose the equation that had the lowest error (using the sum of squares).

What order polynomial

In theory the higher the order the better the fit - that is, the error will be smaller. (This is true, because if a lower order equation would fit better, then we will simply find that a{n} is 0 when we solve our simultaneous equations).

But we're not simply looking for a better fit - we're also looking for something that visually demonstrates the trend of the data and makes helpful predictions about future data. That means that a lower order equation (which actually has a higher error) is more useful to us.

In my case, after plotting curves up to order 5, I determined that the equation for order 2 better suited my needs - so I could probably have saved myself some trouble with Cramer's rule and implemented the simpler case.

Read: Polynomial Trendlines

Topic: Installing Java 7 on Mac OS X Previous Topic   Next Topic Topic: jDBI 2.12 and the SQL Object API

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use