A Direct Logistic Fit of the 1950-1990 Data

We have been using calculus ideas to study the USA population census data. The results have led us to the conclusion that a logistic curve provides the best fit of the data for the period 1950-1990. This is an important conclusion because it suggests that of all the possible functions to choose from, we should try to fit a logistic function directly to the data. What does this mean? It means that starting with the data points and the function

[Maple Math]

[Maple Math] , where [Maple Math] .

our problem is to determine the values of [Maple Math] that minimize the distances between the given census data points and the corresponding values on the logistic curve. How do we do this? If instead of the logistic function the function were linear, we could use Maple's built-in least squares line-fitting routine. We already have used this tool at several junctures in our study. But Maple has no built-in non-linear curve fitting routines. So, it looks like we are stuck. However, the Maple Users Group (MUG) has come to the rescue just in time. Members of the MUG have developed a set of non-linear curve fitting procedures that we will try out here. You should comment at the end on how well you think these numerical methods work.

MUG nonlinear curve fit procedure: activate the cells inside

After activating the above cells, we first read in our data points:

> dataCensus:=readdata(USAPopCensus17901990,[integer,float]):
ndc:=nops(dataCensus):
data3:=[seq(dataCensus[i],i=17..21)];

[Maple Math]

Next, we define the function for which we want to determine the constants L, K, c in the expression for u:

> with(math3):
clear (L,K,c);
u:=L/(1+exp(K*c)*exp(-K*t));

[Maple Math]

[Maple Math]

Following the outline provided by the MUG, we make a list of t-values:

> Lt:=[seq(data3[i][1],i=1..5)];

[Maple Math]

Then we make a list of the corresponding y-values:

> Ly:=[seq(data3[i][2],i=1..5)];

[Maple Math]

Now, we have to make a guess as to the approximate values of L, K, c. From our previous work this is not a problem and we take [Maple Math] . We then follow the instructions from the MUG quote: "The default number of iterations is 5. We [might] try using 4 iterations ( iter = 4 ). The last argument (optional) must be a name. When present the iteration history will be put in that variable. The first 5 arguments of the procedure must be present and in the proper order. The last two arguments are optional. If both are present the number of iterations must be given first. In that case the same name may be used over and over again without apostrophes."

> with(math3):
clear(L,K,c,sq);

[Maple Math]

> NLF:=nonlinfit(Lt,Ly,y=u,t,{L=350000,K=.029,c=1961},iter=5,sq);

[Maple Math]

> sq;

[Maple Math]
[Maple Math]
[Maple Math]
[Maple Math]
[Maple Math]

Now that the fit is complete, here is an example of how to get hold of the expressions we want to use:

> sq[5];
sq[5][1];
rhs(sq[5][1]);
g:=rhs(NLF);
g:=unapply(g,t);

[Maple Math]

[Maple Math]

[Maple Math]

[Maple Math]

[Maple Math]

We grab L, K, c from the output:

> with(math3):
clear(L,K,c);
L:=rhs(sq[4][1]);
c:=rhs(sq[4][2]);
K:=rhs(sq[4][3]);

[Maple Math]

[Maple Math]

[Maple Math]

[Maple Math]

Then we plot the fitted logistic function and the data points:

> f:=x->L/(1+exp(-K*(x-c)));
p1:=plot(f(x),x=1950..2010):
p2:=plot(data3,style=point):
with(plots):
display([p1,p2]);

[Maple Math]

[Maple Plot]

And for fun, we project the curve further into the future in a separate graph:

> p1:=plot(f(x),x=1950..2100):
p2:=plot(data3,style=point):
with(plots):
display([p1,p2]);

[Maple Plot]

Finally, as we have done in the past we compute the total error for comparison purposes:

tm5p.html

> ferror:=[seq([data3[i][1],abs(f(1950+10*(i-1))-data3[i][2]),abs(f(1950+10*(i-1))-data3[i][2])/data3[i][2]*100],i=1..5)]:
toterrorf:=add((f(1950+10*(i-1))-data3[i][2])^2,i=1..5):

with(math3):
headers:=["year","absolute error","% error"]:
printtable(ferror,"NLF-Direct Logistic Fit", headers);
TotalSqrError:=toterrorf;

                                 NLF-Direct Logistic Fit

 

                   year            absolute error             % error

                   --------------------------------------------------

                   1950              727.241600               .479445

                   1960             1419.593400               .785734

                   1970              241.342700               .117798

                   1980             1041.434000               .457327

                   1990              577.996500               .231269

 

[Maple Math]

Go back