Practical Numerical Training UKNum
Statistics, Data Modelling
PD Dr. C. Mordasini
Max-Planck-Institut für Astronomie, Heidelberg

Programme:
1) Review of elementary statistics
2) Regression analysis
3) Linear regression
4) Nonlinear regression

1 Elementary Statistics
Simple statistical quantities I

• Consider a statistical data sample with n values, e.g. measurements.
• The sample can be characterised by so-called moments. These are sums of integer powers of the values.

Consider Table 1, which contains 14 measurements.

Table 1: Chlorate ion concentration in mmol/cm^3 produced in a chemical reactor operated at a pH of 7.0.
12.0  15.0  14.1  15.9  11.5  14.8  11.2  13.7  15.9  ...
Arithmetic mean
Gives the mean value of the measurements. The arithmetic mean $\bar{y}$ is the sum of the individual data points $y_i$ divided by the number of data points $n$:
\[ \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \]
Alternatives are the median or the mode.

Range
One measure of the spread of the data is the range $R$, defined as the difference between the maximum and the minimum of the values:
\[ R = y_{\max} - y_{\min} \]
where $y_{\max}$ is the maximum and $y_{\min}$ the minimum of the values $y_i$, $i = 1, 2, \ldots, n$.
Problem: outliers. The range may not give a good idea of the spread, because a few data points may lie far away from most of the other data points.
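The mean and the range can be sketched in a few lines of Python. The sample below uses only the nine values that are legible in Table 1 (not the full table of 14), and the appended outlier of 40.0 is a made-up value that serves only to show how strongly the range reacts to a single point.

```python
# Arithmetic mean and range of a sample (pure Python, no external packages).

def mean(values):
    """Arithmetic mean: sum of the values divided by their number."""
    return sum(values) / len(values)

def value_range(values):
    """Range R = y_max - y_min."""
    return max(values) - min(values)

# Illustrative subset: the nine values legible in Table 1 (not the full table).
sample = [12.0, 15.0, 14.1, 15.9, 11.5, 14.8, 11.2, 13.7, 15.9]

# Hypothetical outlier, appended only to demonstrate the weakness of the range.
with_outlier = sample + [40.0]

print("mean:", mean(sample), "range:", value_range(sample))
print("with outlier -> mean:", mean(with_outlier),
      "range:", value_range(with_outlier))
```

A single outlier changes the mean moderately but dominates the range completely, which motivates the residual-based measures of spread.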
Simple statistical quantities II

Residual/error
Because of outliers, the deviation from the arithmetic mean is a better basis for measuring the spread. The residual $e_i$ between a data value $y_i$ and the mean $\bar{y}$ is
\[ e_i = y_i - \bar{y} \]
The residual can be positive or negative, depending on which side of the mean the data point lies. Summed over the whole sample, the residuals about the mean cancel each other (their sum is zero), so the plain sum carries no information about the spread. The sum of the squares of the residuals is therefore a better measure.

Sum of squared errors
The sum of the squares of the differences, also called the summed squared error (SSE), $S_t$, is given by
\[ S_t = \sum_{i=1}^{n} (y_i - \bar{y})^2 \]
The magnitude of the summed squared error obviously depends on the number of data points. We therefore look for an average value.
Simple statistical quantities III

(Sample) variance
An average value of the summed squared error is the (sample) variance
\[ \sigma^2 = \frac{S_t}{n-1} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1} \]
The sample variance is computed with (n-1) and not with n because the mean itself has already been computed from the sample. This means that we have lost one degree of freedom: if we know the mean and n-1 data points, we can compute the n-th value. If the mean is given externally (not computed from the sample), n should be used instead of n-1.
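As a minimal sketch of these definitions, the following Python snippet computes the residuals, the summed squared error and the variance with both divisors. The eight-value sample and the function names are hypothetical, chosen so that the numbers are easy to check by hand (mean 5, $S_t$ = 32).

```python
import statistics

def residuals(values):
    """Residuals e_i = y_i - mean(y)."""
    m = sum(values) / len(values)
    return [y - m for y in values]

def sum_squared_errors(values):
    """S_t: sum of the squared residuals about the sample mean."""
    return sum(e * e for e in residuals(values))

def sample_variance(values):
    """Divide by n-1: the mean was estimated from the same sample."""
    return sum_squared_errors(values) / (len(values) - 1)

def variance_known_mean(values, mu):
    """Divide by n: the mean mu is given externally."""
    return sum((y - mu) ** 2 for y in values) / len(values)

# Hypothetical sample, not taken from the lecture.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("sum of residuals:", sum(residuals(data)))
print("S_t:", sum_squared_errors(data))
print("variance (n-1):", sample_variance(data))
print("variance (n, mean given as 5):", variance_known_mean(data, 5.0))
```

The `statistics.variance` function of the Python standard library also uses the n-1 divisor and can serve as a cross-check.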
Variance (continued)
Numerically, the variance can be written in several different ways. From Press et al.:

The standard deviation of (14.1.6) as an estimator of the kurtosis of an underlying normal distribution is $\sqrt{96/N}$ when σ is the true standard deviation, and $\sqrt{24/N}$ when it is the sample estimate (14.1.3). However, the kurtosis depends on such a high moment that there are many real-life distributions for which the standard deviation of (14.1.6) as an estimator is effectively infinite.
Calculation of the quantities defined in this section is perfectly straightforward. Many textbooks use the binomial theorem to expand out the definitions into sums of various powers of the data, e.g., the familiar
\[ \mathrm{Var}(x_1 \ldots x_N) = \frac{1}{N-1} \left[ \left( \sum_{j=1}^{N} x_j^2 \right) - N \bar{x}^2 \right] \approx \overline{x^2} - \bar{x}^2 \qquad (14.1.7) \]
but this can magnify the roundoff error by a large factor and is generally unjustifiable in terms of computing speed. A clever way to minimize roundoff error, especially for large samples, is to use the corrected two-pass algorithm [1]: First calculate $\bar{x}$, then calculate $\mathrm{Var}(x_1 \ldots x_N)$ by
\[ \mathrm{Var}(x_1 \ldots x_N) = \frac{1}{N-1} \left\{ \sum_{j=1}^{N} (x_j - \bar{x})^2 - \frac{1}{N} \left[ \sum_{j=1}^{N} (x_j - \bar{x}) \right]^2 \right\} \qquad (14.1.8) \]
The second sum would be zero if $\bar{x}$ were exact, but otherwise it does a good job of correcting the roundoff error in the first term.

A formulation that reduces roundoff errors (for large N) is thus the corrected two-pass algorithm: the mean is computed first, and then the variance according to (14.1.8).
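The difference between (14.1.7) and (14.1.8) can be tried out with a short Python sketch. The function names and the test data (four values with a large common offset of 10^9) are assumptions made for illustration; the exact variance of this sample is 30.

```python
def variance_naive(x):
    """Textbook formula (14.1.7): prone to cancellation for large offsets."""
    n = len(x)
    mean = sum(x) / n
    return (sum(v * v for v in x) - n * mean * mean) / (n - 1)

def variance_two_pass(x):
    """Corrected two-pass algorithm (14.1.8)."""
    n = len(x)
    mean = sum(x) / n                    # first pass: the mean
    dev = [v - mean for v in x]          # second pass: deviations
    sum_sq = sum(d * d for d in dev)
    sum_dev = sum(dev)                   # zero if the mean were exact
    return (sum_sq - sum_dev * sum_dev / n) / (n - 1)

# Hypothetical data: small spread on top of a large offset.
data = [1.0e9 + v for v in (4.0, 7.0, 13.0, 16.0)]

print("naive   :", variance_naive(data))
print("two-pass:", variance_two_pass(data))
```

In double precision the naive result can differ noticeably from 30 for such data, whereas the two-pass result stays accurate.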
Simple statistical quantities IV

Why is the variance divided by (n-1) and not by n? Because using the mean in the calculation removes the independence of one of the data points: if you know the mean of the n data points, the value of any one of them can be calculated from the other n-1 points.

Standard deviation
To obtain a measure of the spread in the same units as the measured quantities, the standard deviation σ is defined as the square root of the variance:
\[ \sigma = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}} \]

Simple statistical quantities V
Coefficient of variation
The ratio of the standard deviation to the mean is a relative, dimensionless measure of the spread in the sample. This coefficient of variation, c.v., is also used to normalise the spread of a sample:
\[ \mathrm{c.v.} = \frac{\sigma}{\bar{y}} \times 100 \; [\%] \]

Example 1
Use the data in Table 1 to calculate the ...
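A minimal Python sketch of the standard deviation and the coefficient of variation is given below. Since Table 1 is not fully reproduced here, the sample is a hypothetical one (mean 5, $S_t$ = 32); the function names are likewise assumptions.

```python
import math
import statistics

def standard_deviation(values):
    """Square root of the sample variance (n-1 divisor)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((y - m) ** 2 for y in values) / (n - 1))

def coefficient_of_variation(values):
    """c.v. = sigma / mean * 100, in percent (requires a nonzero mean)."""
    m = sum(values) / len(values)
    return standard_deviation(values) / m * 100.0

# Hypothetical sample, not the data of Table 1.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("sigma:", standard_deviation(data))
print("c.v. [%]:", coefficient_of_variation(data))
```

The c.v. is only meaningful for quantities whose mean is clearly different from zero.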
Simple statistical quantities VI

Skewness
Also known as the third moment, the skewness characterises the degree of asymmetry of a distribution of measured data around the mean. It is a dimensionless quantity. A positive value corresponds to an asymmetric distribution with a tail pointing towards positive values; the mode is then smaller than the mean.

From Press et al. 1992:

That being the case, the skewness or third moment, and the kurtosis or fourth moment should be used with caution or, better yet, not at all.
The skewness characterizes the degree of asymmetry of a distribution around its mean. While the mean, standard deviation, and average deviation are dimensional quantities, that is, have the same units as the measured quantities $x_j$, the skewness is conventionally defined in such a way as to make it nondimensional. It is a pure number that characterizes only the shape of the distribution. The usual definition is
\[ \mathrm{Skew}(x_1 \ldots x_N) = \frac{1}{N} \sum_{j=1}^{N} \left[ \frac{x_j - \bar{x}}{\sigma} \right]^3 \qquad (14.1.5) \]
where $\sigma = \sigma(x_1 \ldots x_N)$ is the distribution's standard deviation (14.1.3). A positive value of skewness signifies a distribution with an asymmetric tail extending out towards more positive x; a negative value signifies a distribution whose tail extends out towards more negative x (see Figure 14.1.1).
Of course, any set of N measured values is likely to give a nonzero value for (14.1.5), even if the underlying distribution is in fact symmetrical (has zero skewness). For (14.1.5) to be meaningful, we need to have some idea of its standard deviation as an estimator of the skewness of the underlying distribution. Unfortunately, that [...] estimated by the sample mean, (14.1.1). In real life it is good practice to believe in skewnesses only when they are several or many times as large as this.

[Figure 14.1.1(a), Press et al. 1992: Skewness; negative and positive.]
The kurtosis is also a nondimensional quantity. It measures the relative peakedness or flatness of a distribution. Relative to what? A normal distribution, what else! A distribution with positive kurtosis is termed leptokurtic; the outline of the Matterhorn is an example. A distribution with negative kurtosis is termed platykurtic; the outline of a loaf of bread is an example. (See Figure 14.1.1.) And, as you no doubt expect, an in-between distribution is termed mesokurtic.
The conventional definition of the kurtosis is
\[ \mathrm{Kurt}(x_1 \ldots x_N) = \left\{ \frac{1}{N} \sum_{j=1}^{N} \left[ \frac{x_j - \bar{x}}{\sigma} \right]^4 \right\} - 3 \qquad (14.1.6) \]
(Press et al.)

Simple statistical quantities VII

Kurtosis
Also known as the fourth central moment, the kurtosis characterises the peakedness of a distribution. It is a dimensionless quantity.
A distribution with a positive kurtosis is called leptokurtic. An example is the Matterhorn. A distribution that is flattened at the top is, in contrast, platykurtic. The Gaussian distribution is used as the reference.
Higher moments such as the skewness and the kurtosis are less robust than the mean or the standard deviation.

[Figure 14.1.1(b), Press et al.: Kurtosis; negative (platykurtic) and positive (leptokurtic).]
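The definitions (14.1.5) and (14.1.6) translate directly into code. The sketch below assumes that σ is the sample standard deviation with the n-1 divisor, and uses two small hypothetical samples; the function name `moments` is an illustrative choice.

```python
import math

def moments(x):
    """Return mean, standard deviation, skewness and kurtosis of a sample.

    Assumption: sigma is the sample standard deviation (n-1 divisor);
    the kurtosis is measured relative to a normal distribution (minus 3).
    """
    n = len(x)
    mean = sum(x) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    skew = sum(((v - mean) / sigma) ** 3 for v in x) / n
    kurt = sum(((v - mean) / sigma) ** 4 for v in x) / n - 3.0
    return mean, sigma, skew, kurt

# Hypothetical samples: one symmetric, one with a tail towards large values.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 2.0, 10.0]

for name, sample in (("symmetric", symmetric), ("right-tailed", right_tailed)):
    mean, sigma, skew, kurt = moments(sample)
    print(name, "skew =", skew, "kurt =", kurt)
```

For samples this small the higher moments are very noisy, in line with the warning that they are less robust than the mean and the standard deviation.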
\[ \mathrm{Skew}(x) = I_3 / I_2^{3/2} \qquad \mathrm{Kurt}(x) = I_4 / I_2^{2} \qquad (14.1.12) \]
A Gaussian distribution has all its semi-invariants higher than $I_2$ equal to zero. A Poisson distribution has all of its semi-invariants equal to its mean. For more details, see [2].

Simple statistical quantities VIII
Median and Mode

Median
The median of a probability distribution function p(x) is the value $x_{med}$ for which larger and smaller values of x are equally probable:
\[ \int_{-\infty}^{x_{med}} p(x)\, dx = \frac{1}{2} = \int_{x_{med}}^{\infty} p(x)\, dx \qquad (14.1.13) \]
The median of a distribution is estimated from a sample of values $x_1, \ldots, x_N$ by finding that value $x_i$ which has equal numbers of values above it and below it. Of course, this is not possible when N is even. In that case it is conventional to estimate the median as the mean of the unique two central values. If the values $x_j$, $j = 1, \ldots, N$ are sorted into ascending (or, for that matter, descending) order, then the formula for the median is
\[ x_{med} = \begin{cases} x_{(N+1)/2}, & N \text{ odd} \\ \frac{1}{2} \left( x_{N/2} + x_{(N/2)+1} \right), & N \text{ even} \end{cases} \qquad (14.1.14) \]
(Press et al.)
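Formula (14.1.14) can be sketched in Python as follows. Note that the formula counts from 1 while Python lists count from 0, so the indices are shifted by one; the function name is an illustrative choice.

```python
import statistics

def median(values):
    """Median according to (14.1.14); the input does not need to be sorted."""
    x = sorted(values)
    n = len(x)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # N odd: element (N+1)/2 in 1-based counting is index (N-1)/2 here.
        return x[(n - 1) // 2]
    # N even: mean of elements N/2 and N/2 + 1 (1-based).
    return 0.5 * (x[n // 2 - 1] + x[n // 2])

print(median([3.0, 1.0, 2.0]))
print(median([4.0, 1.0, 3.0, 2.0]))
```

Sorting costs O(N log N); for very large samples a selection algorithm would be faster, but for typical measurement series this makes no practical difference.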
Simple statistical quantities VIII

Mode
The mode of a probability function p(x) is the value of x at which p takes its maximum value. For an empirical frequency distribution it is simply the most frequent value. The mode is most useful when the distribution has a single, relatively sharp maximum. Occasionally, however, bimodal distributions with two relative maxima occur. In that case both values should be known individually, because neither the median nor the mean is a very useful quantity: each only represents a "compromise" between the two maxima.
In physics, such bimodal distributions can be a hint that two different mechanisms are at work.
(Press et al.)
2 Regression Analysis

Regression analysis
What is regression analysis?
Regression analysis provides (quantitative) information about the relationship between a dependent variable and one or more independent variables, to the extent that such a relationship is contained in a data set. It is used for
1. Prediction
2. Model fitting (parameter determination)
3. Model validation
In regression analysis, the emphasis is on investigating the nature of the relationship between physical quantities (which are assumed to be free of errors). In the related technique of fitting, the primary goal is instead to determine the parameters of a given model, taking into account the errors of the individual measurements.
Method of least squares
This is the best-known method for estimating the parameters of a model in a regression analysis. It has well-known probabilistic properties and yields the regression parameters with the smallest variance.
We want to express the relationship between n measured data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ by a regression model
\[ y = f(x) \]
where the function f depends on regression parameters that are unknown a priori. These must now be estimated. Important examples:
$f(x) = a_0 + a_1 x$: simple linear regression with the parameters $a_0$ and $a_1$
$f(x) = a_0 e^{a_1 x}$: exponential model with the parameters $a_0$ and $a_1$
$f(x) = a_0 + a_1 x + a_2 x^2$: quadratic model with the parameters $a_0$, $a_1$ and $a_2$
Method of least squares II
A measure of the goodness of fit, that is, of how well a regression model f(x) predicts the response variable y, is the magnitude of the residual $E_i$ at each of the n data points:
\[ E_i = y_i - f(x_i), \quad i = 1, 2, \ldots, n \]
For a perfect model all $E_i$ would be zero, and all points would lie on the model. Minimising the residuals is therefore the objective when determining the regression coefficients.
In the method of least squares, the regression parameters are estimated such that the sum of the squares of the residuals becomes minimal:
\[ \text{minimise} \quad \sum_{i=1}^{n} E_i^2 \]
Hence the name "least squares".
Why minimise the sum of the squares of the residuals?
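3 Linear Regression

Linear regression
Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Determine the regression line $y = a_0 + a_1 x$.
(Illustration with Mathematica)

Parameter estimation I
The method of least squares minimises the sum of the squared residuals of the linear model,
\[ S_r = \sum_{i=1}^{n} E_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2 \]
and yields a unique regression line.
Our task is to determine the regression parameters $a_0$ and $a_1$. For this we use elementary calculus (derivative = 0 at maxima/minima).
At the minimum, the partial derivatives must satisfy (chain rule):
\[ \frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0 \]
\[ \frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)\, x_i = 0 \]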
Parameter estimation II
This gives
\[ -\sum_{i=1}^{n} y_i + \sum_{i=1}^{n} a_0 + a_1 \sum_{i=1}^{n} x_i = 0 \]
\[ -\sum_{i=1}^{n} x_i y_i + a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = 0 \]
Noting that
\[ \sum_{i=1}^{n} a_0 = a_0 + a_0 + \ldots + a_0 = n a_0 \]
we obtain
\[ n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i \qquad (14) \]
\[ a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i \qquad (15) \]

[Figure 3: Linear regression of y vs. x data showing residuals and square of residual at a typical point, $x_i$.]

Parameter estimation III
We can regard this as a linear system of two equations and write it as a 2x2 matrix, as we saw in the last lecture, with the unknowns $a_0$ and $a_1$ (all $x_i$ and $y_i$ are known):
\[ \begin{pmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} = \begin{pmatrix} \sum y_i \\ \sum x_i y_i \end{pmatrix} \]
For such a small matrix the solution is found quickly. Solving Equations (14) and (15) gives
\[ a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \qquad (16) \]
\[ a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \qquad (17) \]
Parameter estimation IV
We define
\[ S_{xy} = \sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}, \qquad S_{xx} = \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \]
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \]
With these we can rewrite the parameters of the regression line as
\[ a_1 = \frac{S_{xy}}{S_{xx}} \]
\[ a_0 = \bar{y} - a_1 \bar{x} \]

Example 1: The torque T needed to turn the torsional spring of a mousetrap through ... (Table 5: Torque versus angle).
Example I
Given the following data set, determine the regression line according to the method of least squares.

x          y
0.698132   0.188224
0.959931   0.209138
1.134464   0.230052
1.570796   0.250965
1.919862   0.313707

[Figure: plot of the data, y versus x.]

We therefore seek $a_0$ and $a_1$ for the linear model $y = a_0 + a_1 x$.

Example II
Substituting into the equations given above leads directly to the values of $a_0$ and $a_1$ in
\[ y = a_0 + a_1 x \]
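The substitution can be carried out with a short Python sketch that uses the $S_{xy}$ and $S_{xx}$ form of the solution together with the five data points of the example. The function name `linear_regression` is an illustrative choice.

```python
def linear_regression(x, y):
    """Least-squares line y = a0 + a1*x via S_xy and S_xx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
    s_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
    a1 = s_xy / s_xx            # slope
    a0 = y_bar - a1 * x_bar     # intercept
    return a0, a1

# Data set of the example.
x = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]
y = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]

a0, a1 = linear_regression(x, y)
print("a0 =", a0, "a1 =", a1)

# Sum of squared residuals of the fitted line, as a goodness-of-fit check.
s_r = sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))
print("S_r =", s_r)
```

For this data set the sketch gives approximately $a_0 \approx 0.118$ and $a_1 \approx 0.096$.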
4 Nonlinear Regression

Nonlinear regression
Some important nonlinear models
1. Exponential: $y = a e^{bx}$
2. Power law: $y = a x^{b}$
3. Saturation growth: $y = \frac{a x}{b + x}$
4. Polynomial: $y = a_0 + a_1 x + \ldots + a_m x^m$

Nonlinear regression: exponential
Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, determine the best fit $y = f(x)$ via the method of least squares, where $f(x)$ is a nonlinear function of $x$.
[Figure: Nonlinear regression model for discrete y vs. x data]
Example: exponential model $y = a e^{bx}$. The parameters a and b are to be determined!
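Exponential: determining the parameters I
The sum of the squares of the residuals is given by
\[ S_r = \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right)^2 \]
Determine the minimum: differentiate with respect to a and b and set the derivatives equal to zero.
\[ \frac{\partial S_r}{\partial a} = \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -e^{b x_i} \right) = 0 \]
\[ \frac{\partial S_r}{\partial b} = \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -a x_i e^{b x_i} \right) = 0 \]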
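Exponential: determining the parameters II
Expanding the products gives (a≠0)
\[ -\sum_{i=1}^{n} y_i e^{b x_i} + a \sum_{i=1}^{n} e^{2 b x_i} = 0 \]
\[ \sum_{i=1}^{n} y_i x_i e^{b x_i} - a \sum_{i=1}^{n} x_i e^{2 b x_i} = 0 \]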
Exponential: determining the parameters III
The first equation can be solved directly for a:
\[ a = \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}} \]
We substitute this into the second equation:
\[ \sum_{i=1}^{n} y_i x_i e^{b x_i} - \frac{\sum_{i=1}^{n} y_i e^{b x_i}}{\sum_{i=1}^{n} e^{2 b x_i}} \sum_{i=1}^{n} x_i e^{2 b x_i} = 0 \]
The parameter b can be determined with the known numerical methods for solving nonlinear equations (e.g. bisection). As soon as b has been found, a can be computed as well.
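A minimal sketch of this procedure in Python is shown below: a plain bisection finds the root b of the nonlinear equation, and a then follows from the first equation. The function names, the bracketing interval [0.1, 1.0] and the synthetic, noise-free data (generated from a = 2, b = 0.5) are assumptions made for illustration; for real data a suitable bracket has to be found first, e.g. by plotting the function.

```python
import math

def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or (hi - lo) < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return 0.5 * (lo + hi)

def fit_exponential(x, y, b_lo, b_hi):
    """Least-squares fit of y = a*exp(b*x); b must lie in [b_lo, b_hi]."""
    def a_of_b(b):
        num = sum(yi * math.exp(b * xi) for xi, yi in zip(x, y))
        den = sum(math.exp(2 * b * xi) for xi in x)
        return num / den

    def f(b):
        # Second normal equation with a eliminated.
        return (sum(yi * xi * math.exp(b * xi) for xi, yi in zip(x, y))
                - a_of_b(b) * sum(xi * math.exp(2 * b * xi) for xi in x))

    b = bisect(f, b_lo, b_hi)
    return a_of_b(b), b

# Synthetic, noise-free test data (hypothetical): y = 2 * exp(0.5 * x).
x = [0.0, 1.0, 2.0]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

a, b = fit_exponential(x, y, 0.1, 1.0)
print("a =", a, "b =", b)
```

Bisection is slow but robust; once a bracket with a sign change is known, it cannot diverge.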
Example - exponential model I
Radioactive decay of technetium-99m
Technetium-99m is used, for example, in medicine. Assume that the activity has been measured as a function of time (relative to its initial value).

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355

We know that radioactive decay follows an exponential decay law. Therefore perform a regression with the exponential model.
Example - exponential model II
The relative intensity is therefore to be described by the model $\gamma = A e^{\lambda t}$.
Determine:
a) The values of the regression parameters A and λ
b) The half-life of technetium-99m
c) The intensity after 24 hours
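Determining the parameters
The value of λ is given by the nonlinear equation
\[ f(\lambda) = \sum_{i=1}^{n} \gamma_i t_i e^{\lambda t_i} - \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2 \lambda t_i}} \sum_{i=1}^{n} t_i e^{2 \lambda t_i} = 0 \]
A is then
\[ A = \frac{\sum_{i=1}^{n} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{n} e^{2 \lambda t_i}} \]

Solving the nonlinear equation
Once λ has been found numerically, A can be computed (n = 6):
\[ A = \frac{\sum_{i=1}^{6} \gamma_i e^{\lambda t_i}}{\sum_{i=1}^{6} e^{2 \lambda t_i}} \]
The result has to be close to 1, since the activity is given relative to its initial value.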
Comparison of data and regression
Half-life:
\[ T_{1/2} = \frac{\ln(1/2)}{\lambda} = -\frac{\ln(2)}{\lambda}, \qquad T_{1/2} = 6.022 \text{ hrs} \]

Relative intensity after 24 hours
This is obviously given by
\[ \gamma(24) = A e^{24 \lambda} \]
In other words, after 24 hours only this fraction of the initial activity is still present.
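The whole example can be reproduced with the following Python sketch, which solves f(λ) = 0 by bisection for the measured data and then evaluates A, the half-life and the intensity after 24 hours. The bracket [-0.12, -0.11] is an assumption; it was chosen because f changes sign on it for this data set. All function and variable names are illustrative.

```python
import math

# Measured data of the example: time in hours, relative intensity.
t = [0.0, 1.0, 3.0, 5.0, 7.0, 9.0]
gamma = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]

def amplitude(lam):
    """A as a function of lambda (first normal equation)."""
    num = sum(g * math.exp(lam * ti) for ti, g in zip(t, gamma))
    den = sum(math.exp(2 * lam * ti) for ti in t)
    return num / den

def f(lam):
    """Nonlinear equation whose root is the least-squares lambda."""
    return (sum(g * ti * math.exp(lam * ti) for ti, g in zip(t, gamma))
            - amplitude(lam) * sum(ti * math.exp(2 * lam * ti) for ti in t))

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign on [lo, hi]."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

lam = bisect(f, -0.12, -0.11)        # assumed bracket, see lead-in
A = amplitude(lam)
t_half = math.log(0.5) / lam         # half-life in hours
gamma_24 = A * math.exp(24.0 * lam)  # relative intensity after 24 h

print("lambda =", lam, "per hour")
print("A =", A)
print("T_1/2 =", t_half, "hrs")
print("gamma(24 h) =", gamma_24)
```

The sketch yields a half-life close to 6.02 hours and a remaining relative intensity of roughly 6 % after 24 hours.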
Linearisation of data I
Determining the parameters of nonlinear models can lead to coupled, nonlinear systems of equations that are difficult to solve.
It is therefore sometimes better to linearise the data, if this is possible. For exponential decay this is the case.
Given the exponential model
\[ y = a e^{bx} \]
we apply the natural logarithm, which gives
\[ \ln y = \ln a + b x \]
Let $z = \ln y$, $a_0 = \ln a$ and $a_1 = b$, so that
\[ z = a_0 + a_1 x \]
Obviously we now have a linear model with the parameters $a_0$ and $a_1$.
As soon as $a_0$ and $a_1$ are known, we can determine a and b again.
Linearisation of data II
We know
\[ a_1 = \frac{n \sum_{i=1}^{n} x_i z_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} z_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \]
\[ a_0 = \bar{z} - a_1 \bar{x} \]
As soon as $a_0$ and $a_1$ have been determined, we can compute the original parameters:
\[ a = e^{a_0}, \qquad b = a_1 \]
Example - linearisation of data I
Radioactive decay as before:

t (hrs)   0       1       3       5       7       9
γ         1.000   0.891   0.708   0.562   0.447   0.355

Exponential model: $\gamma = A e^{\lambda t}$
Let $z = \ln \gamma$, $a_0 = \ln A$ and $a_1 = \lambda$, so that
\[ z = a_0 + a_1 t \]

[Figure: Exponential model. Relative intensity of radiation, γ, versus time t (hours).]
[Figure: Linearised relationship between z and t.]
Example - linearisation of data II
Determine the linear parameters
\[ a_1 = \frac{n \sum_{i=1}^{n} t_i z_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} z_i}{n \sum_{i=1}^{n} t_i^2 - \left( \sum_{i=1}^{n} t_i \right)^2} \]
and
\[ a_0 = \bar{z} - a_1 \bar{t} \]
with n = 6 and the sums from the following table.

Table: Summation data for linearisation of data model

i     t_i      γ_i     z_i = ln γ_i   t_i z_i     t_i^2
1     0        1       0.00000        0.0000      0.0000
2     1        0.891   -0.11541       -0.11541    1.0000
3     3        0.708   -0.34531       -1.0359     9.0000
4     5        0.562   -0.57625       -2.8813     25.000
5     7        0.447   -0.80520       -5.6364     49.000
6     9        0.355   -1.0356        -9.3207     81.000
Sum   25.000           -2.8778        -18.990     165.00
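Example - linearisation of data III
From these sums we find $a_1$ and $a_0$.
Since $a_0 = \ln A$, it follows that $A = e^{a_0}$, and $\lambda = a_1$.
The regression model is thus $\gamma = A e^{\lambda t}$ with these values.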
Example - linearisation of data IV
The half-life of technetium-99m is reached when
\[ \gamma = \frac{1}{2} \gamma(t = 0) \]
The value determined from our experiment and regression agrees quite well with the literature value of about 6.01 hours.
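The linearised fit can be checked with a few lines of Python. The sketch below takes the logarithm of the measured intensities, applies the formulas for $a_1$ and $a_0$, and transforms back to A and λ; all variable names are illustrative assumptions.

```python
import math

# Measured data of the example: time in hours, relative intensity.
t = [0.0, 1.0, 3.0, 5.0, 7.0, 9.0]
gamma = [1.000, 0.891, 0.708, 0.562, 0.447, 0.355]

# Linearise: z = ln(gamma) = a0 + a1 * t
z = [math.log(g) for g in gamma]
n = len(t)

sum_t = sum(t)
sum_z = sum(z)
sum_tz = sum(ti * zi for ti, zi in zip(t, z))
sum_tt = sum(ti * ti for ti in t)

a1 = (n * sum_tz - sum_t * sum_z) / (n * sum_tt - sum_t ** 2)
a0 = sum_z / n - a1 * sum_t / n

# Transform back to the parameters of the exponential model.
A = math.exp(a0)
lam = a1
t_half = math.log(0.5) / lam   # time at which gamma = gamma(0) / 2

print("sums:", sum_t, sum_z, sum_tz, sum_tt)
print("a0 =", a0, "a1 =", a1)
print("A =", A, "lambda =", lam, "per hour")
print("T_1/2 =", t_half, "hrs")
```

Note that the linearised fit minimises the squared residuals of ln γ rather than of γ itself, so its parameters differ slightly from those of the direct nonlinear regression.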
References
• This script is based on http://numericalmethods.eng.usf.edu by Autar Kaw and Jai Paul, and on Numerical Recipes (2nd/3rd Edition) by Press et al., Cambridge University Press, http://www.nr.com/oldverswitcher.html