nexoncn.com

文档资料共享网 文档搜索专家

文档资料共享网 文档搜索专家

当前位置：首页 >> >> Finite Element Thin Plate Splines for Data Mining Applications

Finite Element Thin Plate Splines for Data Mining Applications

Markus Hegland, Stephen Roberts, and Irfan Altas

Abstract. Thin plate splines have been used successfully to model curves

and surfaces. A new application is in data mining where they are used to model interaction terms. These interaction splines break the \curse of dimensionality" by reducing the high-dimensional nonparametric regression problem to the determination of a set of interdependent surfaces. However, the determination of the corresponding thin plate splines requires the solution of a dense linear system of equations of order n where n is the number of observations. For data mining applications n can be in the millions, and so standard thin plate splines, even using fast algorithms may not be practical. A nite element approximation of the thin plate splines will be described. The method uses H 1 elements in a formulation which only needs rst order derivatives. The resolution of the method is chosen independently of the number of observations which only need to be read from secondary storage once and do not need to be stored in memory. The formulation leads to a saddle point problem. Convergence and solution of the method and its relationship to the standard thin plate splines will be discussed.

Recent years have seen a considerable e ort to develop reliable and e cient data mining tools to discover hidden knowledge in very large data bases. Data mining methods have been applied to a number of application areas, for instance the calculation of pro t according to customer, product and acquisition channel; trend and response prediction and identi cation; the development of tools to examine corporate intelligence reports, tax/insurance frauds and patterns in airline crashes. The interested reader can refer to 5] for detailed information. The fundamental issue is the development and adaptation of algorithms to extract some useful information from very large databases. Often the number of observations can be of the order of millions, where there may be thousands of variables recorded.

Mathematical Methods for Curves and Surfaces II Morten D hlen, Tom Lyche, and Larry L. Schumaker (eds.), pp. 245{252. Copyright c 1998 by Vanderbilt University Press, Nashville, TN. ISBN 0-8265-1315-8. All rights of reproduction in any form reserved.

x1. Introduction

o

245

246

M. Hegland, S. Roberts, and I. Altas

In this work we study the development of nonparametric models y = f ( 1; : : :; m) to t very large data sets. Here y is the expectation of an observation and 1 ; : : :; m represent the dimensionality of the data. If the data set is complex (i.e., m is large) one will have to deal with the curse of dimensionality when one tries to estimate the model f . To overcome this curse, additive models or interaction splines have been used 8]. In generalised additive models, if interaction terms are limited to order two interactions, this leads to a problem of determination of coupled surfaces. Thus, an important part of any data analysis algorithm for this problem is the determination of an approximating surface for extremely large data sets. Thin plate splines are very popular for the estimation of surfaces y = f ( 1; 2) from observed data. A variational characterisation of the thin plate splines was proposed by Duchon 7]. He showed that an explicit formulation of the solution can be obtained in terms of radial basis functions which leads to a symmetric inde nite dense linear system of equations. It was seen that this system can be reduced to a positive de nite system of equations which can be solved by a conjugate gradient method 11]. Further improvements using ideas from multipole expansions and Lagrange functions lead to methods which are of O(n) or O(n log(n)) in complexity 2,3,4], where n is the number of observations. Though these methods provide fast algorithms, the amount of storage still depends on the number of observations. In this work, we introduce a new smoothing method which can be viewed as a discrete thin plate spline like those proposed in 1,12]. This new spline has similar properties to ordinary thin plate splines, but combines the favourable properties of nite element smoothing with those of thin plate splines. In Section 2, we introduce the method which is based on formulating the thin plate spline functional in terms of rst order derivatives. This is similar to mixed nite element techniques for the biharmonic equation. In Section 3 we describe the discrete problem and explain the assembly and solution of equations. In Section 4, we provide some computational examples comparing our method with Duchon's thin plate splines and the method of nite element least squares tting. Finally we end with a short conclusion. The thin plate spline for a general domain is the function f for which all the partial derivatives of total order 2 are in L2( ) and which minimises the functional n 1 X(f (x ) ? y )2 + Z ?f (x)2 + 2f (x)2 + f (x)2 dx; x 1 1 1 2 2 2 n i=1 i i where x = ( 1; 2)T and xi = ( 1i ; 2i)T 2 , yi 2 IR, i = 1; : : :; n denote the measurements. The case = IR2 was extensively investigated by Duchon 7]. For compact , nite element approximations, called Dm splines, have been investigated by Arcangeli 1], where the function f is approximated by functions in a discrete subspace of H 2 ( ).

x2. First Order Thin Plate Splines

Finite Element Thin Plate Splines

247

In this work we investigate a mixed method in which the need to work with H 2( ) is removed. We replace the second derivatives in the above functional with the rst derivatives of the gradient u of the function f . Now the condition ru = f is replaced with the condition f = div(u) with appropriate boundary conditions, together with the condition curl(u) = 0. This rst order derivative formulation allows us to work with simple continuous piecewise polynomial nite element spaces. In addition the underlying equations that need to be solved are Poisson equations, for which robust fast solvers exist. In addition we have found that the condition curl(u) = 0 does not need to be enforced to lead to good data tting. To describe our reformulation we introduce the space V = H 1 ( )2 IR3 , where H 1 ( ) is the closed subspace of the SobolevRspace H 1( ) consisting of functions with zero mean. It is seen that jvj2 := jrvj2dx de nes a norm 1 on H 1 ( ) and so k(u; c)k2 := ju1j2 + ju2 j2 + cT c de nes a norm on V . 1 1 V For u 2 H 1 ( )2 let u0 2 H 1 ( ) be the solution of (rv; ru0) = (rv; u) for all v 2 H 1 ( ), where ( ; ) denotes the L2( ) inner product. Let the function values of u0 on the measurement points be denoted by

u Pu := (u0(x1 ); : : :; u0(xn ))T :

Furthermore, let 1 T 2 IRn;3 X := x1 xn 1 and y := (y1; : : :; yn)T . Using all of this one can introduce the functional

u c u c J (u; c) = n?1 (Pu + Xc ? y )T (Pu + Xc ? y ) + (ju1j2 + ju2 j2) 1 1

and the associated bilinear form a and linear form F given by

u c v d a(u; c; v ; d) = n?1(Pu + Xc)T (Pv + Xd) + ((u1; v1)1 + (u2 ; v2)1) v d F (v ; d) = n?1(Pv + Xd)T y :

Now we can de ne our new spline in terms of the minimiser of J over the space V . The following theorem shows that such a minimiser exists and is unique. Theorem 1. If the data points fxi g are not collinear then there is exactly one (u; c) 2 V such that

J (u; c) J (v ; d); for all (v ; d) 2 V ,

and this (u; c) 2 V is also determined by the variational equations

a(u; c; v ; d) = F (v ; d); for all (v ; d) 2 V :

nuity of a and F and the V-ellipticity of a.

Proof: The proof of this theorem is based on the establishment of the conti-

248

M. Hegland, S. Roberts, and I. Altas

The continuity of a and F follows immediately from the fact that the u mapping u 7! Pu is continuous from H 1 ( )2 to IRn . This follows from standard regularity theorems for elliptic problems which imply that if u 2 H 1 ( )2 then u0 2 H 2 ( ). The pointwise control is provided by the Sobolev imbedding theorem which states that H 2 ( ) C 0 ( ). Indeed, there exists a constant C such that ku0kL1 C kuk1 . To prove that a is V -elliptic, we note that u c u c c u u (Pu + Xc)T (Pu + Xc) (1 ? )cT X T Xc + (1 ? ?1)PuT Pu for any < 1, and so c u u a (u; c; u; c) n?1(1 ? )cT X T Xc + n?1(1 ? ?1)PuT Pu + (ju1j2 + ju2j2 ): 1 1 Provided > 0 and X T X is positive de nite, that is the data points are not collinear, we can make a choice of close to 1 which ensures that n?1 (1 ? ?1 )Pu T Pu is smaller than (ju1 j2 + ju2 j2 ). Consequently there exists a u u 1 1 positive constant CE so that a (u; c; u; c) CE k(u; c)k2 . In particular the V bilinear form a is V -elliptic. With this new method we have not guaranteed that u is the gradient of a function. To ensure this we need to apply the condition curl(u) = 0, where curl(u) = u2; 1 ? u1; 2 . The space W = f(u; c) 2 V j curl(u) = 0g is a closed subspace of V , and so one has Theorem 2. If the data points fxi g are not collinear, then there is exactly one (u; c) 2 W such that J (u; c) J (v ; d); for all (v ; d) 2 W: If u0 2 H 1 ( ) such that (rv; ru0) = (rv; u); for all v 2 H 1 ( ), then f ; (x) = u0 (x) + cT x is the thin-plate spline over , i.e., the minimiser of the original thin plate spline functional de ned over . Proof: The existence and uniqueness of (u; c) follows from the closedness of W in V 6]. For any (u; c) 2 W with u0 as de ned above one gets u = ru0 . With f (x) := u0(x) + cT x one observes that the value of J (ru0 ; c) is equal to the value of the original thin plate spline functional in terms of f . Thus one gets a rst order characterisation of the thin-plate smoothing spline for a compact domain . In practical tests it was seen that the di erence between the minima over V and W is small. Thus in the following we will discuss the minimisation over V which provides an alternative to Duchon's thin plate spline f ; .

Zero Curl Condition

Finite Element Thin Plate Splines

249

The discretisation of the problem follows the standard nite element framework. We consider a nite dimensional subspace Xh H 1( ), e.g., continuous piecewise bilinear functions on quadrilaterals or piecewise linear func2 tion on triangles. The space Vh approximating V is simply Xh IR3 , where Xh = Xh \ H 1 ( ). In these numerical experiments we are not enforcing the zero curl condition, and so can use simple H 1 elements, in particular continuous piecewise bilinear functions on rectangles. Of course to actually implement this method we need to be able to solve the Poisson equation for u0 which we can only do approximately. For this reason we introduce the following discrete functional c c J ;h(u; c) = n?1(Ph u + Xc ? y )T (Phu + Xc ? y ) + (ju1j2 + ju2 j2) 1 1 and the associated bilinear form ah and the linear form Fh c d ah (u; c; v ; d) = n?1(Ph u + Xc)T (Ph v + Xd) + ((u1; v1 )1 + (u2; v2 )1) d Fh(v ; d) = n?1(Ph v + Xd)T y : Here Ph u is the vector of pointwise values of the function u0;h which solves the discrete Poisson problem (rvh ; ru0;h) = (rvh ; u), for all vh 2 Xh. Using the methods of the last section, together with a standard L1 convergence bound for nite element approximations of the Poisson equation, it is possible to show that ah and Fh are continuous and that ah is V -elliptic. The discrete spline is de ned as the minimiser of J ;h over Vh . This is a non-conforming method since ah 6= a and Fh 6= F , but we can show that Theorem 3. If the data points fxi g are not collinear, then there is exactly one (uh ; ch ) 2 Vh such that J ;h(uh; ch) J ;h(v h; dh); for all (v h ; dh) 2 Vh and this (uh; ch) 2 Vh is also determined by the variational equations ah (uh ; ch ; v h ; dh ) = Fh (v h ; dh ); for all (v h ; dh ) 2 Vh : In addition, for any h0 > 0 and > 0 there exists a positive constant C such that k(u; c) ? (uh; ch)kV Ch1? n?1=2ky kIRn for all h < h0 . Here (u; c) is the solution of the continuous problem described in Theorem 1. Hence our method is essentially rst order accurate in the V norm. The appearing in this theorem is due to the fact that H 1+ ( ) C 0 ( ) for any > 0. The bound on k(u; c) ? (uh; ch)kV follows from the fact that ja(u; c; v ; d) ? ah (u; c; v ; d)j C h1? k(u; c)kV k(v ; d)kV kF (v ; d) ? Fh(v ; d)k C h1? n?1=2 ky kIRn k(v ; d)kV for (u; c); (v ; d) 2 V .

x3. Discretisation of the First Order Thin Plate Spline

250

M. Hegland, S. Roberts, and I. Altas

The Equations

Now we can describe the equations we must solve. We need to nd (uh; ch) 2 Vh and u0;h 2 Xh. Let b0; b1; : : :; bm be a basis of Xh such thatP0mx) = 1 for b( all x 2 . Then one can uniquely represent v 2 Xh by v = c + i=1 vi bi . Now set v j = (vj1; : : :; vjm )T , j = 0; 1; 2. Then one gets the matrix representation of J as

c v J (v 0; v 1 ; v 2 ; c) = n?1 kNv 0 + Xck2 + (v T Av 1 + v T Av 2 ); 2 v 1 v

where k k denotes the Euclidian norm in IRn , N ]ij = bj (xi ) and A]ij = (rbi ; rbj ); i; j = 1; : : :; m. Note that A is positive de nite as b0 (which corresponds to the nullspace of the Laplacian) has been treated separately. This representation has also been chosen so that the terms with ci drop out and v one gets the constraint Av 0 ? B1v 1 ? B2v 1 = 0, where Bk ]ij = (@bi=@ k ; bj ). This is a quadratic optimisation problem, and one gets the equivalent linear system of equations 10]

T A 0 0 0 ?B1 3 2 u1 3 2 0 3 T A 0 0 ?B2 7 6 u2 7 6 0 7 6 0 6 0 n?1N T N n?1N T X A 7 6 u0 7 = 6 n?1 N T y 7 ; 7 7 6 76 6 0 4 0 0 n?1 X T N n?1 X T X 0 5 4 c 5 4 n?1 X T y 5 0 w ?B1 ?B2 A 0 0

2

where w is the Lagrange multiplier vector. It is seen that the system matrix is symmetric inde nite and sparse as the blocks A, N T N and Bk are sparse. The equation is solved using a variant of Uzawa's method. The details of the solution method are described in 9]. All matrices not involving X or N can be treated as standard nite element matrices and are independent of the data size, n. The building up of the matrices N T N , N T X and X T X requires that each measurement is visited. However, the storage of the information and the further processing only depends on the resolution of the elements determined by m. Finally we would like to demonstrate our method on an example problem. For the following example we use the \peaks" function from Matlab. The peaks function consists of a linear combination of several scaled and translated Gaussian distributions. All the computations were done with Matlab on a Sun Sparc workstation at ANU. We compare our method with Duchon's standard thin plate splines and with nite element least square tting. We use 500 random equally distributed data points on 0; 1]2. To the data which is in the range of ?10; 10] a data error normally distributed with expectation 0 and standard deviation 0.1 was added. As the choice of the smoothing parameter is not discussed here we choose = 10?5 xed.

x4. Examples

Finite Element Thin Plate Splines

exact duchon tps fem least squares fem

251

10 10

10

10

0

0

0

0

?10 1 0.5 0 0 0.5

?10 1 1

0.5 0 0

?10 1 1 0.5

0.5 0 0

?10 1 1 0.5

0.5 0 0

1 0.5

data points 1 0.8 0.6 0.4 0.2 0 ?10 1 10

duchon ? tpsfem

lsfem ? tpsfem

10

0

0

?10 1 0.5 1 0.5 0 0 0.5 0 0 1 0.5

0

0.5

1

Fig. 1. Comparison of thin plate splines (duchon), First order Thin plate splines (tps fem) and Finite element least square t (lsfem).

As can be seen in Figure 1, the thin plate spline (duchon) and the rst order thin plate spline (tpsfem) produce very similar results except in the region close to the boundary. Remember that the thin plate splines are computed for the domain IR2 where the rst order thin plate spline is computed for the square. Furthermore, the curl condition is not enforced. The nite element least squares solution has too few points and produces an ill-conditioned linear system and poor results. Let us summarise the main points of our paper. We have described a new discrete thin plate spline based on a rst order formulation of the variational problem. This provides a method which can deal with large data sets, and produces results similar to Duchon's standard thin plate splines. Acknowledgments. This work was supported by the Cooperative Research Centre for Advanced Computational Systems at the Australian National University.

x5. Conclusions

References

1. Arcangeli, R., Some applications of discrete Dm splines, in Mathematical Methods in Computer Aided Geometric Design, T. Lyche and L. L. Schumaker (eds.), Academic Press, New York, 1989, 35{44.

252

M. Hegland, S. Roberts, and I. Altas

2. Beatson, R. K. and G. N. Newsam, Fast evaluation of radial basis functions. I, Comput. Math. Appl. 24 (1992), 7{19. 3. Beatson, R. K. and M. J. D. Powell, An iterative method for thin plate spline interpolation that employs approximations to Lagrange functions, Pitman Res. Notes Math. Ser. 303 (1994), 17{39. 4. Beatson, R. K., G. Goodsell, and M. J. D. Powell, On multigrid techniques for thin plate spline interpolation in two dimensions, Lectures in Appl. Math. 32 (1996), 77{97. 5. Berry, M. J. A. and G. S. Lino , Data Mining Techniques for Marketing, Sales and Customer Support, John Wiley and Sons, New York, 1997. 6. Ciarlet, P. G., The Finite Element Method for Elliptic Problems, NorthHolland, Netherlands, 1978. 7. Duchon, J., Splines minimizing rotation-invariant semi-norms in Sobolev spaces, Lecture Notes in Math. 571 (1977), 85{100. 8. Hastie, T. J. and R. J. Tibshirani, Generalized Additive Models, Chapman and Hall, London, 1990. 9. Hegland, M., S. Roberts, and I. Altas, Finite element thin plate splines for smoothing applications, submitted to Computational Techniques and Applications: CTAC97. 10. Golub, G. H. and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 1989. 11. Sibson, R. and G. Stone, Computation of thin-plate splines, SIAM J. Sci. Statist. Comput. 12 (1991), 1304{1313. 12. Torrens, J. J., F. J. Seron, M. C. Lopez de Silanes, R. Arcangeli, and D. Apprato, Finite-element interpolation of nonregular parametric surfaces, (Contributions to the computation of curves and surfaces), (Puerto de la Cruz), 1989, 101{108. Markus Hegland Computer Sciences Laboratory The Australian National University Canberra ACT 0200, AUSTRALIA

markus.hegland@anu.edu.au

stephen.roberts@anu.edu.au

Stephen Roberts School of Mathematical Sciences The Australian National University Canberra ACT 0200, AUSTRALIA

Irfan Altas, School of Information Studies Charles Sturt University Wagga Wagga, NSW 2678, AUSTRALIA

ialtas@csu.edu.au

更多相关文档:

COMPACT SUPPORT *THIN* *PLATE* *SPLINE* ALGORI

Altas, *Finite* *Element* *Thin* *Plate* *Splines* *for* *Data* *Mining* *Applications*, in Mathematical Methods *for* Curves and Surfaces II, M. Daehlen, T. Lyche and L....

A kind of bivariate *spline* space over rectangular partition and pure bending of *thin* *plate*_数学_自然科学_专业资料。维普资讯 http://www.cqvip.com 一 %...

as the number of observations *for* *data* *mining* *applications* is often in ...We have developed a ?nite *element* approximation of a *thin* *plate* *spline* ...

and *thin* *plate* *splines*, multivariate Kriging and Kriging *for* large *data* ...kzs A spatial smoothing algorithm based on convolutions of *finite* rectangular...

139-151, 1998 Interpolation of Rainfall *Data* with *Thin* *Plate* Smoothing *Splines* - Part I: Two Dimensional Smoothing of *Data* with Short Range Correlation ...

Registration of Medical Image *Data* Using Approximating *Thin*-*Plate* *Splines*", ... Approximating *Thin*-Pla... 暂无评价 14页 免费 *Finite* *Element* *Thin* Pl.....

Conference on Pattern Recognition, Cambridge, UK, August 2004 Pose Invariant Affect Analysis using *Thin*-*Plate* *Splines* Abstract This paper introduces a method ...

The final *element* *for* the *thin*-*plate* *splines* image warping applet is the ImageWarpFunction class. This class implements the *thin*-*plate* *splines* interpolation...

surface Laplacian, diffusion, *finite* *element*, *thin*-*plate* *spline*, ...However, when real-world *data* is acquired, fiducials are anticipated to be...

Fitting scattered *data* on sphere-like surfaces ...*For* interpolation, we discuss macro-*element* methods... spherical analogs of *thin* *plate* *splines* 32, 34...

4. Typical *thin*-walled section Explicit stiffness matrix expressions *for* *plate*...*data*base of values computed through *finite* *element* analysis *for* a 338 T.D...

To ?t a *thin* *plate* *spline* with 2500 *data* points as knots is not only ...segmentation methods, and | | the number of *elements* in the set in ...

joint of *thin* *plates*_机械/仪表_工程科技_专业资料... Ecole des *Mines* de Douai, BP 10838, F-59508...Fig. 2. *Finite* *Element* mesh and a cross-...

In our experiments, we have used a *data* set ...*element* approximations) - we solve the *splines* \...*For* example, the *thin*-*plate* *spline* in 2D is ...

Magnetic Resonance Image Segmentation with *Thin* *Plate* *Spline* Thresholding_IT认证_资格考试/认证_教育专区。There are many ways to segment magnetic resonance ...

更多相关标签:

相关文档

- Data Mining For Space Applications
- Bijective Image Registration using Thin-Plate Splines.
- Electric potential approximations for an eight node plate finite element
- Spatial Data Mining and Its Applications
- Finite Element Analysis of Thin Steel Plate
- Web usage mining Discovery and applications of usage patterns from web data
- Nonlinear Finite Element Analysis of Thin Strip Temper Rolling Process
- Data mining techniques and applications – A decade review from 2000 to 2011
- Introduction to the special issue on successful realworld data mining applications
- Developing innovative applications in agriculture using data mining

文档资料共享网 nexoncn.com
copyright ©right 2010-2020。

文档资料共享网内容来自网络，如有侵犯请联系客服。email:zhit325@126.com