How can one prove that the radial basis function $k(x, y) = \exp\left(-\frac{\|x-y\|^2}{2\sigma^2}\right)$ is a kernel? As far as I understand, to prove this we have to show one of the following:
For any set of vectors $x_1, x_2, \dots, x_n$, the matrix $K(x_1, x_2, \dots, x_n) = (k(x_i, x_j))_{n \times n}$ is positive semidefinite.
A mapping $\Phi$ can be presented such that $k(x, y) = \langle \Phi(x), \Phi(y) \rangle$.
Any help?
svm
kernel-trick
Leo
Answers:
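Zen used method 1. Here is method 2: Map $x$ to a spherically symmetric Gaussian distribution centered at $x$ in the Hilbert space $L^2$. The standard deviation and a constant factor have to be tweaked for this to work exactly. For example, in one dimension,
$$\int_{-\infty}^{\infty} \frac{\exp[-(x-z)^2/(2\sigma^2)]}{\sqrt{2\pi}\,\sigma} \, \frac{\exp[-(y-z)^2/(2\sigma^2)]}{\sqrt{2\pi}\,\sigma} \, dz = \frac{\exp[-(x-y)^2/(4\sigma^2)]}{2\sqrt{\pi}\,\sigma}.$$
So, use a standard deviation of $\sigma/\sqrt{2}$ and scale the Gaussian distribution to get $k(x,y) = \langle \Phi(x), \Phi(y) \rangle$. This last rescaling occurs because the $L^2$ norm of a normal distribution is not 1 in general.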
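For anyone who wants to sanity-check method 2 numerically, here is a minimal sketch in one dimension. It assumes the feature map is the Gaussian density with standard deviation $\sigma/\sqrt{2}$ centered at the point, and it approximates the $L^2$ inner product with a plain Riemann sum. The function names, the grid settings, and the test values of `sigma`, `x`, `y` are my own choices, not part of the answer.

```python
import numpy as np

def gaussian_pdf(z, mean, sd):
    """Density of N(mean, sd^2) evaluated at z."""
    return np.exp(-(z - mean) ** 2 / (2 * sd ** 2)) / (np.sqrt(2 * np.pi) * sd)

def l2_inner_product(x, y, sd, half_width=12.0, n=200001):
    """Riemann-sum approximation of the integral of pdf_x(z) * pdf_y(z) dz."""
    lo = min(x, y) - half_width * sd
    hi = max(x, y) + half_width * sd
    z = np.linspace(lo, hi, n)
    dz = z[1] - z[0]
    return float(np.sum(gaussian_pdf(z, x, sd) * gaussian_pdf(z, y, sd)) * dz)

def rbf(x, y, sigma):
    """The RBF kernel exp(-(x - y)^2 / (2 sigma^2)) for real x, y."""
    return float(np.exp(-(x - y) ** 2 / (2 * sigma ** 2)))

# Hypothetical test values, chosen only for illustration.
sigma = 1.3
x, y = 0.4, -1.1

sd = sigma / np.sqrt(2)            # standard deviation used for the feature map
scale = 2 * np.sqrt(np.pi) * sd    # undoes the squared L2 norm of the density

approx = scale * l2_inner_product(x, y, sd)
exact = rbf(x, y, sigma)
print(approx, exact)
```

The two printed numbers should agree up to the discretization error of the sum, which is what the rescaled inner product predicts.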
I will use method 1. Check Douglas Zare's answer for a proof using method 2.
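I will prove the case when $x, y$ are real numbers, so $k(x,y) = \exp(-(x-y)^2 / 2\sigma^2)$. The general case follows mutatis mutandis from the same argument, and is worth doing.
Without loss of generality, suppose that $\sigma^2 = 1$.
Write $k(x,y) = h(x-y)$, where
$$h(t) = \exp\left(-\frac{t^2}{2}\right) = \mathrm{E}\left[e^{itZ}\right]$$
is the characteristic function of a random variable $Z$ with $N(0,1)$ distribution.
For real numbers $x_1, \dots, x_n$ and $a_1, \dots, a_n$, we have
$$\sum_{j,k=1}^n a_j a_k \, h(x_j - x_k) = \sum_{j,k=1}^n a_j a_k \, \mathrm{E}\left[e^{i(x_j - x_k)Z}\right] = \mathrm{E}\left[\sum_{j=1}^n a_j e^{i x_j Z} \sum_{k=1}^n a_k e^{-i x_k Z}\right] = \mathrm{E}\left[\left|\sum_{j=1}^n a_j e^{i x_j Z}\right|^2\right] \geq 0,$$
which shows that $k$ is a positive semidefinite function, that is, a kernel.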
To understand this result in greater generality, check out Bochner's Theorem: http://en.wikipedia.org/wiki/Positive-definite_function
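A small numerical illustration of method 1, under assumptions of my own: random points and coefficients, $\sigma^2 = 1$, and a Monte Carlo average standing in for the expectation over $Z \sim N(0,1)$. It checks that the quadratic form $\sum_{j,k} a_j a_k h(x_j - x_k)$ is nonnegative and compares it with the average of $|\sum_j a_j e^{i x_j Z}|^2$. This is only a sketch, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
xs = rng.normal(size=n)   # hypothetical sample points x_1, ..., x_n
a = rng.normal(size=n)    # hypothetical coefficients a_1, ..., a_n

# Gram matrix of h(x_j - x_k) with h(t) = exp(-t^2 / 2), i.e. sigma^2 = 1.
K = np.exp(-(xs[:, None] - xs[None, :]) ** 2 / 2)

# Quadratic form sum_{j,k} a_j a_k h(x_j - x_k) and smallest eigenvalue of K.
quad_form = float(a @ K @ a)
min_eig = float(np.linalg.eigvalsh(K).min())

# Monte Carlo estimate of E|sum_j a_j exp(i x_j Z)|^2 with Z ~ N(0, 1).
z = rng.normal(size=20000)
s = np.exp(1j * np.outer(z, xs)) @ a   # one complex sum per sample of Z
mc_estimate = float(np.mean(np.abs(s) ** 2))

print(quad_form, mc_estimate, min_eig)
```

Every Monte Carlo sample is a squared modulus, so the estimate cannot be negative; that is the whole argument in miniature. The smallest eigenvalue may come out as a tiny negative number because of floating-point rounding.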
I'll add a third method, just for variety: building up the kernel from a sequence of general steps known to create pd kernels. Let $\mathcal{X}$ denote the domain of the kernels below and $\varphi$ the feature maps.
Scalings: If $\kappa$ is a pd kernel, so is $\gamma\kappa$ for any constant $\gamma > 0$.
Proof: if $\varphi$ is the feature map for $\kappa$, then $\sqrt{\gamma}\,\varphi$ is a valid feature map for $\gamma\kappa$.
Sums: If $\kappa_1$ and $\kappa_2$ are pd kernels, so is $\kappa_1 + \kappa_2$.
Proof: Concatenate the feature maps $\varphi_1$ and $\varphi_2$ to get $x \mapsto \begin{bmatrix} \varphi_1(x) \\ \varphi_2(x) \end{bmatrix}$.
Limits: If $\kappa_1, \kappa_2, \dots$ are pd kernels, and $\kappa(x,y) := \lim_{n\to\infty} \kappa_n(x,y)$ exists for all $x, y$, then $\kappa$ is pd.
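Proof: For each $m, n \geq 1$ and every $\{(x_i, c_i)\}_{i=1}^m \subseteq \mathcal{X} \times \mathbb{R}$ we have that $\sum_{i=1}^m \sum_{j=1}^m c_i \kappa_n(x_i, x_j) c_j \geq 0$. Taking the limit as $n \to \infty$ gives the same property for $\kappa$.
Products: If $\kappa_1$ and $\kappa_2$ are pd kernels, so is $g(x,y) = \kappa_1(x,y)\,\kappa_2(x,y)$.
Proof: It follows immediately from the Schur product theorem, but Schölkopf and Smola (2002) give the following nice, elementary proof. Let
$$(V_1, \dots, V_m) \sim N\left(0, [\kappa_1(x_i, x_j)]_{ij}\right), \qquad (W_1, \dots, W_m) \sim N\left(0, [\kappa_2(x_i, x_j)]_{ij}\right)$$
be independent. Then
$$\mathrm{Cov}(V_i W_i, V_j W_j) = \mathrm{Cov}(V_i, V_j)\,\mathrm{Cov}(W_i, W_j) = \kappa_1(x_i, x_j)\,\kappa_2(x_i, x_j).$$
Covariance matrices are psd, so the covariance matrix of $(V_1 W_1, \dots, V_m W_m)$, which is $[g(x_i, x_j)]_{ij}$, is psd.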
Powers: If $\kappa$ is a pd kernel, so is $\kappa^n(x,y) := \kappa(x,y)^n$ for any positive integer $n$.
Proof: immediate from the "products" property.
Exponents: If $\kappa$ is a pd kernel, so is $e^\kappa(x,y) := \exp(\kappa(x,y))$.
Proof: We have $e^\kappa(x,y) = \lim_{N\to\infty} \sum_{n=0}^N \frac{1}{n!} \kappa(x,y)^n$; use the "powers", "scalings", "sums", and "limits" properties.
Functions: If $\kappa$ is a pd kernel and $f : \mathcal{X} \to \mathbb{R}$, then $g(x,y) := f(x)\,\kappa(x,y)\,f(y)$ is as well.
Proof: Use the feature map $x \mapsto f(x)\,\varphi(x)$.
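The closure properties listed above can be spot-checked on Gram matrices. The sketch below is only a sanity check on one random sample, not a proof. The points `X`, the two base kernels (linear and a degree-2 polynomial), the function `f`, and the helper `is_psd` are all hypothetical choices of mine.

```python
import numpy as np

def is_psd(K, rel_tol=1e-10):
    """True if symmetric K has no eigenvalue below a small negative tolerance."""
    eig = np.linalg.eigvalsh(K)
    return bool(eig.min() >= -rel_tol * max(eig.max(), 1.0))

rng = np.random.default_rng(0)
X = 0.5 * rng.normal(size=(20, 3))   # 20 hypothetical points in R^3

K1 = X @ X.T                 # Gram matrix of the linear kernel x^T y
K2 = (1.0 + X @ X.T) ** 2    # Gram matrix of the polynomial kernel (1 + x^T y)^2
f = np.cos(X[:, 0])          # values f(x_i) of an arbitrary real function

checks = {
    "scalings": is_psd(3.0 * K1),
    "sums": is_psd(K1 + K2),
    "products": is_psd(K1 * K2),        # elementwise (Schur) product
    "powers": is_psd(K1 ** 3),          # elementwise power
    "exponents": is_psd(np.exp(K1)),    # elementwise exponential
    "functions": is_psd(f[:, None] * K1 * f[None, :]),
}
print(checks)
```

The tolerance in `is_psd` is there because several of these matrices are rank-deficient, so their zero eigenvalues come out as tiny positive or negative numbers in floating point.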
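Now, note that
$$k(x,y) = \exp\left(-\frac{1}{2\sigma^2}\|x-y\|^2\right) = \exp\left(-\frac{\|x\|^2}{2\sigma^2}\right) \exp\left(\frac{x^T y}{\sigma^2}\right) \exp\left(-\frac{\|y\|^2}{2\sigma^2}\right).$$
Start with the linear kernel $\kappa(x,y) = x^T y$, apply "scalings" with $\gamma = 1/\sigma^2$, apply "exponents", and apply "functions" with $f(x) = \exp\left(-\frac{\|x\|^2}{2\sigma^2}\right)$.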
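As a quick check of the factorization $k(x,y) = f(x)\,\exp(x^T y / \sigma^2)\,f(y)$ with $f(x) = \exp(-\|x\|^2 / (2\sigma^2))$, which follows from expanding $\|x-y\|^2 = \|x\|^2 - 2x^T y + \|y\|^2$, here is a minimal sketch. The dimension, the random vectors, and the value of `sigma` are arbitrary assumptions.

```python
import numpy as np

def rbf(x, y, sigma):
    """RBF kernel computed directly from the squared distance."""
    return float(np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2)))

def rbf_factored(x, y, sigma):
    """Same kernel as f(x) * exp(x^T y / sigma^2) * f(y)."""
    def f(v):
        return np.exp(-np.dot(v, v) / (2 * sigma ** 2))
    return float(f(x) * np.exp(np.dot(x, y) / sigma ** 2) * f(y))

rng = np.random.default_rng(1)
x = rng.normal(size=4)   # hypothetical vectors in R^4
y = rng.normal(size=4)
sigma = 0.8

direct = rbf(x, y, sigma)
via_factors = rbf_factored(x, y, sigma)
print(direct, via_factors)
```

The two values should match up to floating-point rounding, which is all the "scalings", "exponents", and "functions" steps need.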