On dit qu'une fonction booléenne f : { 0 , 1 } n → { 0 , 1 }f:{0,1}n→{0,1} est une k-k junta si ff a au plus kk variables d'influence.
Soit f : { 0 , 1 } n → { 0 , 1 }f:{0,1}n→{0,1} une junte de 2 k2k . Notons les variables de ff par x 1 , x 2 , … , x nx1,x2,…,xn . Fix
S 1 = { x 1 , x 2 , … , x n2 },S 2 = { x n2 +1,xn2 +2,…,xn}.
S1={x1,x2,…,xn2},S2={xn2+1,xn2+2,…,xn}.
Clairement, il existe
S∈{S1,S2}S∈{S1,S2}tel que
SScontient au moins
kkdes variables d'influence de
ff.
Soit maintenant ϵ > 0ϵ>0 , et supposons que f : { 0 , 1 } n → { 0 , 1 }f:{0,1}n→{0,1} est ϵ-ϵ loin de chaque 2 k-2k junta (c'est-à-dire qu'il faut changer une fraction d'au moins ϵϵ des valeurs de ff pour en faire une junte à 2 k2k ). Pouvons-nous faire une version "robuste" de la déclaration ci-dessus? Autrement dit, existe-t-il une constante universelle cc et un ensemble S ∈ { S 1 , S 2 }S∈{S1,S2}tel que ff est ϵc-ϵc loin de chaque fonction qui contient au pluskkvariables d'influence dansSS?
Remarque: Dans la formulation originale de la question, cc était fixé à 22 . L'exemple de Neal montre qu'une telle valeur de cc ne suffit pas. Cependant, comme dans les tests de propriétés, nous ne nous préoccupons généralement pas trop des constantes, j'ai un peu assoupli la condition.
Réponses:
La réponse est oui". La preuve est par contradiction.
Pour plus de commodité, notons les premières n / 2 variables par x et les secondes n / 2 variables par y . Supposons que f ( x , y ) soit δ- proche d'une fonction f 1 ( x , y ) qui ne dépend que des coordonnées k de x . Désignons ses coordonnées influentes par T 1 . De même, supposons que f ( x , y ) soitn/2 x n/2 y f(x,y) δ f1(x,y) k x T1 f(x,y) δδ -close to a function f2(x,y)f2(x,y) which depends only on kk coordinates of yy . Denote its influential coordinates by T2T2 . We need to prove that ff is 4δ4δ - close to a 2k2k -junta ˜f(x,y)f~(x,y) .
Disons que ( x 1 , y 1 ) ∼ ( x 2 , y 2 ) si x 1 et x 2 s'accordent sur toutes les coordonnées en T 1 et y 1 et y 2 s'accordent sur toutes les coordonnées en T 2 . Nous choisissons uniformément au hasard un représentant de chaque classe d'équivalence. Soit ( ˉ x , ˉ y ) le représentant de la classe de ( x ,(x1,y1)∼(x2,y2) x1 x2 T1 y1 y2 T2 (x¯,y¯) y ) . Définissez ˜ f comme suit:
˜ f ( x , y ) = f ( ˉ x , ˉ y ) .(x,y) f~
Il est évident que ˜ f est une junte de 2 k (cela ne dépend que des variables de T 1 ∪ T 2 ) . Nous prouverons qu'il est à la distance 4 δ de f dans l'attente.f~ 2k T1∪T2) 4δ f
Nous voulons prouver que Pr ˜ f ( Pr x , y ( ˜ f ( x , y ) ≠ f ( x , y ) ) ) = Pr ( f ( ˉ x , ˉ y ) ≠ f ( x , y ) ) ≤ 4 δ , où x et y sont choisis uniformément au hasard. Considérons un vecteur aléatoire
We have, Pr(f(x,y)≠f(˜x,y))≤Pr(f(x,y)≠f1(x,y))+Pr(f1(x,y)≠f1(˜x,y))+Pr(f1(˜x,y)≠f(˜x,y))≤δ+0+δ=2δ.
Similarly, Pr(f(˜x,y)≠f(˜x,˜y))≤2δPr(f(x~,y)≠f(x~,y~))≤2δ . We have
Pr(f(ˉx,ˉy)≠f(x,y))≤4δ.
It easy to “derandomize” this proof. For every (x,y)(x,y) , let ˜f(x,y)=1f~(x,y)=1 if f(x,y)=1f(x,y)=1 for most (x′,y′)(x′,y′) in the equivalence class of (x,y)(x,y) , and ˜f(x,y)=0f~(x,y)=0 , otherwise.
la source
The smallest cc that the bound holds for is c=1√2−1≈2.41c=12√−1≈2.41 .
Lemmas 1 and 2 show that the bound holds for this cc .
Lemma 3 shows that this bound is tight.
(In comparison, Juri's elegant probabilistic argument gives c=4c=4 .)
Let c=1√2−1c=12√−1 .
Lemma 1 gives the upper bound for k=0k=0 .
Lemma 1: If ff is ϵgϵg -near a function gg that has no influencing variables
in S2S2 , and ff is ϵhϵh -near a function hh that has no influencing variables
in S1S1 , then ff is ϵϵ -near a constant function,
where ϵ≤(ϵg+ϵh)/2cϵ≤(ϵg+ϵh)/2c .
Proof. Let ϵϵ be the distance from ff to a constant function.
Suppose for contradiction that ϵϵ does not satisfy the claimed inequality.
Let y=(x1,x2,…,xn/2)y=(x1,x2,…,xn/2) and z=(xn/2+1,…,xn)z=(xn/2+1,…,xn)
and write ff , gg , and hh as f(y,z)f(y,z) , g(y,z)g(y,z) and h(y,z)h(y,z) ,
so g(y,z)g(y,z) is independent of zz and h(y,z)h(y,z) is independent of yy .
(I find it helpful to visualize ff as the edge-labeling
of the complete bipartite graph with vertex sets {y}{y} and {z}{z} ,
where gg gives a vertex-labeling of {y}{y} ,
and hh gives a vertex-labeling of {z}{z} .)
Let g0g0 be the fraction of pairs (y,z)(y,z) such that g(y,z)=0g(y,z)=0 .
Let g1=1−g0g1=1−g0 be the fraction of pairs such that g(y,z)=1g(y,z)=1 .
Likewise let h0h0 be the fraction of pairs such that h(y,z)=0h(y,z)=0 ,
and let h1h1 be the fraction of pairs such that h(y,z)=1h(y,z)=1 .
Without loss of generality, assume that, for any pair such that g(y,z)=h(y,z)g(y,z)=h(y,z) ,
it also holds that f(y,z)=g(y,z)=h(y,z)f(y,z)=g(y,z)=h(y,z) . (Otherwise, toggling the value of
f(y,z)f(y,z) allows us to decrease both ϵgϵg and ϵhϵh by 1/2n1/2n ,
while decreasing the ϵϵ by at most 1/2n1/2n ,
so the resulting function is still a counter-example.) Say any such pair is ``in agreement''.
The distance from ff to gg plus the distance from ff to hh
is the fraction of (x,y)(x,y) pairs that are not in agreement.
That is, ϵg+ϵh=g0h1+g1h0ϵg+ϵh=g0h1+g1h0 .
The distance from ff to the all-zero function is at most 1−g0h01−g0h0 .
The distance from ff to the all-ones function is at most 1−g1h11−g1h1 .
Further, the distance from ff to the nearest constant function is at most 1/21/2 .
Thus, the ratio ϵ/(ϵg+ϵh)ϵ/(ϵg+ϵh) is at most
min(1/2,1−g0h0,1−g1h1)g0h1+g1h0,
By calculation, this ratio is at most 12(√2−1)=c/212(2√−1)=c/2 . QED
Lemma 2 extends Lemma 1 to general kk by arguing pointwise,
over every possible setting of the 2k2k influencing variables.
Recall that c=1√2−1c=12√−1 .
Lemma 2: Fix any k. If f is ϵg-near a function g that has k influencing variables in S2, and f is ϵh-near a function h that has k influencing variables in S1, then f is ϵ-near a function ˆf that has at most 2k influencing variables, where ϵ≤(ϵg+ϵh)/2c.
Proof. Express f as f(a,y,b,z) where (a,y) contains the variables in S1 with a containing those that influence h, while (b,z) contains the variables in S2 with b containing those influencing g. So g(a,y,b,z) is independent of z, and h(a,y,b,z) is independent of y.
For each fixed value of a and b, define Fab(y,z)=f(a,y,b,z), and define Gab and Hab similarly from g and h respectively. Let ϵgab be the distance from Fab to Gab (restricted to (y,z) pairs). Likewise let ϵhab be the distance from Fab to Hab.
By Lemma 1, there exists a constant cab such that the distance (call it ϵab) from Fab to the constant function cab is at most (ϵhab+ϵgab)/(2c). Define ˆf(a,y,b,z)=cab.
Clearly ˆf depends only on a and b (and thus at most k variables).
Let ϵˆf be the average, over the (a,b) pairs, of the ϵab's, so that the distance from f to ˆf is ϵˆf.
Likewise, the distances from f to g and from f to h (that is, ϵg and ϵh) are the averages, over the (a,b) pairs, of, respectively, ϵgab and ϵhab.
Since ϵab≤(ϵhab+ϵgab)/(2c) for all a,b, it follows that ϵˆf≤(ϵg+ϵh)/(2c). QED
Lemma 3 shows that the constant c above is the best you can hope for (even for k=0 and ϵ=0.5).
Lemma 3: There exists f such that f is (0.5/c)-near two functions g and h, where g has no influencing variables in S2 and h has no influencing variables in S1, and f is 0.5-far from every constant function.
Proof. Let y and z be x restricted to, respectively, S1 and S2. That is, y=(x1,…,xn/2) and z=(xn/2+1,…,xn).
Identify each possible y with a unique element of [N], where N=2n/2. Likewise, identify each possible z with a unique element of [N]. Thus, we think of f as a function from [N]×[N] to {0,1}.
Define f(y,z) to be 1 iff max(y,z)≥1√2N.
By calculation, the fraction of f's values that are zero is (1√2)2=12, so both constant functions have distance 12 to f.
Define g(y,z) to be 1 iff y≥1√2N. Then g has no influencing variables in S2. The distance from f to g is the fraction of pairs (y,z) such that y<1√2N and z≥1√2N. By calculation, this is at most 1√2(1−1√2)=0.5/c
Similarly, the distance from f to h, where h(y,z)=1 iff z≥1√2N, is at most 0.5/c.
QED
la source