Appendix S1: Multiscale analyses  1 
We used wavelet‐based techniques to analyze the spatial structure of MCH and elevation  in the study 2 
region, and  the  relationship between  the  two.   These  techniques have been widely adopted  to  study 3 
multi‐scale  processes,  because  wavelet  theory  is  based  on  scale‐wise  decomposition.  In  particular, 4 
wavelets have proven effective in extracting statistical properties of a variety of long‐range dependence 5 
phenomena,  including  fractals  and  other  scale‐invariant  processes,  in  one  or  more  dimensions 6 
(Heneghan, Lowen, & Teich, 1996). Here we focus on the variance of a 2‐D spatial process, studied using 7 
wavelet decomposition.  8 
Given  a  spatial process  f (x),with  the  vector x = [x, y] denoting Cartesian  coordinates,  the wavelet 9 
coefficients Wf (b,s,) are defined as 10 
1
,( , , ) ( ) ( )f sW s s f d     b x x b x             (S1) 11 
where  , ( )s  x b  is the complex conjugate of the scaled, translated and rotated mother wavelet, s the 12 
scale at which the transform  is applied, b the translation vector, and  the angle of orientation of the 13 
anisotropic  wavelet.  Usually  the  convolution  integral  is  computed  in  the  Fourier  transform  to  take 14 
advantage of the fast Fourier routine: 15 
1
,ˆ ˆˆ ( , , ) ( ) ( )f sW s s f   k k k               (S2) 16 
where ˄  indicates  2‐D  Fourier  transform  and  the  vector k = [kx, ky] denotes  the  angular  frequency 17 
space. One of the most commonly used wavelets is the Morlet, which has a Fourier transform given by 18 
   2 2, 01 1ˆ exp 2 2r rs x ys sk k skc                    (S3) 19 
where c  is a  constant  to ensure unit energy, k0  is a  shape parameter,   is  the anisotropic  ratio and 20 
[ ]r r rx yk kk  are the rotated angular frequencies  r k kR  with rotation matrix  21 
cos sin
sin cos( )
 
  
    R .  22 
If the problem is isotropic, it is possible to express the Morlet in a simplified isotropic form and remove 23 
the dependence on : 24 
 201ˆ exp 2s
s s kc
      k               (S4) 25 
The  functions    are  essentially  band‐pass  filters whose  properties  depend  on  the  associated  scaling 26 
parameter,  s.  They  enable  a  focus  on  features  of  the  process  whose  detail  matches  their  scale 27 
parameters,  i.e., broad  features  for  large  s and  fine  features  for  small  s  (Praveen Kumar & Foufoula‐28 
Georgiou, 1997). The wavelet variance or spectrum, Sff (s,), is computed as:  29 
2( , ) | ( , , ) |ff fAS s W s d   b b                (S5) 30 
integrated over a  sub‐domain A which  can be  considered weakly  stationary  (i.e.  the  first and  second 31 
order moments are spatially homogeneous). 32 
We calculated the wavelet spectrum for elevation and for MCH.  A plot of the wavelet spectrum – that 33 
is, the wavelet variance as function of scale s – depicts the contribution to the total spatial variability by 34 
structures or patterns with a typical scale comparable to s. A peak in the spectrum means that patterns 35 
of  a  characteristic  scale  are  dominant.  A  scale‐invariant  process  exhibits  no  peaks  in  the  wavelet 36 
spectrum; its wavelet variance is a power function of scale (Sff ~sa), meaning Sff appears linear on a log‐37 
log representation.  38 
We  examined whether  elevation  and MCH  had  anisotropic  patterns  by  performing  spectral  analyses 39 
with  an  anisotropic Morlet with    varying.    The  results were  evaluated  using    circular  plots  of  the 40 
wavelet variance, with scale as the radial coordinate and angle as the azimuthal coordinate (P Kumar, 41 
1995).  If there is more energy in some directions than others, this indicates the presence of anisotropy 42 
in  the spatial patterns  (which may exist  for a  limited number of scales only). The anisotropic analyses 43 
were carried out on circular subsets of the studied area to avoid edge effects.  44 
We  calculated  the  wavelet  co‐spectrum  and  wavelet  coherence  between  elevation  and  MCH,  to 45 
examine  how  they  are  related  at different  scales.    The wavelet  co‐spectrum, Cfg(s,), between  two 46 
spatial variables f (x) and g(x) shows how two processes co‐vary as functions of scale, and is defined as 47 
( , ) ( , , ) ( , , )fg f gAC s W s W s d    b b b                (S6) 48 
Combining wavelet spectra and cospectra we compute  the wavelet coherence, which  is a measure of 49 
correlation among spatial processes at different scales and different anisotropic angles:  50 
2( , )( , ) ( , ) ( , )
fg
ff gg
C swch s S s S s
                    (S7) 51 
The wavelet coherence  is analogous to an R2 value, and ranges from 0 to 1. We calculated confidence 52 
intervals  for  the  null  hypothesis  of  no  correlation  (H0:  wch = 0)  using  Monte  Carlo  methods.  53 
Specifically, we used the iterative amplitude adjusted Fourier transform (IAAFT) to randomize one of the 54 
processes (in our case MCH) in a way that preserves the same probability density function as the original 55 
template  and  also  preserves  its  structure  and  in  particular  the  second‐order moments  (the  spectral 56 
density  or  the  autocorrelation  function)  (Schreiber &  Schmitz,  2000; Venema,  2006).   We  generated 57 
1000  IAAFT  data  surrogates  for  MCH  in  a  rectangular  subset  (for  an  example,  see  supplemental 58 
materials  S3)  and  calculated  the  wavelet  coherence  with  elevation  for  each  surrogate.   We  then 59 
compared the coherence of true MCH and elevation with the 95% confidence  interval generated from 60 
these surrogates.   61 
Topographic and hydrological variables  62 
We calculated maps of slope, aspect, and various measures of curvature  (Table 1) based on elevation 63 
maps  smoothed at multiple  scales  (Lashermes, Foufoula‐Georgiou, & Dietrich, 2007).   Smoothing was 64 
performed using Gaussian kernels  for 61 different scales ranging  log‐evenly between 2.5 and 1250 m.  65 
We smoothed at multiple scales because the slope and other topographic variables take different values 66 
depending  on  the  degree  of  smoothing, which  essentially  gives  the  scale  at which  the  variables  are 67 
calculated, and MCH may be  related  to  the value based on smoothing at some scales but not others.  68 
(For no smoothing or only small‐scale smoothing, the topographic variables reflect very  local features, 69 
while  for  large  scale  smoothing  they  reflect  larger  scale  features.)   These  topographic variables were 70 
computed  using  linear  and  nonlinear  combinations  of  first  and  second  order  derivatives  of  the 71 
smoothed DEM (Table 1). High‐order derivatives of the DEM were computed from smoothed elevation 72 
maps  ( ) ( )Hh H h d   x x x x   where  h  is  the  elevation,  H  some  smoothing  function  (here  a 73 
Gaussian).  By the properties of the convolution, the nxm‐order derivative can be computed as 74 
( ) ( )
mn mn
m n m n
h N , h dx y x y
     
 x x x x
                (S8) 75 
where N(x,) is a Gaussian kernel with variance 2. The convolution is again efficiently computed using 76 
the FFT routine.  77 
Spectral  properties  of  elevation  derivatives  (or  any  linear  combination  thereof)  follow  directly  from 78 
elevation spectral properties. For example,  the Lapacian  is  the sum of  the second derivatives  in  the x 79 
and  y  directions,  and  thus  essentially  reflects  the  local  convexity  of  the  landscape.  Derivatives  of  a 80 
Gaussian are wavelets (the second derivative  is also called the Mexican Hat wavelet), which  in Fourier 81 
space are expressed, for any order m and n as 82 
 2( ) ( ) 1ˆ exp 2
m n
x y
s
ik ik sc
       ky                (S9) 
83 
Hence,  the wavelet  transform of  the DEM with a Mexican Hat  is  identical  (except  for a normalization 84 
factor)  to  the  Lapacian obtained after  smoothing  the elevation map with a Gaussian  kernel.  Spectral 85 
properties of non‐linear combinations of derivatives are less straightforward.  86 
We evaluated pairwise correlations of MCH with each topographic variable at each scale.   Specifically, 87 
for the noncircular variables, we calculated Pearson’s correlation coefficients.   For the circular variables 88 
of slope and aspect, we calculated linear‐circular correlations (Mardia, 1976).   89 
The drainage network was extracted from the DEM using the D8 algorithm (Jenson & Domingue, 1988; 90 
O’Callaghan & Mark,  1984)  implemented  in ArcGIS  (ESRI,  v10).  Channels were defined  as  those  that 91 
drain an area of 250 m2 or more. We qualitatively assessed  the  relationship of MCH  to  the drainage 92 
network by visually inspecting maps that overlaid MCH and the drainage network at different sub‐basin 93 
scales.  We then calculated maps of flow distance from nearest stream, defined as the distance that an 94 
imaginary water particle travels to reach the closest channel, and we correlated this distance with MCH 95 
across the landscape.  Finally, we computed the average value of MCH conditioned upon the distance to 96 
the nearest channel for the real dataset, and compared these with 95% confidence  intervals from the 97 
surrogates generated using IAAFT.  All analyses were done in Matlab v7 (R2010a). 98 
References 99 
Heneghan, C., Lowen, S., & Teich, M. (1996). Two‐dimensional fractional Brownian motion: wavelet 100 
analysis and synthesis. Proceedings of the IEEE Southwest Symposium on Image Analysis and 101 
Interpretation, 1996. (pp. 213–217).  102 
Jenson, S., & Domingue, J. (1988). Extracting topographic structure from digital elevation data for 103 
geographic information system analysis. Photogrammetric engineering and remote sensing, 54(11), 104 
1593–1600.  105 
Kumar, P. (1995). A wavelet based methodology for scale‐ space anisotropic analysis. Geophysical 106 
research letters, 22(20), 2777–2780. doi:10.1029/95GL02934 107 
Kumar, Praveen, & Foufoula‐Georgiou, E. (1997). Wavelet analysis for geophysical applications. Reviews 108 
of Geophysics, 35(4), 385–412. doi:10.1029/97RG00427 109 
Lashermes, B., Foufoula‐Georgiou, E., & Dietrich, W. E. (2007). Channel network extraction from high 110 
resolution topography using wavelets. Geophysical Research Letters, 34(23), 1–6. 111 
doi:10.1029/2007GL031140 112 
Mardia, K. (1976). Linear‐circular correlation and rhythmometry. Biometrika, 63, 403–405. 113 
O’Callaghan, J., & Mark, D. (1984). The extraction of drainage networks from digital elevation data. 114 
Computer vision, graphics, and image, 28(3), 323–344.  115 
Schreiber, T., & Schmitz, A. (2000). Surrogate time series. Physica D: Nonlinear Phenomena, 142(3‐4), 116 
346–382. doi:10.1016/S0167‐2789(00)00043‐9 117 
Venema, V. (2006). Surrogate cloud fields generated with the iterative amplitude adapted Fourier 118 
transform algorithm. Tellus, 58A, 104–120. 119 
 120 
%%wavelet variance Matlab code,  121 
% input  122 
%     s: mxn matrix 123 
% 124 
% output 125 
%     E: vector wavelet variance 126 
 127 
[m,n]=size(s);  % matrix size 128 
dj=0.15;        % scale discretization parameter 129 
k0=8;           % Morlet shape factor 130 
J1=50;          % number of scales 131 
Ff=4*pi/(k0+sqrt(4+k0^2));  % Fourier factor 132 
 133 
  134 
scale = 2*2.^((0:J1)*dj); %define scale array 135 
 136 
f = fft2(s);                % 2-D fast Fourier transform 137 
FS(1,:) = abs(f(:)).^2/m/n; % magnitude square 138 
 139 
pulsx = 2*pi/n*[0:floor((n-1)/2)  floor((1-n)/2):-1]; 140 
pulsy = 2*pi/m*[0:floor((m-1)/2)  floor((1-m)/2):-1]; 141 
 142 
[kx,ky] = meshgrid(pulsx,pulsy); % frequencies space matrices 143 
 144 
k=sqrt(kx.^2+ky.^2); 145 
KS=(k(:)*scale/Ff); 146 
H=exp(-(KS-k0).^2); 147 
E(:,1)=FS*H./sum(H); 148 
 149 
 150 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 151 
%%wavelet co-variance Matlab code s1 and s2 input mxn matrices 152 
[m,n]=size(s1); % matrix size 153 
dj=0.15;        % scale discretization parameter 154 
k0=8;           % Morlet shape factor 155 
J1=50;          % number of scales 156 
Ff=4*pi/(k0+sqrt(4+k0^2));  % Fourier factor 157 
 158 
  159 
scale = 2*2.^((0:J1)*dj); %define scale array 160 
 161 
f1 = fft2(s1);                % 2-D fast Fourier transform 162 
f2 = fft2(s2);                % 2-D fast Fourier transform 163 
FS(1,:) = f1.*conj(f2)/m/n;   % magnitude square 164 
 165 
pulsx = 2*pi/n*[0:floor((n-1)/2)  floor((1-n)/2):-1]; 166 
pulsy = 2*pi/m*[0:floor((m-1)/2)  floor((1-m)/2):-1]; 167 
 168 
[kx,ky] = meshgrid(pulsx,pulsy); % frequencies space matrices 169 
 170 
k=sqrt(kx.^2+ky.^2); 171 
KS=(k(:)*scale/Ff); 172 
H=exp(-(KS-k0).^2); 173 
E(:,1)=FS*H./sum(H); 174 
 175 
 176