# Overview of the code for 'Simple noise estimates and pseudoproxies for the last 21k years'
@[toc]( )
## Preliminaries
This document describes the code for the generation of the pseudoproxies for either a longitude-latitude grid over time or a single grid-point over time.
## Preparation
The code uses a number of packages. Not all of these are strictly necessary for the running of the code but may only assist in visualising the data.
```{r packages,tidy=FALSE,echo=TRUE,eval=FALSE}
rm(list=ls())
RNGversion("3.4.4")
library(ncdf4,quietly=TRUE)
library(zoo,quietly=TRUE)
library(palinsol,quietly=TRUE)
library(maps,quietly=TRUE)
library(caTools,quietly=TRUE)
library(astsa,quietly=TRUE)
library(grDevices,quietly=TRUE)
library(latex2exp,quietly=TRUE)
library(lomb,quietly=TRUE)
library(fields,quietly=TRUE)
library(RColorBrewer,quietly=TRUE)
```
## Model components
| ![Conceptual flow of our procedure.][1] |
| :--: |
|Conceptual flow of our procedure. |
The flowchart above separates the algorithm into five larger chunks. These are the input and the output as well as the three modelling steps of sensor, archive, and observation. In the following the code is explained along the steps in the flowchart.
### Input
The code is prepared to read netcdf-data in a latitude-longitude grid over time. The dimensions of the input data also define the dimensions of intermediate products and output.
```{r input,tidy=FALSE,echo=TRUE,eval=FALSE}
file.path<-"./data/"
file.name<-"input.nc"
file.in<-paste(file.path,file.name,sep="")
nc.file.in<-nc_open(file.in)
length.lat<-nc.file.in$dim$lat$len
length.lon<-nc.file.in$dim$lon$len
val.lat<-nc.file.in$dim$lat$vals
val.lon<-nc.file.in$dim$lon$vals
val.lon[val.lon==360]<-0
input.data<-ncvar_get(nc.file.in)
if (length.lon==1&length.lat==1) {
input.data<-array(input.data,dim=c(1,1,length(input.data)))
}
dim.noise.field<-c(length(input.data[1,1,]),length.lon,length.lat)
```
#### Code Structure
##### Parameters
Some steps of the code need additional information on the time-axis This relates to an inadequate declaration of the time-axis in the input file and assumptions on which resolution is necessary for the initial insolation computations. This is mostly relevant for the insolation computation.
```{r para_input_1,tidy=FALSE,echo=TRUE,eval=FALSE}
seq.time.coarse <- seq(-25050,50,100)
length.time<-length(seq.time.coarse)
time.start <- -21999
seq.time.long<-seq(min(seq.time.coarse),40,1)
seq.time.short<-seq(time.start,40,1)
```
##### Comments on the code
If the input data is a field, we have to consider relevant loops over its spatial dimensions. The following ignores this. Latitudinal and longitudinal indices are set to 1.
```{r help_for_manual,tidy=FALSE,echo=TRUE,eval=FALSE}
i<-1
j<-1
```
In addition, an auxiliary index is defined.
```{r code_input_2,tidy=FALSE,echo=TRUE,eval=FALSE}
ind<-0
```
At the latitude level, the latitude section of the input data is selected for fields of input.
```{r code_input_3,tidy=FALSE,echo=TRUE,eval=FALSE}
i.input.data<-input.data[,i,]
```
At the longitude level, first the auxiliary index is incremented by one and a seed is set. Then, the data for the longitude location is chosen from the latitude section of the data.
```{r code_input_4,tidy=FALSE,echo=TRUE,eval=FALSE}
ind<-ind+1
set.seed((20180530+ind))
if(length.lon==1&length.lat==1){
j.input.data<-i.input.data
}else{
j.input.data<-i.input.data[j,]
}
```
### Sensor model
#### Using latitude data
##### Functions
###### Compute insolation
For simplicities sake, the code first calculates the insolation evolution over all latitudes in the input file. Therefore we have to extract orbital characteristics, which is done using functionality and a code-snippet from the `palinsol` package and its reference manual [Crucifix, 2016].
```{r functions_insol_1,tidy=FALSE,echo=TRUE,eval=FALSE}
func.orbit.ber78 <- function(t) {
c(time=t,eps=ber78(t)[1],ecc=ber78(t)[2],varpi=ber78(t)[3])
}
```
Another function computes insolation sums for a range of months and for a time-period in discrete time-steps. The function requires the sequence of time-steps, the latitude of interest, the start and end calendar months as well as the equivalent days of the year [Crucifix, 2016; Berger, 1978]. The function internally uses a solar constant $S0=1365 Wm^{-2}$.
The function returns an array of insolation sums with the dimensions time, latitude, and, z where currently z equals 13 for the 12 calendar months and the annual average.
```{r functions_insol_2,tidy=FALSE,echo=TRUE,eval=FALSE}
func.insolation.sum <- function(seq.time.coarse,mon1.for.insol,mon2.for.insol,
d1.for.insol.b,d2.for.inso.b,val.lat) {
insolation.sum<-array(NA,dim=c(length.time,length(pi.lat),13))
length.time<-length(seq.time.coarse)
data.orbit.ber78 <- data.frame(t(sapply(seq.time.coarse,
func.orbit.ber78)))
S0=1365
pi.lat<-val.lat*pi/180
for( i in length.time:1) {
help.print.i<-0
if(help.print.i!=0){
print(i)
}
orbit<-c(eps=data.orbit.ber78[i,2],ecc=data.orbit.ber78[i,3],
varpi=data.orbit.ber78[i,4])
for ( j in mon1.for.insol:mon2.for.insol) {
for( k in 1:length(pi.lat)){
insolation.sum[i,k,j]<-Insol_d1d2(orbit,d1.for.insol.b[j],
d2.for.insol.b[j],lat=pi.lat[k],S0=S0,
avg=d2.for.insol.b[j]-d1.for.insol.b[j])
}
}
for( k in 1:length(pi.lat)){
insolation.sum[i,k,13]<-Insol_d1d2(orbit,d1.for.insol.b[1],
d2.for.insol.b[12],lat=pi.lat[k],S0=1365,
avg=d2.for.insol.b[12]-d1.for.insol.b[1])
}
}
return(insolation.sum=insolation.sum)
}
```
###### Compute bias
In the next step a couple of functions translate the insolation data into the bias term.
The first function scales the insolation data for further usage.
```{r functions_bias_1,tidy=FALSE,echo=TRUE,eval=FALSE}
func.insolbias.1<- function(series.sum.insol.0,p=0.25) {
if(sd(series.sum.insol.0)!=0){
# anomaly
insol.help<-series.sum.insol.0-mean(series.sum.insol.0)
#normalisation and shift
series.sum.insol.1.b<-(insol.help)/sd(insol.help)+1
# arbitrary scaling
help.i<-quantile(series.sum.insol.1.b,p=p)/
max(series.sum.insol.1.b)
series.sum.insol.1.c<-series.sum.insol.1.b*help.i
# shifting to series.sum.insol=1 for tim=0BP
series.sum.insol.1<-series.sum.insol.1.c-
series.sum.insol.1.c[seq.time.long==0]+1
}else{
series.sum.insol.1<-series.sum.insol.0+1
}
long.series.sum.insol.1<-series.sum.insol.1
return(list(long.series.sum.insol.1=long.series.sum.insol.1))
}
```
A second function simply considers whether larger insolation relates to a positive bias, a negative bias, or whether this is randomized.
```{r functions_bias_2,tidy=FALSE,echo=TRUE,eval=FALSE}
func.insolbias.2<- function(series.sum.insol.1,switch.orient.bias.seas) {
if (switch.orient.bias.seas==1){
series.sum.insol<-series.sum.insol.1**(sample(c(-1,1),1))
}else if(switch.orient.bias.seas==0){
series.sum.insol<-series.sum.insol.1
}else if(switch.orient.bias.seas==(-1)){
series.sum.insol<-series.sum.insol.1**(-1)
}
}
```
##### Code structure
###### Parameters
Parameters for the bias computation influence the months for which the insolation and the bias are computed, and the amplitude of the final bias. Periods for insolation and bias computation could differ but are the same in the current version. Finally, a switch decides whether warmer climates relate to a negative or a positive bias.
```{r para_bias_1,tidy=FALSE,echo=TRUE,eval=FALSE}
sel.mon1.for.insol<-1#6
sel.mon2.for.insol<-12#8
amp.bias.seas<-4
switch.orient.bias.seas<-0
```
Further declarations relate months to days of the year, provide latitudes in radians, repeat the declaration of the months of interest, and prepare the insolation array.
```{r para_bias_2,tidy=FALSE,echo=TRUE,eval=FALSE}
d1.for.insol<-c(1,32,60,91,121,152,182,213,244,274,305,335)-0.5
d2.for.insol<-c(d1.for.insol[2:12]-0.5,365)-0.5
d1.for.insol.b<-d1.for.insol*360/365
d2.for.insol.b<-d2.for.insol*360/365
pi.lat <- val.lat*pi/180
mon1.for.insol<-1 #6
mon2.for.insol<-12 #8
insolation.sum<-array(NA,dim=c(length.time,length(pi.lat),13))
```
###### Code
The code uses the insolation function over all latitudes of interest and then sums over a selection of months.
```{r code_bias_1,tidy=FALSE,echo=TRUE,eval=FALSE}
insolation.sum<-func.insolation.sum(seq.time.coarse,mon1.for.insol,
mon2.for.insol,d1.for.insol.b,
d2.for.inso.b,val.lat)
sel.insolation.sum<-rowSums(insolation.sum[,1,sel.mon1.for.insol:sel.mon2.for.insol])
```
Part of the bias calculation are done in an outer loop over the latitude dimension. This requires some additional calculations.
```{r code_bias_2,tidy=FALSE,echo=TRUE,eval=FALSE}
i.val.lat<-(pi.lat[i])*180/pi
i.pi.lat<-c(1:length(pi.lat))[pi.lat*180/pi==i.val.lat]
ind.time.0<-c(1:length(seq.time.coarse))[seq.time.coarse==0]
```
The discrete insolation data is interpolated to annual values.
```{r code_bias_3,tidy=FALSE,echo=TRUE,eval=FALSE}
series.sum.insol.0<-
approx(seq.time.coarse,rowSums(insolation.sum[,i.pi.lat,
sel.mon1.for.insol:sel.mon2.for.insol]),
xout=seq.time.long)$y
## not implemented: other seasons
# series.sum.insol.jfd<-
# series.sum.insol.mam<-
# series.sum.insol.son<-
```
Then the first bias-function is applied.
```{r code_bias_4,tidy=FALSE,echo=TRUE,eval=FALSE}
series.sum.insol.1<-func.insolbias.1(series.sum.insol.0,
p=0.25)$l[seq.time.long>=time.start]
```
The second bias function is applied at the longitude level together with a final modification of the insolation bias. The final modification is due to the choice of an additive bias instead of, for example, a multiplicative one.
```{r code_bias_5,tidy=FALSE,echo=TRUE,eval=FALSE}
series.sum.insol <- func.insolbias.2(series.sum.insol.1,
switch.orient.bias.seas)-1
```
The bias is added to the input data in the same step as the initital noise (therefore the following code-chunk is included twice). In this step the bias is also scaled by an ad-hoc factor. Please compare the manuscript for a comparison of a few different scalings.
```{r code_bias_6,tidy=FALSE,echo=TRUE,eval=FALSE}
help.bias.noise.data.save<-(j.input.data+
help.noise.save)+
amp.bias.seas*(series.sum.insol)
```
#### Using temperature data
##### Functions
###### Compute initial noise
In addition to the insolation bias, the sensor model adds the initial noise term reflecting environmental influences and disturbances relative to the signal of interest. The noise depends on the background climate variability in terms of the running standard deviation of the signal record of interest.
Whether more variable background climate relates to larger or smaller noise amplitudes can be randomized or decided via a switch. Afterwards, the moving standard deviation is used as standardard deviation for the innovations in the simulation of an AR-process whose parameters have to be specified.
The following function returns the moving standard deviation and the noise record.
```{r function_sensor_1,tidy=FALSE,echo=TRUE,eval=FALSE}
func.initialnoise<-function(j.input.data,length.window.runsd,amp.noise.env,
model.noise.1,switch.orient.runsd.noise.env) {
runsd.noise.env<-runsd(j.input.data,
k=length.window.runsd,align='center',endrule='constant')
if (switch.orient.runsd.noise.env==1){
runsd.noise.env <- runsd.noise.env**(sample(c(-1,1),1))
}else if (switch.orient.runsd.noise.env==(-1)){
runsd.noise.env <- runsd.noise.env**(-1)
}
help.noise.save<- amp.noise.env*
#runsd.noise.env*rnorm(length(j.input.data),0,1)
c(arima.sim(length(j.input.data),
model=list(ar=model.noise.1),sd=runsd.noise.env))
return(list(help.noise.save=help.noise.save,runsd.noise.env=runsd.noise.env))
}
```
##### Code structure
###### Parameters
Parameters for the initial environmental noise are the amplitude of the noise, a switch for whether larger climate variability results in less or more noise, the noise model, and the length of the period over which the background variability is calculated.
```{r para_noise_1,tidy=FALSE,echo=TRUE,eval=FALSE}
amp.noise.env <- 0.5
switch.orient.runsd.noise.env<-0
model.noise.1 <- c(0.3)#c()
#sd.noise.1 <-
length.window.runsd<-1000
```
While the generalized approach is conceputally identical to the smoothing plus AR approach, part of the parameter declarations are distinct.
```{r para_noise_gen,tidy=FALSE,echo=TRUE,eval=FALSE}
model.gen.noise <-c(0.7)
```
##### Code
At the initial noise step, the function for the noise is applied. For convenience sake, the function provides not only the final noise series but also the running standard deviation of the climate record. The initial noise is saved.
```{r code_noise_1,tidy=FALSE,echo=TRUE,eval=FALSE}
help.init.noise.f<-func.initialnoise(j.input.data,length.window.runsd,
amp.noise.env,model.noise.1,
switch.orient.runsd.noise.env)
help.noise.save<-help.init.noise.f$h
runsd.noise.env<-help.init.noise.f$r
noise.save[,j,i]<-help.noise.save
```
The noise term is added to the input data together with the scaled bias (see above). The record, in turn, is saved.
```{r code_noise_2,tidy=FALSE,echo=TRUE,eval=FALSE}
help.bias.noise.data.save<-(j.input.data+
help.noise.save)+
amp.bias.seas*(series.sum.insol)
bias.noise.data.save[,j,i]<-help.bias.noise.data.save
```
The generalised initial noise uses the same functions and uses the same bias-series as computed above.
```{r code_noise_gen,tidy=FALSE,echo=TRUE,eval=FALSE}
gen.noise.env.f<-func.initialnoise(j.input.data,length.window.runsd,
amp.noise.env,model.gen.noise,
switch.orient.runsd.noise.env)
help.gen.noise.env<-gen.noise.env.f$h
gen.runsd.noise.env<- gen.noise.env.f$r
gen.noise.env[,j,i] <- help.gen.noise.env
noise.gen.dat[,j,i]<-j.input.data+help.gen.noise.env
help.bias.noise.gen.dat<-j.input.data+help.gen.noise.env+
amp.bias.seas*series.sum.insol
bias.noise.gen.dat[,j,i] <- help.bias.noise.gen.dat
```
### Archive model
#### Functions
##### Compute random smoothing lenghts
We interpret the archive as a filtering of information. The manuscript describes our various approaches to this filtering.
For the cases, when we use random filter lengths, we have first to provide a record of random filter lengths. There are various options for this dependent on two switches. `switch.smoothing=0` is the Gaussian option from the initial submission. `switch.smoothing` from 1 to 4 reflects different procedures based on AR-processes. For `switch.smoothing` from 2 to 4, the process depends on the running mean background climate and `switch.sm.2` decides whether warmer climate results in more (`switch.sm.2=0`) or less smoothing (`switch.sm.2=1`). The current applications only use `switch.smoothing=3` for which the mean smoothing depends on the mean climate and the random variations occur around this time-varying mean. Please see the code for details of the other options.
The random lengths are checked whether they are below a defined minimum and if they are, a correction is applied. The record of random lengths serves also as input for the subsampling procedure and can be used in the calculation of the sampling of the record.
```{r function_archive_1,tidy=FALSE,echo=TRUE,eval=FALSE}
func.rand.length.smooth <- function(help.bias.noise.data.save,
rand.mean.length.smooth,rand.sd.length.smooth,
rand.length.smooth.mean.1,model.smooth.1,
sd.model.smooth.1,switch.smoothing,
switch.sm.2,runsd.noise.env,runmean.env,
ind,min.rand.length.smooth,
scale.sm,model.clim.smooth.1)
{
if (switch.smoothing==0) {
set.seed((20180528+ind+202))
rand.length.smooth<- rnorm(length(help.bias.noise.data.save),
rand.mean.length.smooth,rand.sd.length.smooth)
} else if (switch.smoothing==1) {
rand.length.smooth<-rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),model=list(ar=model.smooth.1),
sd=sd.model.smooth.1)
} else if (switch.smoothing==2) {
if (switch.sm.2==0){
sd.help<-(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),
sd=sd.help*sd.model.smooth.1)
}else{
sd.help<-(-1)*(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),
sd=sd.help*sd.model.smooth.1)
}
} else if (switch.smoothing==3) {
if (switch.sm.2==0){
sd.help<-(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-sd.help*rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),sd=sd.model.smooth.1)
}else{
sd.help<-(-1)*(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-sd.help*rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),
sd=sd.model.smooth.1)
}
} else if (switch.smoothing==4) {
if (switch.sm.2==0){
sd.help<-(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-sd.help*rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),
sd=sd.help*sd.model.smooth.1)
}else{
sd.help<-(-1)*(runmean.env-mean(runmean.env))/sd(runmean.env)*scale.sm+1
rand.length.smooth<-sd.help*rand.length.smooth.mean.1+
arima.sim(length(help.bias.noise.data.save),
model=list(ar=model.clim.smooth.1),
sd=sd.help*sd.model.smooth.1)
}
} else {
print(paste('switch.smoothing = ',switch.smoothing,' not implemented',
sep=" "))
print('I assume and use switch.smoothing = 0')
set.seed((20180528+ind+202))
rand.length.smooth<- rnorm(length(help.bias.noise.data.save),
rand.mean.length.smooth,rand.sd.length.smooth)
}
help.min<-min(rand.length.smooth)
if(help.min<0){
rand.length.smooth<-rand.length.smooth+
abs(help.min)+min.rand.length.smooth
}else if(help.min<min.rand.length.smooth){
rand.length.smooth<-rand.length.smooth+
abs(help.min)
}else{
rand.length.smooth<-rand.length.smooth
}
return(rand.length.smooth=rand.length.smooth)
}
```
##### Smoothing
For the randomized filtering, the record of smoothing lengths enters another function, which provides the smoothing. This function first extends the simulated record and then loops over the filter lengths. In the loop it identifies the range of influence on a certain date, and then provides the averaged value for this date.
```{r function_archive_2,tidy=FALSE,echo=TRUE,eval=FALSE}
func.smooth.1 <- function(rand.length.smooth,min.rand.length.smooth,
help.bias.noise.data.save) {
# maximum of smoothing
max.rand.length.smooth<-ceiling(max(rand.length.smooth))
# extend the vector of filter lengths
ext.length.smooth<-c(rand.length.smooth,
rep(rand.length.smooth[length(rand.length.smooth)],
max.rand.length.smooth))
ext.length.smooth<-ceiling(ext.length.smooth)
ext.2.length.smooth <- c(rep(ext.length.smooth[1],
max.rand.length.smooth),ext.length.smooth)
# extend the temperature vector
v2j.input.data<-
c(help.bias.noise.data.save[1:max.rand.length.smooth],
help.bias.noise.data.save,
help.bias.noise.data.save[
(length(help.bias.noise.data.save)-
max.rand.length.smooth+1):
length(help.bias.noise.data.save)])
# get anomalies
v3j.input.data<-v2j.input.data-mean(help.bias.noise.data.save)
# auxiliary vectors
help.smooth.a<-
help.smooth.c<-
help.smooth<-
v2j.input.data
help.smooth.c[]<-NA
help.smooth.a[]<-0
help.smooth[]<-NA
## ###############################
## loop over length of extended input data
## ###############################
for(tt in (1):(length(v2j.input.data))) {
# maximum times of influence
time.range.influence.smooth <-
(tt):(tt+max.rand.length.smooth-1)
# filtering for these times
times.smooth <-
ext.2.length.smooth[time.range.influence.smooth]
# years of last influence of filters
last.time.smooth <-
time.range.influence.smooth-times.smooth+1
# years where data influences index-time
time.range.influenced <-
time.range.influence.smooth[
tt>=last.time.smooth&tt<=
time.range.influence.smooth]
# filters there
times.smoothed <- ext.2.length.smooth[time.range.influenced]
# filter
values.smooth <- 1/times.smoothed
#filtered data
data.smoothed <- 1/times.smoothed*
v3j.input.data[time.range.influenced]
# Version1 sum of filtered data for index time
help.smooth[tt] <- sum(data.smoothed)
# different filtering, not saved
if(tt>=(max.rand.length.smooth+1)){
diff.ext.length.smooth<-
ceiling(ext.length.smooth[tt-max.rand.length.smooth])
diff.values.smooth<-
rep(1/diff.ext.length.smooth,diff.ext.length.smooth)
help.smooth.c[(tt)]<-
mean(v3j.input.data[(tt-diff.ext.length.smooth+1):tt],
na.rm=T)
}
}
help.b.smooth <- help.smooth[(0.5*max.rand.length.smooth):
(length(help.smooth)-1.5*max.rand.length.smooth-1)]
# only relevant times
help.b.c.smooth<-help.smooth.c[(max.rand.length.smooth+1):
(length(help.smooth.c)-max.rand.length.smooth)]
# only relevant times
help.smooth.save<-help.b.smooth+
mean(help.bias.noise.data.save) # add mean
return(list(help.smooth.save=help.smooth.save,v3j.input.data=v3j.input.data))
}
````
Other archiving procedures are not implemented as functions but simply in the script.
#### Code structure
##### Parameters
The random smoothing allows for a number of different options. `switch.smoothing` allows to shift between these. A number of the options allow also to decide on how the smoothing depends on the climate (via `switch.sm.2`). A scaling parameter `scale.sm` defines the amplitude of the changes in relation to the climate.
```{r para_archive_1,tidy=FALSE,echo=TRUE,eval=FALSE}
## ###############################
## Switch for variety of smoothing approaches
## switch.smoothing = 0 # as in discussion paper
## switch.smoothing = 1 # adoptable noise model
## switch.smoothing = 2 # function of climate var
## switch.smoothing = 3 # function of climate mean
## switch.smoothing = 4 # function of climate mean, version 2
## ###############################
switch.smoothing <- 3
switch.sm.2 <- 1 # 0/1 for dependence of smooth on climate if other switch
scale.sm<-1./10.
```
The random and the uniform smoothing depend on a number of parameteres. These define the base of the computations, their variability, the AR-models used in the various approaches, and the allowed minimum for the smoothing lengths. Base and variability are replaced by a fixed length for the uniform filtering. The manuscript and the executable script use two different sets of these parameters but only one is given here.
```{r para_archive_2,tidy=FALSE,echo=TRUE,eval=FALSE}
rand.mean.length.smooth <- 350
rand.sd.length.smooth <- 75
rand.length.smooth.mean.1 <- 500
model.smooth.1 <- c(0.99)
sd.model.smooth.1 <- 10
model.clim.smooth.1<-c(0.9)
fix.length.smooth <- 501
min.rand.length.smooth <- 40
coeff.ar.smooth <- 0.999
sd.ar.smooth <- 0.01
```
While the generalized approach is conceputally identical to the smoothing plus AR approach, the parameter declarations are distinct.
```{r para_archive_gen,tidy=FALSE,echo=TRUE,eval=FALSE}
length.filter.uniform <- 501
help.len.filter.uni <- (length.filter.uniform-1)/2
coeff.gen.ar.smooth<-0.999
sd.gen.ar.smooth<-0.01
```
#### Code
##### Random smoothing archiving
Some versions of the random smoothing length require the running mean of the background climate.
```{r code_archive_1, echo=TRUE,eval=FALSE}
runmean.env <-runmean(j.input.data,k=1000)
```
After this initial step, the random smoothing length function is applied. The output is later also needed for the subsampling approach.
```{r code_archive_2, echo=TRUE,eval=FALSE}
rand.length.smooth<-func.rand.length.smooth(help.bias.noise.data.save,
rand.mean.length.smooth,
rand.sd.length.smooth,
rand.length.smooth.mean.1,
model.smooth.1,sd.model.smooth.1,
switch.smoothing,switch.sm.2,
runsd.noise.env,runmean.env,
ind,min.rand.length.smooth,
scale.sm,model.clim.smooth.1)
```
The random smoothing archiving is calculated.
```{r code_archive_3, echo=TRUE,eval=FALSE}
help.smooth.save<-func.smooth.1(rand.length.smooth,min.rand.length.smooth,
help.bias.noise.data.save)
v3j.input.data<-help.smooth.save$v3
help.smooth.save<-help.smooth.save$hel
```
This is saved in its field. Note, that the output includes two versions of the random smoothing lengths. Please see the list of output fields for the names of the output.
```{r code_archive_4, echo=TRUE,eval=FALSE}
smooth.save[,j,i]<-help.smooth.save
```
##### Uniform smoothing plus AR-process archiving
The approach uses auxiliary records. For convenience sake, their declaration relies on the maximum of the random smoothing lenghts.
The input data is filtered and the AR-process is added. Then the data is put into the previously declared field.
Again the manuscript and the executable script consider two different approaches but we include only the code for one.
```{r code_archive_6, echo=TRUE,eval=FALSE}
max.rand.length.smooth<-ceiling(max(rand.length.smooth))
help.diff.values.smooth<-
filter(v3j.input.data+mean(help.bias.noise.data.save),
"conv",filter=rep(1/fix.length.smooth,fix.length.smooth))
help.diff.values.smooth<-
help.diff.values.smooth[(max.rand.length.smooth+1):(length(j.input.data)+
max.rand.length.smooth)]
ar.smooth<-arima.sim(length(j.input.data),model=list(ar=c(coeff.ar.smooth)),
sd=sd.ar.smooth)
ar.smooth.save[,j,i]<-help.diff.values.smooth+ar.smooth
````
The generalised approach to the archive is conceptually identical to the smoothing plus AR approach. The following code shows the smoothing and which components are saved.
```{r code_archive_gen, echo=TRUE,eval=FALSE}
help.smooth.j.input.data<-
filter(c(j.input.data[1:help.len.filter.uni],j.input.data,
j.input.data[(length(j.input.data)-(help.len.filter.uni-
1)):length(j.input.data)])
,"conv",filter=rep(1/length.filter.uniform,length.filter.uniform))
smooth.gen.dat[,j,i]<-help.smooth.j.input.data[
(help.len.filter.uni+1):(length(j.input.data)+help.len.filter.uni)]
help.smooth.bias.noise.gen.dat<-
filter(c(help.bias.noise.gen.dat[1:help.len.filter.uni],
help.bias.noise.gen.dat,
help.bias.noise.gen.dat[
(length(help.bias.noise.gen.dat)-(help.len.filter.uni-1)):
length(help.bias.noise.gen.dat)])
,"conv",filter=rep(1/length.filter.uniform,length.filter.uniform))
smooth.bias.noise.gen.dat[,j,i]<-
help.smooth.bias.noise.gen.dat[
(help.len.filter.uni+1):(length(help.bias.noise.gen.dat)+
help.len.filter.uni)]
ar.2.smooth<-arima.sim(length(j.input.data),
model=list(ar=c(coeff.gen.ar.smooth)),
sd=sd.gen.ar.smooth)
ar.gen.dat[,j,i] <- ar.2.smooth
ar.smooth.bias.noise.gen.dat[,j,i]<-
help.smooth.bias.noise.gen.dat[
(help.len.filter.uni+1):(length(help.bias.noise.gen.dat)+
help.len.filter.uni)]+
ar.2.smooth
```
### Observation model
#### Functions
The observation model mainly considers the sampling, the effective dating uncertainty error, and the measurement noise.
However, for the subsampling approach, the first step is the selective observation procedure.
##### Subsampled observations
The subsampling function also relies on the random smoothing lengths. It samples for each date from the time period of influence a selected number of years and then adds a potentially correlated noise to the average of these picks.
```{r function_obs_1,tidy=FALSE,echo=TRUE,eval=FALSE}
func.subsample <- function(ext.length.smooth,help.noise.save,j.input.data,
subsampled.data,model.noise.pick,sd.noise.pick,
n.samp.pick,amp.bias.seas,series.sum.insol)
{
subsampled.data<-rep(NA,length(j.input.data))
for(tt in (1):(length(j.input.data))) {
diff.ext.length.smooth<-ceiling(ext.length.smooth[tt])
if(tt<diff.ext.length.smooth){
sample.pick <- runif(n.samp.pick,min=1,max=tt)
subsampled.data[tt]<- mean(j.input.data[sample.pick]+
help.noise.save[sample.pick]
# +amp.bias.seas*(series.sum.insol[sample.pick])
+c(arima.sim(n.samp.pick,
model=list(ar=model.noise.pick),
sd=sd.noise.pick)),na.rm=T)
}else{
sample.pick <- runif(n.samp.pick,min=tt-
diff.ext.length.smooth+1,max=tt)
subsampled.data[tt]<- mean(j.input.data[sample.pick]+
help.noise.save[sample.pick]
# +amp.bias.seas*(series.sum.insol[sample.pick])
+c(arima.sim(n.samp.pick,
model=list(ar=model.noise.pick),
sd=sd.noise.pick)),na.rm=T)
}
}
return(subsample.data=subsampled.data)
}
```
##### Get the sampling dates
There are a number of versions to randomly select the dates for the sampled records, which are mainly controlled by `switch.sampling`. Generally the randomly selected dates are the dates at which we later sample the records. However, it would also be possible to sample at related dates and then assign the wrong dates. In practice in the current version, we indeed perfectly know and date the sampling dates.
The sampling from the initial submission is provided by `switch.sampling=0`.
For `switch.sampling=1`, we have to distinguish whether the smoothing lengths were uniform or random and thus different for each date. Both implementations try to mimic the equal sampling of an obtained archive where the underlying processes led to an unequal accumulation, which is more problematic for the case of uniform smoothing. Please refer to the code for details.
```{r function_obs_2,tidy=FALSE,echo=TRUE,eval=FALSE}
func.get.date.sampling <- function(switch.sampling,n.samples,j.input.data,
rand.length.smooth,date.samples)
{
help.len<-length(j.input.data)
if (switch.sampling==0) {
# sample base series
date.sample.base<-seq(1,help.len,40)
# sorted list of n.samples samples
date.samples<-sort(sample(date.sample.base,n.samples,replace=FALSE))
# intermediate samples in this base series
date.samples.b<-sample(0:39,n.samples,replace=T)
# final sample dates
date.samples<-date.samples+date.samples.b
date.samples.a<-date.samples
}else if (switch.sampling==1&sd(rand.length.smooth)>0){
sample.base.pool<-c((help.len-200):help.len)
sample.base.poolb<-c(1:1000)
date.start <- sample(sample.base.pool,1L,FALSE,NULL)
date.end <- sample(sample.base.poolb,1L,FALSE,NULL)
help.interval<-(date.start-date.end)/(n.samples-1)
date.samples.a<-rev(seq(date.start,date.end,-help.interval))
help.date<-(rand.length.smooth/mean(rand.length.smooth))[date.samples.a]*
help.interval
help.first<-sample((0:help.interval/3),1)
help.sum<-cumsum(c(help.first,rev(help.date)))
if(max(help.sum)>=date.start){
print(paste("here",max(help.sum),date.start,help.first,help.interval))
help.sum<-help.sum/max(help.sum)*(date.start-1)
}
date.samples.b<-rev(date.start-help.sum)
date.samples<-round(date.samples.b[2:201],0)
date.samples.a<-round(date.samples.a,0)
}else if (switch.sampling==1&sd(rand.length.smooth)==0){
sample.base.pool<-c((help.len-200):help.len)
sample.base.poolb<-c(1:1000)
date.start <- sample(sample.base.pool,1L,FALSE,NULL)
date.end <- sample(sample.base.poolb,1L,FALSE,NULL)
help.interval<-(date.start-date.end)/(n.samples-1)
# sample base series
date.sample.base<-seq(date.end,date.start,help.interval)
help.b.date<-floor(help.interval/2)
date.samples.b<-sample((-help.b.date):(help.b.date),n.samples,replace=T)
date.samples.b[1]<-abs(date.samples.b[1])
date.samples.b[n.samples]<-abs(date.samples.b[1])*(-1)
# final sample dates
date.samples<-date.sample.base+date.samples.b
date.samples<-sort(round(date.samples,0))
date.samples.a<-date.samples
}else if(switch.sampling>1){
print("switch.sampling >1 not implemented")
print("assume switch.sampling=0")
# sample base series
date.sample.base<-seq(1,length(j.input.data),40)
# sorted list of n.samples samples
date.samples<-sort(sample(date.sample.base,n.samples,replace=FALSE))
# intermediate samples in this base series
date.samples.b<-sample(0:39,n.samples,replace=T)
# final sample dates
date.samples<-date.samples+date.samples.b
date.samples.a<-date.samples
}
return(list(date.samples=date.samples,date.samples.real=date.samples.a))
}
```
##### Random dating uncertainties
We randomly assign uncertainties to our sample dates. Again, two options are available via `switch.sampling.unc`. `switch.sampling.unc=0` reflects the version from the initital submission. `switch.sampling.unc>0` uses the smoothing lengths to provide an dating uncertainty if they are not uniform. If they are uniform, a random component is added to a constant uncertainty derived from the uniform filter length.
```{r function_obs_3,tidy=FALSE,echo=TRUE,eval=FALSE}
func.get.dating.uncertainty<- function(n.samples,mean.date.unc,sd.date.unc,
switch.sampling.unc,rand.length.smooth,
date.samples,date.samples.r)
{
if(switch.sampling.unc==0){
# random dating uncertainty standard deviations
sigma.unc.date<-rnorm(n.samples,mean.date.unc,sd.date.unc)
# Ensure that larger than some ad hoc limit
sigma.unc.date<-sign(sigma.unc.date)*ceiling(abs(sigma.unc.date))
sigma.unc.date[sigma.unc.date<=6]<-
abs(sigma.unc.date[sigma.unc.date<=6])+5
# save the uncertainty
return(list(sigma.unc.date=sigma.unc.date))
}else if (switch.sampling.unc>0&sd(rand.length.smooth)!=0){
sigma.unc.date.u<-rand.length.smooth[date.samples]/2
sigma.unc.date.r<-rand.length.smooth[date.samples.r]/2
return(list(sigma.unc.date=sigma.unc.date.r,sigma.unc.date.u=sigma.unc.date.u))
}else if (switch.sampling.unc>0&sd(rand.length.smooth)==0){
sigma.unc.date.help<-abs(rnorm(n.samples,0,sd.date.unc))
sigma.unc.date.u<-rand.length.smooth[date.samples]/2+sigma.unc.date.help
sigma.unc.date.help<-rnorm(n.samples,0,sd.date.unc)
sigma.unc.date.r<-rand.length.smooth[date.samples.r]/2+sigma.unc.date.help
return(list(sigma.unc.date=sigma.unc.date.r,sigma.unc.date.u=sigma.unc.date.u))
}
}
```
##### Get initital effective dating error
The most simple version of the effective dating error samples from a distribution around the mean over a time-frame of influence.
```{r function_obs_4,tidy=FALSE,echo=TRUE,eval=FALSE}
func.date.unc.err <- function(datn,n.samples,date.samples,sigma.unc.date) {
dun3 <- rep(NA,n.samples)
for( nn in 1:n.samples) {
tt <- date.samples[nn]
sig1 <- 2*sigma.unc.date[nn]
max1 <- max(c((tt-sig1),1),na.rm=T)
min1 <- min((tt+sig1),length(datn),na.rm=T)
dun1 <- mean(datn[max1:min1],na.rm=T)-datn[tt]
dun2 <- sd(datn[max1:min1]-dun1, na.rm=T)
dun3[nn] <- rnorm(1,dun1,dun2)
}
return(dun3=dun3)
}
```
##### Final effective dating error
The final effective dating error depends on a number of switches, which mainly decide on the strength of the relation between the error for two subsequent dates.
```{r function_obs_5,tidy=FALSE,echo=TRUE,eval=FALSE}
func.unc.err<-function(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date,n.samples,
date.samples,switch.weak.cor.date.unc,help.data)
{
if (switch.cor.date.unc!=0){
## Re-Define Cor.Date.Unc
if( switch.delta.cor.date.unc!=0){
cor.date.unc<-pmax(0,1-(diff(date.samples)/(2*sigma.unc.date[2:(n.samples)])))
cor.date.unc.2<-0
help.1<-date.samples[3:n.samples]-date.samples[1:(n.samples-2)]
cor.date.unc.2<-pmax(0,1-help.1/(2*sigma.unc.date[3:(n.samples)]))
cor.date.unc.2<-c(cor.date.unc.2,0)
}
else if(switch.delta.cor.date.unc==0&switch.weak.cor.date.unc!=0){
cor.date.unc<-rep(cor.date.unc,n.samples)
}
else if(switch.delta.cor.date.unc==0&switch.weak.cor.date.unc==0){
print('the combination of switch.delta.cor.date.unc=0 &
switch.weak.cor.date.unc=0 is not implemented')
print('assume new constant rho_new=1/3*rho_old and continue')
cor.date.unc<-rep(1/3*cor.date.unc,n.samples)
}
date.unc.err.1.c<-date.unc.err
date.unc.err.d<-help.data[date.samples]
## Re-Define date.unc.err
if(switch.weak.cor.date.unc!=0&switch.cor.length!=2){
for( mm in (n.samples-1):1) {
date.unc.err.1.c[mm]<-cor.date.unc[mm]*
(date.unc.err[mm+1]+
(date.unc.err.d[mm+1]-date.unc.err.d[mm]))+
date.unc.err[mm]
}
}else if(switch.weak.cor.date.unc!=0&switch.cor.length==2){
for( mm in (n.samples-2):1) {
date.unc.err.1.c[mm]<-cor.date.unc[mm]*
(date.unc.err[mm+1]+
(date.unc.err.d[mm+1]-date.unc.err.d[mm]))+
cor.date.unc.2[mm]*
(date.unc.err[mm+2]+
(date.unc.err.d[mm+2]-date.unc.err.d[mm]))+
date.unc.err[mm]
}
date.unc.err.1.c[n.samples-1]<-cor.date.unc[n.samples-1]*
(date.unc.err[n.samples-1+1]+
(date.unc.err.d[n.samples-1+1]-date.unc.err.d[n.samples-1]))+
date.unc.err[n.samples-1]
}else if(switch.weak.cor.date.unc==0){
for( mm in (n.samples-1):1) {
date.unc.err.1.c[mm]<-cor.date.unc*
(date.unc.err.1.c[mm+1]+
(date.unc.err.d[mm+1]-date.unc.err.d[mm]))+
date.unc.err[mm]
}
}
date.unc.err<-date.unc.err.1.c
}
return(date.unc.err=date.unc.err)
}
```
##### Measurement noise
In a final step, we add measurement noise. However, since we also provide data for the full records plus measurement noise this step may also precede the sampling procedure above. The measurement noise is simply a potentially serially correlated noise based on AR-process modelling.
```{r function_obs_n,tidy=FALSE,echo=TRUE,eval=FALSE}
func.noise.meas<-function(sigma.noise.meas,model.noise.meas,help.data)
{
noise.meas<-
c(arima.sim(length(help.data),
model=list(ar=model.noise.meas),
sd=sigma.noise.meas))
return(noise.meas)
}
```
#### Code structure
##### Parameters
The subsampled proxies require to specify the number of samples, the noise standard deviation, and a noise model.
```{r para_obs_1,tidy=FALSE,echo=TRUE,eval=FALSE}
n.samp.pick <- 30
sd.noise.pick <- 0.5
model.noise.pick <- c()
```
We have to specify the number of samples for the final pseudoproxies. The sampling further depends on a switch as described for the function.
```{r para_obs_2,tidy=FALSE,echo=TRUE,eval=FALSE}
n.samples<-200
switch.sampling <- 1
```
Similarly, the dating uncertainty estimates depend on a switch as described for the function. For the procedure from the initial submission, parameters define mean and standard deviation of the random dating uncertainty.
```{r para_obs_3,tidy=FALSE,echo=TRUE,eval=FALSE}
switch.sampling.unc<- 1
mean.date.unc <-350
sd.date.unc <- 100
```
We include various options to estimate the effective dating uncertainty error. First, the question is, whether errors should be correlated. Secondly, we can decide whether the correlation should be uniform or not. If the correlation is meant to be uniform, we can modify the strength of the error. Please see the function for details. Another parameter influences whether correlations should only be between adjacent samples or extend to two samples.
There is another parameter for the initial strength of the uniform correlation.
```{r para_obs_4,tidy=FALSE,echo=TRUE,eval=FALSE}
## ###############################
## Input for dating uncertainty
## # switch.cor.date.unc: 0/1 switch for correlation in dating uncertainty
## # switch.weak.cor.date.unc: 0/1 switch for very weak correlation
## # switch.delta.cor.date.unc: 0/1 switch for dependent correlation
## ###############################
switch.cor.date.unc <- 1
switch.weak.cor.date.unc <- 1
switch.delta.cor.date.unc <- 1
switch.cor.length<-1
cor.date.unc <- 0.9
```
For the measurement noise, it is necessary to declare the 95\% limit of the noise and the noise AR-model.
```{r para_obs_5,tidy=FALSE,echo=TRUE,eval=FALSE}
lim.noise.meas <- 1.5
model.noise.meas <- c()
```
The subsampled pseudoproxies can use a different limit and model, but for the moment it is the same as for the other approaches.
```{r para_obs_6,tidy=FALSE,echo=TRUE,eval=FALSE}
model.seas.pick.noise.meas<-c()
lim.seas.pick.noise.meas<-1.5
```
##### Code
The first part in the observation model is the subsampled pseudoproxy. The subsampling relies on information from the random smoothing and then the subsample-function is applied.
```{r code_obs_1,tidy=FALSE,echo=TRUE,eval=FALSE}
ext.length.smooth<-c(rand.length.smooth,
rep(rand.length.smooth[length(rand.length.smooth)],
max.rand.length.smooth))
subsampled.data<-func.subsample(ext.length.smooth,help.noise.save,
j.input.data,subsampled.data,
model.noise.pick,sd.noise.pick,n.samp.pick)
```
This is saved.
```{r code_obs_2,tidy=FALSE,echo=TRUE,eval=FALSE}
subsampled.save[,j,i]<-subsampled.data
```
While we apply the measurement noise later, we calculate it now because we also provide pseudoproxies plus measurement noise, which are not sampled and do not include the effective dating uncertainty error.
First we identify the standard deviation of the desired noise process, then we use the measurement noise function for two different instances of the noise.
```{r code_obs_3,tidy=FALSE,echo=TRUE,eval=FALSE}
sigma.noise.meas <- lim.noise.meas/1.959964
help.data<-help.smooth.save
noise.meas<-func.noise.meas(sigma.noise.meas,model.noise.meas,help.data)
short.noise.meas<-func.noise.meas(sigma.noise.meas,model.noise.meas,
help.data)
```
A number of fields are saved.
```{r code_obs_4,tidy=FALSE,echo=TRUE,eval=FALSE}
meas.noise.smooth.save[,j,i]<-help.smooth.save+noise.meas
meas.noise.ar.smooth.save[,j,i]<- help.diff.values.smooth + ar.smooth +
noise.meas
meas.noise.short.smooth.save[,j,i]<-short.help.smooth.save+short.noise.meas
meas.noise.short.ar.smooth.save[,j,i]<-help.b.diff.values.smooth +
ar.b.smooth + short.noise.meas
```
This is repeated for the subsampled pseudoproxy.
```{r code_obs_5,tidy=FALSE,echo=TRUE,eval=FALSE}
sigma.seas.pick.noise.meas<-lim.seas.pick.noise.meas/1.959964
seas.pick.noise.meas<-func.noise.meas(sigma.seas.pick.noise.meas,
model.seas.pick.noise.meas,help.data)
help.meas.noise.subsampled<-subsampled.data+seas.pick.noise.meas
meas.noise.subsampled.save[,j,i]<-help.meas.noise.subsampled
```
Next, we identify the dates at which we want to sample the full records. This is done for various pseudoproxy types separately to be able to consider the smoothing lengths information in the sampling.
```{r code_obs_6,tidy=FALSE,echo=TRUE,eval=FALSE}
help.date.samples<-func.get.date.sampling(switch.sampling,
n.samples,j.input.data,
rand.length.smooth)
date.samples<-help.date.samples$date.samples
date.samples.r<-help.date.samples$date.samples.r
help.date.samples.short<-func.get.date.sampling(switch.sampling,
n.samples,j.input.data,
rand.2.length.smooth)
date.samples.short<-help.date.samples.short$date.samples
date.samples.r.short<-help.date.samples.short$date.samples.r
help.date.samples.ar<-func.get.date.sampling(switch.sampling,
n.samples,j.input.data,
rep(fix.length.smooth,
length(j.input.data)))
date.samples.ar<-help.date.samples.ar$date.samples
date.samples.r.ar<-help.date.samples.ar$date.samples.r
help.date.samples.ar.short<-func.get.date.sampling(switch.sampling,
n.samples,j.input.data,
rep(fix.length.smooth.2,
length(j.input.data)))
date.samples.ar.short<-help.date.samples.ar.short$date.samples
date.samples.r.ar.short<-help.date.samples.ar.short$date.samples.r
```
These informations are saved.
```{r code_obs_7,tidy=FALSE,echo=TRUE,eval=FALSE}
samp.dates.save[,j,i]<-date.samples
samp.dates.save.short[,j,i]<-date.samples.short
samp.dates.save.ar[,j,i]<-date.samples.ar
samp.dates.save.ar.short[,j,i]<-date.samples.ar.short
```
The previously produced pseudoproxies are sampled at the sample dates and saved as well.
```{r code_obs_8,tidy=FALSE,echo=TRUE,eval=FALSE}
samp.subsampled.save[,j,i]<-subsampled.data[date.samples]
samp.meas.noise.subsampled.save[,j,i]<-
subsampled.data[date.samples]+seas.pick.noise.meas[date.samples]
samp.meas.noise.smooth.save[,j,i]<-
help.smooth.save[date.samples]+noise.meas[date.samples]
samp.input.save[,j,i]<-j.input.data[date.samples]
samp.noise.save[,j,i]<-help.noise.save[date.samples]
samp.bias.noise.data.save[,j,i]<-
help.bias.noise.data.save[date.samples]
samp.ar.smooth.save[,j,i]<-
help.diff.values.smooth[date.samples.ar]+ar.smooth[date.samples.ar]
samp.smooth.save[,j,i]<-help.smooth.save[date.samples]
samp.short.smooth.save[,j,i]<-short.help.smooth.save[date.samples.short]
samp.short.ar.smooth.save[,j,i]<-
help.b.diff.values.smooth[date.samples.ar.short]+
ar.b.smooth[date.samples.ar.short]
samp.meas.noise.short.smooth.save[,j,i]<-
short.help.smooth.save[date.samples.short]+
short.noise.meas[date.samples.short]
samp.meas.noise.ar.smooth.save[,j,i]<-
ar.smooth.save[date.samples.ar]+noise.meas[date.samples.ar]
samp.meas.noise.short.ar.smooth.save[,j,i]<-
short.ar.smooth.save[date.samples.ar.short]+
short.noise.meas[date.samples.ar.short]
```
Then the random sigma uncertainties for the sample dates are produced.
```{r code_obs_9,tidy=FALSE,echo=TRUE,eval=FALSE}
sigma.unc.date<-func.get.dating.uncertainty(n.samples,mean.date.unc,
sd.date.unc,switch.sampling.unc,
rand.length.smooth,
date.samples,
date.samples.r)$sigma.unc.date
sigma.unc.date.short<-
func.get.dating.uncertainty(n.samples,mean.date.unc,sd.date.unc,
switch.sampling.unc,rand.2.length.smooth,
date.samples.short,
date.samples.r.short)$sigma.unc.date
sigma.unc.date.ar<-
func.get.dating.uncertainty(n.samples,mean.date.unc,sd.date.unc,
switch.sampling.unc,
rep(fix.length.smooth,length(j.input.data)),
date.samples.ar,
date.samples.r.ar)$sigma.unc.date
sigma.unc.date.ar.short<-
func.get.dating.uncertainty(n.samples,mean.date.unc,sd.date.unc,
switch.sampling.unc,
rep(fix.length.smooth.2,length(j.input.data)),
date.samples.ar.short,
date.samples.r.ar.short)$sigma.unc.date
```
This information is also saved.
```{r code_obs_10,tidy=FALSE,echo=TRUE,eval=FALSE}
unc.date.samp[,j,i]<-sigma.unc.date
unc.date.samp.short[,j,i]<-sigma.unc.date.short
unc.date.samp.ar[,j,i]<-sigma.unc.date.ar
unc.date.samp.ar.short[,j,i]<-sigma.unc.date.ar.short
```
Next, the code calculates the initial and the final effective dating uncertainty errors for the input data plus noise and bias only.
```{r code_obs_11,tidy=FALSE,echo=TRUE,eval=FALSE}
date.unc.err<-func.date.unc.err(help.bias.noise.data.save,n.samples,
date.samples,sigma.unc.date)
date.unc.err.1<-date.unc.err
help.data<-help.bias.noise.data.save
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date,
n.samples,date.samples,switch.weak.cor.date.unc,
help.bias.noise.data.save)
date.unc.err.1<-date.unc.err
```
This is saved.
```{r code_obs_12,tidy=FALSE,echo=TRUE,eval=FALSE}
unc.date.bias.noise.data.save[,j,i]<-
help.bias.noise.data.save[date.samples]+date.unc.err
```
This is repeated for other records including. Records are saved without and with added measurement noise.
```{r code_obs_13,tidy=FALSE,echo=TRUE,eval=FALSE}
date.unc.err<-func.date.unc.err(subsampled.data,n.samples,date.samples,
sigma.unc.date)
help.data<-subsampled.data
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date,
n.samples,date.samples,switch.weak.cor.date.unc,
help.data)
unc.date.subsampled.save[,j,i]<-subsampled.data[date.samples]+date.unc.err
unc.date.meas.noise.subsampled.save[,j,i]<-subsampled.data[date.samples]+
date.unc.err+seas.pick.noise.meas[date.samples]
```
```{r code_obs_14,tidy=FALSE,echo=TRUE,eval=FALSE}
date.unc.err<-func.date.unc.err(help.smooth.save,n.samples,date.samples,
sigma.unc.date)
help.data<-help.smooth.save
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date,
n.samples,date.samples,
switch.weak.cor.date.unc,help.data)
unc.date.smooth.save[,j,i]<-help.smooth.save[date.samples]+date.unc.err
unc.date.meas.noise.smooth.save[,j,i]<-help.smooth.save[date.samples]+
noise.meas[date.samples]+date.unc.err
```
```{r code_obs_15,tidy=FALSE,echo=TRUE,eval=FALSE}
help.data<-help.diff.values.smooth+ar.smooth
date.unc.err<-func.date.unc.err(help.data,n.samples,date.samples.ar,
sigma.unc.date.ar)
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date.ar,
n.samples,date.samples.ar,
switch.weak.cor.date.unc,help.data)
unc.date.ar.smooth.save[,j,i]<-help.diff.values.smooth[date.samples.ar]+
ar.smooth[date.samples.ar]+date.unc.err
unc.date.meas.noise.ar.smooth.save[,j,i]<-
help.diff.values.smooth[date.samples.ar]+
ar.smooth[date.samples.ar]+noise.meas[date.samples.ar]+date.unc.err
```
```{r code_obs_16,tidy=FALSE,echo=TRUE,eval=FALSE}
date.unc.err<-func.date.unc.err(short.help.smooth.save,n.samples,
date.samples.short,sigma.unc.date.short)
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,sigma.unc.date.short,
n.samples,date.samples.short,
switch.weak.cor.date.unc,help.data)
help.data<-short.help.smooth.save
unc.date.meas.noise.short.smooth.save[,j,i]<-
short.help.smooth.save[date.samples.short]+
short.noise.meas[date.samples.short]+date.unc.err
```
```{r code_obs_17,tidy=FALSE,echo=TRUE,eval=FALSE}
help.data<-help.b.diff.values.smooth+ar.b.smooth
date.unc.err<-func.date.unc.err(help.data,n.samples,date.samples.ar.short,
sigma.unc.date.ar.short)
date.unc.err<-func.unc.err(date.unc.err,cor.date.unc,switch.cor.date.unc,
switch.delta.cor.date.unc,
sigma.unc.date.ar.short,n.samples,
date.samples.ar.short,switch.weak.cor.date.unc,
help.data)
unc.date.short.ar.smooth.save[,j,i]<-help.data[date.samples.ar.short]+
date.unc.err
unc.date.meas.noise.short.ar.smooth.save[,j,i]<-
help.b.diff.values.smooth[date.samples.ar.short]+
ar.b.smooth[date.samples.ar.short]+
short.noise.meas[date.samples.ar.short]+
date.unc.err
```
For the generalized approach, these steps are applied as well.
```{r code_obs_18,tidy=FALSE,echo=TRUE,eval=FALSE}
sigma.unc.date.gen<-
func.get.dating.uncertainty(n.samples,mean.date.unc,sd.date.unc,
switch.sampling.unc,
rep(length.filter.uniform,
length(j.input.data)),
date.samples.gen,
date.samples.r.gen)$sigma.unc.date
help.date.samp.gen<-
func.get.date.sampling(switch.sampling,n.samples,j.input.data,
rep(length.filter.uniform,length(j.input.data)))
date.samples.gen<-help.date.samp.gen$date.samples
date.samples.r.gen<-help.date.samp.gen$date.samples.r
samp.dates.save.gen[,j,i] <- date.samples.gen
unc.date.samp.gen[,j,i]<-sigma.unc.date.gen
samp.ar.smooth.bias.noise.gen.dat[,j,i]<-
help.smooth.bias.noise.gen.dat[
(help.len.filter.uni+1):(length(help.bias.noise.gen.dat)+
help.len.filter.uni)][date.samples.gen]+
ar.2.smooth[date.samples.gen]
help.data<-
help.smooth.bias.noise.gen.dat[
(help.len.filter.uni+1):(length(help.bias.noise.gen.dat)+
help.len.filter.uni)]+
ar.2.smooth
gen.date.unc.err<-func.date.unc.err(help.data,n.samples,date.samples.gen,
sigma.unc.date.gen)
gen.date.unc.err.help<-gen.date.unc.err
gen.date.unc.err<-func.unc.err(gen.date.unc.err.help,cor.date.unc,
switch.cor.date.unc,switch.delta.cor.date.unc,
sigma.unc.date.gen,n.samples,date.samples.gen,
switch.weak.cor.date.unc,help.data)
unc.samp.ar.smooth.bias.noise.gen.dat[,j,i] <-
samp.ar.smooth.bias.noise.gen.dat[,j,i]+gen.date.unc.err
gen.sigma.noise.meas <- lim.noise.meas/1.959964
gen.noise.meas<-func.noise.meas(gen.sigma.noise.meas,model.noise.meas,
j.input.data)
meas.unc.samp.ar.smooth.bias.noise.gen.dat[,j,i] <-
samp.ar.smooth.bias.noise.gen.dat[,j,i]+gen.date.unc.err+
gen.noise.meas[date.samples.gen]
```
### Output
#### Declaration
Most intermediate steps are included in the output files. The following list may be incomplete, since a number of fields are provided for all different approaches. Please see tables in the manuscript for the full list.
```{r para_n,tidy=FALSE,echo=TRUE,eval=FALSE}
## Full fields
noise.save<-#just the first noise term
bias.noise.data.save <-#smoothed seasonal/orbital component
smooth.save <-#smoothed
ar.smooth.save <-#smooth plus ar
short.ar.smooth.save <- # shorter smoothing plus ar
short.smooth.save <- # shorter smoothing
subsampled.save<-#seasonally subsampled, e.g. like Mg/Ca
meas.noise.subsampled.save<-#seasonally subsampled plus measurement noise
meas.noise.smooth.save <-#final measurement noise added
meas.noise.short.smooth.save<- #shorter smoothing plus final measurement noise
meas.noise.ar.smooth.save <- #the ar plus final noise#not saved
meas.noise.short.ar.smooth.save <- #the other ar plus final noise#not saved
array(NA,dim=c(c(dim.noise.field)))
## Sampled proxy fields
samp.input.save<- #sampled input data
samp.input.save.short<- #sampled input data
samp.input.save.ar<- #sampled input data
samp.input.save.ar.short<- #sampled input data
samp.dates.save<- #sample dates
samp.dates.save.short<-
samp.dates.save.ar<-
samp.dates.save.ar.short<-
samp.noise.save<-#sampled noise.save
samp.noise.save.short<-#sampled noise.save
samp.noise.save.ar<-#sampled noise.save
samp.noise.save.ar.short<-#sampled noise.save
samp.bias.noise.data.save <-#sampled bias.noise.data.save
samp.bias.noise.data.save.short <-#sampled bias.noise.data.save
samp.bias.noise.data.save.ar <-#sampled bias.noise.data.save
samp.bias.noise.data.save.ar.short <-#sampled bias.noise.data.save
samp.smooth.save <- #sampled smooth.save
samp.ar.smooth.save <-#sampled ar.smooth.save
samp.short.smooth.save <-#sampled short.smooth.save
samp.short.ar.smooth.save <-#sampled short.ar.smooth.save
samp.subsampled.save<- #sampled subsampled.save
samp.meas.noise.subsampled.save<-#sampled seasonally subsampled + meas noise
samp.meas.noise.smooth.save <- #sampled meas.noise.smooth.save
samp.meas.noise.short.smooth.save<-#sampled meas.noise.short.smooth.save
samp.meas.noise.ar.smooth.save<-#sampled meas.noise.ar.smooth.save#not saved
samp.meas.noise.short.ar.smooth.save <- ##not saved
samp.meas.noise.short.ar.smooth.save <-
dated.date.samples <-
array(NA,dim=c(n.samples,c(dim.noise.field[c(2,3)])))#nsamar
## Sampled proxy fields plus dating uncertainty estimate
unc.date.samp<-#dating uncertainty
unc.date.samp.short<-
unc.date.samp.ar<-
unc.date.samp.ar.short<-
#unc.date.input.save<-
# also for other fields samplings?
unc.date.noise.save<-# samp.noise.save + dating uncertainty
unc.date.bias.noise.data.save <-#samp.bias.noise.data.save + dating uncertainty
unc.date.smooth.save <-#samp.smooth.save + dating uncertainty
unc.date.ar.smooth.save <-#samp.ar.smooth.save + dating uncertainty
unc.date.short.smooth.save <- #samp.short.smooth.save + dating uncertainty
unc.date.short.ar.smooth.save <- #samp.short.ar.smooth.save + dating uncertainty
unc.date.subsampled.save<-#samp.subsampled.save + dating uncertainty
unc.date.meas.noise.subsampled.save<-
#sampled seasonally subsampled+meas noise+dating uncertainty
unc.date.meas.noise.smooth.save <- #samp.meas.noise.smooth.save+dat uncertainty
unc.date.meas.noise.short.smooth.save<-#samp.meas.noise.short.smooth.save+datunc
unc.date.meas.noise.ar.smooth.save<-##not saved
unc.date.meas.noise.short.ar.smooth.save <- ##not saved
array(NA,dim=c(n.samples,c(dim.noise.field[c(2,3)])))
## essence fields
gen.noise.env<-#environmental noise
smooth.gen.dat <- #filtered temperature
noise.gen.dat <- #temperature plus noise
bias.noise.gen.dat <- #Temperature plus environmental noise plus orbital bias
smooth.bias.noise.gen.dat <- # filtered temperature plus noise plus bias
ar.gen.dat <- #archiving noise
ar.smooth.bias.noise.gen.dat <- #archived filtered temperature+noise+bias
array(NA,dim=c(c(dim.noise.field)))
samp.dates.save.gen <- #the sample dates
unc.date.samp.gen<-#dating uncertainty
samp.ar.smooth.bias.noise.gen.dat <- #sampled ar.smooth.bias.noise.gen.dat
unc.samp.ar.smooth.bias.noise.gen.dat <-
#sampled ar.smooth.bias.noise.gen.dat plus dating uncertainty
meas.unc.samp.ar.smooth.bias.noise.gen.dat <-
#sampled plus dating uncertainty plus measurement noise
array(NA,dim=c(n.samples,c(dim.noise.field[c(2,3)])))
```
#### General output format
Spatial dimensions for the netcdf output files stem from the netcdf input files where appropriate. This holds also for the time-dimension of output files for full records.
```{r output_1,tidy=FALSE,echo=TRUE,eval=FALSE}
output.dimension.longitude <-
ncdim_def("lon","degrees_east",nc.file.in$dim$lon$vals)
output.dimension.latitude <-
ncdim_def("lat","degrees_north",nc.file.in$dim$lat$vals)
output.dimension.time <-
ncdim_def("time",nc.file.in$dim$tim$units,nc.file.in$dim$tim$vals)
```
For the sampled records, the temporal dimension simply takes the first `n.samples` number of dates from the input file. The dates of the samples are provided as additional fields of an output file.
```{r output_2,tidy=FALSE,echo=TRUE,eval=FALSE}
output.dimension.sampled.time<-
ncdim_def("time",nc.file.in$dim$tim$units,nc.file.in$dim$tim$vals[1:n.samples])
```
Output variables are generally defined in one of the following two versions for full records and sampled records respectively.
```{r output_3,tidy=FALSE,echo=TRUE,eval=FALSE}
output.variable.x<-
ncvar_def("...","...",list(output.dimension.longitude,output.dimension.latitude,
output.dimension.time),longname="...")
```
```{r output_4,tidy=FALSE,echo=TRUE,eval=FALSE}
output.variable.y<-
ncvar_def("...","...",list(output.dimension.longitude,output.dimension.latitude,
output.dimension.sampled.time),longname="...")
```
## References
Berger, A.: Long-Term Variations of Daily Insolation and Quaternary Climatic Changes, J. Atmos. Sci., 35, 2362–2367,
doi: 10.1175/1520-0469(1978)035\%3C2362:LTVODI\%3E2.0.CO;2, 1978.
Crucifix, M.: palinsol: Insolation for Palaeoclimate Studies, URL https://CRAN.R-project.org/package=palinsol, r package
version 0.93, 2016.
[1]: https://files.osf.io/v1/resources/bkw6p/providers/osfstorage/5c9a1be5246078001afb735e?mode=render