AssetsModelling          package:fPortfolio          R Documentation

_M_o_d_e_l_l_i_n_g _o_f _M_u_l_t_i_v_a_r_i_a_t_e _A_s_s_e_t _S_e_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     A collection of functions which generate artificial multivariate
     data sets of assets, fit the parameters of a multivariate normal,
     skew-normal, or (skew) Student-t distribution to such data, and
     compute some benchmark statistics. In addition, a function is
     provided for the selection and clustering of individual assets
     from portfolios using hierarchical and k-means clustering
     approaches.

     The functions are:

       'assetsSim'      Simulates a data set of assets,
       'assetsSelect'   Asset Selection from Portfolios,
       'assetsFit'      Fits the parameters of a data set of assets,
       'assetsStats'    Computes benchmark statistics of asset sets,
       'assetsMeanCov'  Computes mean and covariance matrix,
       'assetsTest'     Tests for multivariate Normal distribution,
       'print'          S3 print method for an object of class 'fASSETS',
       'plot'           S3 plot method for an object of class 'fASSETS',
       'summary'        S3 summary method for an object of class 'fASSETS'.

_U_s_a_g_e:

     assetsSim(n, dim = 2, model = list(mu = rep(0, dim), Omega = diag(dim), 
         alpha = rep(0, dim), df = Inf), assetNames = NULL) 
     assetsSelect(x, method = c("hclust", "kmeans"), 
         kmeans.centers = 5, kmeans.maxiter = 10, doplot = TRUE, ...)
         
     assetsFit(x, method = c("st", "snorm", "norm"), title = NULL, 
         description = NULL, fixed.df = NA, ...)

     assetsMeanCov(x, method = c("cov", "mve", "mcd", "nnve", "shrink", "bagged"), 
         check = TRUE, force = TRUE, baggedR = 100, ...)
     assetsStats(x)
     assetsTest(x, method = c("shapiro", "energy"), Replicates = 100, 
         title = NULL, description = NULL)

     ## S3 method for class 'fASSETS':
     print(x, ...)
     ## S3 method for class 'fASSETS':
     plot(x, which = "ask", ...)
     ## S3 method for class 'fASSETS':
     summary(object, which = "all", ...)

_A_r_g_u_m_e_n_t_s:

assetNames: [assetsSim] - 
           a vector of character strings of length 'dim' allowing for
          modifying the names of the individual assets. 

 baggedR: [assetsMeanCov] - 
           an integer value, the number of bootstrap replicates, by
          default 100. This value is only used if 'method="bagged"'. 

   check: [assetsMeanCov] - 
           a logical flag. Should the covariance matrix be tested to be
          positive definite? By default 'TRUE'. 

description: [assetsFit] - 
           a character string, assigning a brief description to an 
          '"fASSETS"' object. 

  doplot: [assetsSelect] - 
           a logical, should a plot be displayed? 

fixed.df: [assetsFit] - 
           either 'NA', the default, or a numeric value assigning the
          number of degrees of freedom to the model. In the case that 
          'fixed.df=NA' the value of 'df' will be included in the
          optimization process, otherwise not. 

   force: [assetsMeanCov] - 
           a logical flag. Should the covariance matrix be forced to be
          positive definite? By default 'TRUE'. 

kmeans.centers: [assetsSelect] - 
           either the number of clusters or a set of initial cluster
          centers.  If the first, a random set of rows in 'x' are
          chosen as the  initial centers.    

kmeans.maxiter: [assetsSelect] - 
           the maximum number of iterations allowed. 

  method: [assetsFit] - 
           a character string, which type of distribution should be
          fitted? 'method="st"' denotes a multivariate skew-Student-t
          distribution, 'method="snorm"' a multivariate skew-Normal
          distribution, and 'method="norm"' a multivariate Normal
          distribution. The default is the first entry, '"st"', so a
          multivariate skew-Student-t distribution will be fitted to
          the empirical market data.
           [assetsMeanCov] - 
           a character string, which determines how to compute the
          covariance matrix. If 'method="cov"' is selected then the
          standard  covariance will be computed by R's base function
          'cov', if  'method="shrink"' is selected then the covariance
          will be computed using the shrinkage approach as suggested in
          Schaefer and Strimmer [2005], if 'method="bagged"' is
          selected then the  covariance will be calculated from the
          bootstrap aggregated (bagged) version of the covariance
          estimator.
           [assetsSelect] - 
           a character string, which clustering method should be
          applied?  Either 'hclust' for hierarchical clustering of
          dissimilarities, or 'kmeans' for k-means clustering.
           [assetsTest] - 
           a character string, which selects which test should be
          applied. If 'method="shapiro"' then Shapiro's multivariate
          Normality  test will be applied as implemented in R's
          contributed package 'mvnormtest'. If 'method="energy"' then
          the E-statistic  (energy) for testing multivariate Normality
          will be used as proposed  and implemented by Szekely and
          Rizzo [2005] using parametric  bootstrap. 

   model: [assetsSim] - 
           a list of model parameters: 
           'mu' a vector of mean values, one for each asset series, 
           'Omega' the covariance matrix of assets, 
           'alpha' the skewness vector, and 
           'df' the number of degrees of freedom which is a measure for
          the fatness of the tails (excess kurtosis). 
           For a symmetric distribution 'alpha' is a vector of zeros.
          For the normal distributions 'df' is not used and set to 
          infinity, 'Inf'. Note that all assets have the same value 
          for 'df'. 

  n, dim: [assetsSim] - 
           integer values giving the number of data records to be
          simulated,  and the dimension of the assets set. 

  object: [summary] - 
           An object of class 'fASSETS'. 

Replicates: [assetsTest] - 
           an integer value, the number of bootstrap replicates, by
          default 100. This value is only used if 'method="energy"'. 

   title: [assetsFit] - 
           a character string, assigning a title to an  '"fASSETS"'
          object. 

   which: which of the five plots should be displayed? 'which' can  be
          either a character string, '"all"' (displays all plots)  or
          '"ask"' (interactively asks which one to display), or a 
          vector of 5 logical values; for those elements which are set 
          'TRUE' the corresponding plot will be displayed. 

       x: [assetsFit][assetsStats][assetsMeanCov] - 
           a numeric matrix of returns or any other rectangular object
          like a data.frame or a multivariate time series object which
          can be  transformed by the function 'as.matrix' to an object
          of  class 'matrix'. 
           [plot][print] - 
           An object of class 'fASSETS'. 

     ...: optional arguments to be passed. 

_D_e_t_a_i_l_s:

     Data sets of assets 'x' can be expressed as multivariate 
     'timeSeries' objects, as 'data.frame' objects, or any other
     rectangular  object which can be transformed into an object of
     class 'matrix'. 

     *Parameter Estimation:* 

      The function 'assetsFit' for the parameter estimation and 
     'assetsSim' for the simulation of assets sets use code based on 
     functions from the contributed packages '"mvtnorm"' and '"sn"'. 
     The required functionality for fitting data to a multivariate
     Normal, skew-Normal, or skew-Student-t distribution is available
     from builtin functions, so it is not necessary to load the
     packages '"mvtnorm"' and '"sn"'.
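
     As an illustration of the 'model' list used by 'assetsSim', the
     following base-R sketch simulates only the symmetric Normal
     special case ('alpha' all zero, 'df = Inf') through a Cholesky
     factor of 'Omega'. It is not the package's own code; the numeric
     values and the variable names 'd', 'z' and 'R' are assumptions
     made for the example.

```r
# Sketch only: symmetric Normal special case (alpha = 0, df = Inf).
# Parameter values below are made up for illustration.
set.seed(4711)
n <- 1000                      # number of simulated records
d <- 3                         # number of assets (the 'dim' argument)

model <- list(
  mu    = c(0.01, 0.02, 0.00),
  Omega = matrix(c(1.0, 0.5, 0.2,
                   0.5, 1.0, 0.3,
                   0.2, 0.3, 1.0), nrow = d),
  alpha = rep(0, d),           # no skewness
  df    = Inf                  # Normal tails
)

R <- chol(model$Omega)         # upper triangular, t(R) %*% R equals Omega
z <- matrix(rnorm(n * d), nrow = n)
x <- sweep(z %*% R, 2, model$mu, "+")
colnames(x) <- paste0("A", seq_len(d))

# Sample moments should be close to the model parameters.
round(colMeans(x), 2)
round(cov(x), 2)
```

     For skewed or fat-tailed sets ('alpha' non-zero or finite 'df')
     this shortcut does not apply and 'assetsSim' should be used.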

     *Assets Mean and Covariance:* 

         The function 'assetsMeanCov' computes the mean vector and
     covariance matrix of an assets set. For the covariance matrix
     three choices are described here: the standard covariance
     computed by R's base function 'cov', and a shrunk and a bagged
     version of the covariance. The latter two choices implement the
     covariance computation from the functions 'cov.shrink()' and
     'cov.bagged()' which are part of the contributed R package
     'corpcor'. 
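
     The idea behind the standard and the bagged estimate can be
     sketched in base R as follows. This is a hand-written
     illustration under simplifying assumptions, not the code of
     'assetsMeanCov' or of 'cov.bagged()'; the simulated returns and
     the names 'covs', 'SigmaBagged' and 'isPD' are hypothetical.

```r
# Sketch only: mean vector, standard covariance, a positive
# definiteness check, and a simple bootstrap-aggregated covariance.
set.seed(1)
x <- matrix(rnorm(120 * 4, mean = 0.01, sd = 0.05), ncol = 4,
            dimnames = list(NULL, paste0("A", 1:4)))

mu    <- colMeans(x)           # vector of asset means
Sigma <- cov(x)                # standard covariance, as for method = "cov"

# A symmetric matrix is positive definite if all eigenvalues are > 0.
isPD <- all(eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values > 0)

# Bagging: average the covariance over bootstrap resamples of the rows.
baggedR <- 100
covs <- replicate(
  baggedR,
  cov(x[sample(nrow(x), replace = TRUE), , drop = FALSE]),
  simplify = FALSE
)
SigmaBagged <- Reduce(`+`, covs) / baggedR

# Same shape as the documented return value: a list with mu and Sigma.
result <- list(mu = mu, Sigma = SigmaBagged)
str(result)
```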

     *Assets Statistics:* 

         The function 'assetsStats' implements benchmark formulas and
     statistics as reported in the help page of the hedge fund software
      from _www.AlternativeSoft.com_. The computed statistics are
     listed in the 'Value' section below. Note that the functions were
     written for monthly recorded data sets. If you use or generate
     asset sets on a different time scale, you have to rescale them
     properly. 
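
     What rescaling means can be illustrated with the usual
     square-root-of-time convention. The helper 'annualize' below is
     hypothetical and the formulas are a common convention assumed
     for the example; they are not confirmed to be the exact formulas
     used inside 'assetsStats'.

```r
# Sketch only: annualizing mean and volatility of periodic returns.
# 'annualize' is a hypothetical helper, not part of the package.
annualize <- function(r, periodsPerYear = 12) {
  c(paMean = periodsPerYear * mean(r),        # mean scales linearly
    paVola = sqrt(periodsPerYear) * sd(r))    # volatility scales with sqrt
}

set.seed(42)
r <- rnorm(60, mean = 0.008, sd = 0.04)       # 60 made-up monthly returns

annualize(r)                                  # monthly data: factor 12
annualize(r, periodsPerYear = 52)             # if r were weekly returns
```

     Feeding weekly or daily returns into formulas that assume a
     factor of 12 would misstate the annualized figures, which is why
     the time scale matters.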

     *Assets Selection:* 

         The function 'assetsSelect' calls the functions 'hclust' or
     'kmeans' from R's '"stats"' package. 'hclust' performs a
     hierarchical cluster analysis on the set of dissimilarities 
     'hclust(dist(t(x)))' and 'kmeans' performs a k-means clustering on
     the data matrix itself.  
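
     The two underlying calls can be tried directly with the '"stats"'
     package. The sketch below uses made-up returns in which assets
     'A' and 'B' share a common factor. Transposing the matrix before
     calling 'kmeans', so that assets rather than records are
     clustered, is an assumption of this example and may differ from
     what 'assetsSelect' does internally.

```r
# Sketch only: clustering assets (columns) of a simulated return matrix.
set.seed(7)
n <- 120
common <- rnorm(n)                       # shared factor for A, B and E
x <- cbind(A = common + rnorm(n, sd = 0.3),
           B = common + rnorm(n, sd = 0.3),
           C = rnorm(n),
           D = rnorm(n),
           E = -common + rnorm(n, sd = 0.3))

# Hierarchical clustering on the dissimilarities between assets.
hc <- hclust(dist(t(x)))
hc$order                                 # ordering used in the dendrogram
groups <- cutree(hc, k = 4)              # cut the tree into 4 groups
groups

# k-means on the assets; nstart > 1 makes the result less dependent
# on the random choice of initial centers.
km <- kmeans(t(x), centers = 2, iter.max = 10, nstart = 10)
km$cluster
```

     With these data the closely related assets 'A' and 'B' fall into
     the same group under both methods.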

     *Assets Tests:* 

         The function 'assetsTest' offers two tests for multivariate
     Normality of an assets set.

_V_a_l_u_e:

     'assetsFit'  
      returns an S4 object of class '"fASSETS"', with the
     following  slots:

   @call: the matched function call. 

   @data: the input data in form of a data.frame. 

@description: allows for a brief project description. 

    @fit: the results as a list returned from the underlying fitting
          function.  

 @method: the selected method to fit the distribution, one  of
          '"norm"', '"snorm"', '"st"'. 

  @model: the model parameters describing the fitted parameters in 
          form of a list, 'model=list(mu, Omega, alpha, df)'. 

  @title: a title string. 

 @fit$dp: a list containing the direct parameters beta, Omega, alpha. 
          Here, beta is a matrix of regression coefficients with 
          'dim(beta)=c(nrow(X), ncol(y))', 'Omega' is a  covariance
          matrix of order 'dim', 'alpha' is  a vector of shape
          parameters of length 'dim'.   

 @fit$se: a list containing the components beta, alpha, info. Here, 
          beta and alpha are the standard errors for the corresponding 
          point estimates; info is the observed information matrix  for
          the working parameter, as explained below. 

@fit$optim: the list returned by the optimizer 'optim'; see the 
          documentation of this function for explanation of its 
          components.   


     Note that the '@model' slot can be used as input to the 
     function 'assetsSim' for simulating a similar portfolio of  assets
     compared with the original portfolio data, usually market assets. 

     'assetsMeanCov' 
      returns a list with two entries named 'mu' and 'Sigma'. The
     first denotes the vector of assets means, and the second the 
     covariance matrix. Note that the output of this function can be
     used as data input for the portfolio functions to compute the
     efficient frontier. 

     'assetsSelect' 
      if 'method="hclust"' was selected then the function returns an S3
     object of class "hclust", otherwise if 'method="kmeans"' was 
     selected then the function returns an object of class list. For
     details we refer to the help pages of 'hclust' and 'kmeans'. 

     'assetsSim'  
      returns a matrix; the artificial data records represent the
     assets of the portfolio. Row names and column names are not
     created, they have to be added afterwards. 

     'assetsStats' 
      returns a data frame with the following entries per column and
     asset:  
      'Records' - number of records (length of time series), 
      'paMean' - annualized (pa, per annum) Mean of Returns, 
      'paAve' - annualized Average of Returns, 
      'paVola' - annualized Volatility (standard Deviation), 
      'paSkew' - Skewness of Returns, 
      'paKurt' - Kurtosis of Returns, 
      'maxDD' - maximum Drawdown, 
      'TUW' - Time under Water, 
      'mMaxLoss' - Monthly maximum Loss, 
      'mVaR' - Monthly 99% Value-at-Risk, 
      'mModVaR' - Monthly 99% Modified Value-at-Risk, 
      'mSharpe' - Monthly Sharpe Ratio, 
      'mModSharpe' - Monthly Modified Sharpe Ratio, and 
      'skPrice' - Skewness/Kurtosis Price. 

     'assetsTest' 
      returns an object of class 'fHTEST'.

_A_u_t_h_o_r(_s):

     Adelchi Azzalini for R's 'sn' package, 
      Torsten Hothorn for R's 'mvtnorm' package, 
      Juliane Schaefer and Korbinian Strimmer for R's 'corpcor'
     package, 
      Alan Genz and Frank Bretz for the underlying Fortran Code, 
      Maria Rizzo and Gabor Szekely for R's 'energy' package, 
      Diethelm Wuertz for the Rmetrics port.

_R_e_f_e_r_e_n_c_e_s:

     Azzalini A. (1985); _A Class of Distributions Which Includes the
     Normal Ones_, Scandinavian Journal of Statistics 12, 171-178. 

     Azzalini A. (1986); _Further Results on a Class of Distributions
     Which Includes  the Normal Ones_, Statistica 46, 199-208. 

     Azzalini A., Dalla Valle A. (1996); _The Multivariate Skew-normal
     Distribution_, Biometrika 83, 715-726. 

     Azzalini A., Capitanio A. (1999); _Statistical Applications of the
     Multivariate Skew-normal  Distribution_, Journal Roy. Statist.
     Soc. B61, 579-602. 

     Azzalini A., Capitanio A. (2003); _Distributions Generated by
     Perturbation of Symmetry with  Emphasis on a Multivariate Skew-t
     Distribution_, Journal Roy. Statist. Soc. B65, 367-389. 

     Breiman L. (1996);  _Bagging Predictors_, Machine Learning 24,
     123-140.

     Genz A., Bretz F. (1999); _Numerical Computation of Multivariate
     t-Probabilities with Application to Power Calculation of Multiple
     Contrasts_,  Journal of Statistical Computation and Simulation 63,
     361-378.

     Genz A. (1992); _Numerical Computation of Multivariate Normal
     Probabilities_, Journal of Computational and Graphical Statistics
     1, 141-149.

     Genz A. (1993);  _Comparison of Methods for the Computation of
     Multivariate Normal Probabilities_, Computing Science and
     Statistics 25, 400-405.

     Hothorn T., Bretz F., Genz A. (2001); _On Multivariate t and Gauss
     Probabilities in R_, R News 1/2, 27-29.

     Ledoit O., Wolf M. (2003); _Improved Estimation of the Covariance
     Matrix of Stock Returns  with an Application to Portfolio
     Selection_, Journal of Empirical Finance 10, 603-621. 

     Rizzo M.L. (2002); _A New Rotation Invariant Goodness-of-Fit
     Test_,  PhD dissertation, Bowling Green State University.

     Schaefer J., Strimmer K. (2005);   _A Shrinkage Approach to
     Large-Scale Covariance Estimation and Implications for Functional
     Genomics_, Statist. Appl. Genet. Mol. Biol. 4, 32.

     Szekely G.J., Rizzo M.L. (2005);  _A New Test for Multivariate
     Normality_, Journal of Multivariate Analysis 93, 58-80.

     Szekely G.J. (1989);  _Potential and Kinetic Energy in
     Statistics_, Lecture Notes, Budapest Institute of Technology,
     Technical University.

_S_e_e _A_l_s_o:

     'MultivariateDistribution', 
      'hclust' and 'kmeans'.

_E_x_a_m_p_l_e_s:

     ## Not run: 
     ## SOURCE("fPortfolio.101A-AssetsModelling")

     ## berndtInvest -
        xmpPortfolio("\nStart: Load monthly data set of returns > ")
        data(berndtInvest)
        # Exclude Date, Market and Interest Rate columns from data frame
        # (the returns are kept as fractions, not scaled to percent):
        berndtAssets = berndtInvest[, -c(1, 11, 18)]
        rownames(berndtAssets) = berndtInvest[, 1]
        head(berndtAssets)
         
     ## assetsSelect -
        xmpPortfolio("\nNext: Select 4 most dissimilar assets from hclust > ")
        clustered = assetsSelect(berndtAssets, doplot = FALSE)
        myAssets = berndtAssets[, c(clustered$order[1:4])]
        colnames(myAssets)
        # Scatter and time series plot:
        par(mfrow = c(2, 1), cex = 0.7)
        plot(clustered)  
        myPrices = apply(myAssets, 2, cumsum)
        ts.plot(myPrices, main = "Selected Assets", 
          xlab = "Months starting 1978", ylab = "Price", col = 1:4)
        legend(0, 3, legend = colnames(myAssets), pch = "----", col = 1:4, cex = 1)
        
     ## assetsStats -
        if (require(fBasics)) assetsStats(myAssets)
        
     ## assetsSim -
        xmpPortfolio("\nNext: Fit a Skew Student-t > ")
        fit = assetsFit(myAssets)
        # Show Model Slot:
        fit@model
        # Simulate set with same properties:
        set.seed(1953)
        simAssets = assetsSim(n = 120, dim = 4, model = fit@model)
        head(simAssets)
        simPrices = apply(simAssets, 2, cumsum)
        ts.plot(simPrices, main = "Simulated Assets", 
          xlab = "Number of Months", ylab = "Simulated Price", col = 1:4)
        legend(0, 3, legend = colnames(simAssets), pch = "----", col = 1:4, cex = 1)
        
     ## plot -
        xmpPortfolio("\nNext: Show Simulated Assets Plots > ")
        if (require(fExtremes)) {
          # Show Scatterplot:
          par(mfrow = c(1, 1), cex = 0.7)
          plot(fit, which = c(TRUE, FALSE, FALSE, FALSE, FALSE))
          # Show  QQ and PP Plots:
          par(mfrow = c(2, 2), cex = 0.7)
          plot(fit, which = !c(TRUE, FALSE, FALSE, FALSE, FALSE))
        }
     ## End(Not run)

