ISSN 1842-4562
Member of DOAJ

Measuring Distance to Integrate High-Throughput Datasets with the Supplementary Variable



Integration, Multifactor analysis, Distance measure, Manifest variables, Supplementary variable


Biological systems cannot be understood through the analysis of a single type of dataset. Integrative analysis is therefore considered an essential tool for combining different types of datasets with biological factors to improve biological knowledge. Several methods and approaches have been developed, and updated from time to time, to integrate high-throughput datasets with biological factors or supplementary variables. Among these methods, multifactor analysis gives only a rough idea of gene-to-gene associations and of associations between genes and other biological factors. We therefore aimed to develop an approach, based on a distance measure, that identifies the association between genes (manifest variables) and body weight gain (supplementary variable) more precisely. We used a secondary dataset for this study. Multifactor analysis yields loadings for the manifest variables, and the loading plot gives only an approximate picture of the relationship between genes and body weight gain; we therefore used the distance formula to calculate the distance between the coordinates (loadings on the first and second principal components) of body weight gain and each gene. Taken together, we may conclude that the distance measure gives better insight into the strength of association of two sets of transcriptomics datasets with body weight gain than observing the loadings alone. The approach developed in this study is applicable not only in biology but also in any field of research where a researcher wants to integrate several manifest variables with one or more supplementary variables.
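The distance calculation described above can be illustrated with a minimal sketch. It assumes that the "distance formula" is the ordinary Euclidean distance in the plane of the first two principal components, and that a smaller distance is read as a closer association with the supplementary variable. The function names, gene names, and loading values are hypothetical and serve only as an illustration; they are not taken from the study's dataset.

```python
import math


def loading_distance(a, b):
    """Euclidean distance between two (PC1, PC2) loading coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def rank_by_distance(supplementary, manifest):
    """Return (name, distance) pairs sorted from nearest to farthest.

    supplementary: (PC1, PC2) loadings of the supplementary variable.
    manifest: dict mapping each manifest variable name to its (PC1, PC2) loadings.
    """
    distances = [
        (name, loading_distance(coords, supplementary))
        for name, coords in manifest.items()
    ]
    return sorted(distances, key=lambda pair: pair[1])


# Hypothetical loadings, invented for illustration only.
body_weight_gain = (0.60, 0.20)
gene_loadings = {
    "gene_A": (0.60, 0.50),
    "gene_B": (-0.30, 0.20),
    "gene_C": (0.90, 0.60),
}

ranking = rank_by_distance(body_weight_gain, gene_loadings)
for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

Ranking the genes by this distance turns the visual impression from a loading plot into a number that can be compared across genes, which is the motivation stated above for using a distance measure.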