Abstract: This paper extends the estimator of Hanif, Hamad and Shahbaz [1] for two-phase sampling. Its aim is to develop a regression-type estimator with two auxiliary variables for two-phase sampling when no information about the auxiliary variables is available at the population level. To avoid multicollinearity, the two auxiliary variables are assumed to have minimal correlation. The mean square error and bias of the proposed estimator in two-phase sampling are derived. The mean square error of the proposed estimator shows an improvement over other well-known estimators under the same case.
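As a rough illustration of the idea, a generic textbook-style regression estimator for two-phase sampling with two auxiliary variables can be sketched as follows. This is not the estimator proposed in the paper: the function name, the argument names, and the choice of least-squares coefficients fitted on the second-phase sample are all assumptions made for the example. The first-phase means stand in for the unknown population means of the auxiliary variables.

```python
def two_phase_regression_mean(y2, x2, z2, x1, z1):
    """Regression-type estimate of the mean of y under two-phase sampling.

    y2, x2, z2: values observed on the (smaller) second-phase sample.
    x1, z1:     auxiliary values observed on the (larger) first-phase sample.
    All names are hypothetical; this is a generic sketch.
    """
    def mean(v):
        return sum(v) / len(v)

    yb, xb, zb = mean(y2), mean(x2), mean(z2)

    # Centered sums of squares and cross-products from the second phase.
    sxx = sum((a - xb) ** 2 for a in x2)
    szz = sum((c - zb) ** 2 for c in z2)
    sxz = sum((a - xb) * (c - zb) for a, c in zip(x2, z2))
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x2, y2))
    szy = sum((c - zb) * (b - yb) for c, b in zip(z2, y2))

    # Solve the 2x2 normal equations for the two regression coefficients.
    det = sxx * szz - sxz ** 2
    b1 = (sxy * szz - szy * sxz) / det
    b2 = (szy * sxx - sxy * sxz) / det

    # Adjust the second-phase mean of y using the first-phase auxiliary means.
    return yb + b1 * (mean(x1) - xb) + b2 * (mean(z1) - zb)
```

When the two auxiliary variables are nearly uncorrelated, as the abstract assumes, `sxz` is close to zero and the two coefficients reduce to the simple slopes `sxy / sxx` and `szy / szz`.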
Funding: Supported by the National Natural Science Foundation of China (11901236), the Scientific Research Fund of Hunan Provincial Science and Technology Department (2019JJ50479), the Scientific Research Fund of Hunan Provincial Education Department (18B322), the Winning Bid Project of Hunan Province for the 4th National Economic Census ([2020]1), the Young Core Teacher Foundation of Hunan Province ([2020]43), and the Fundamental Research Fund of Xiangxi Autonomous Prefecture (2018SF5026).
Abstract: Cost-effective sampling design is a major concern in some experiments, especially when measuring the characteristic of interest is costly, painful or time-consuming. Ranked set sampling (RSS) was first proposed by McIntyre [1952. A method for unbiased selective sampling, using ranked sets. Australian Journal of Agricultural Research 3, 385-390] as an effective way to estimate the pasture mean. In the current paper, a modification of ranked set sampling called moving extremes ranked set sampling (MERSS) is considered for the best linear unbiased estimators (BLUEs) of the simple linear regression model. The BLUEs for this model under MERSS are derived. The BLUEs under MERSS are shown to be markedly more efficient for normal data than the BLUEs under simple random sampling.
Abstract: Auxiliary information in double sampling increases the precision of an estimate and addresses the bias caused by non-response in sample surveys. The question is whether the level of correlation between the auxiliary variable x and the study variable y helps to accomplish the objectives of double sampling. In this research, an empirical study was conducted to ascertain how the level of correlation between the auxiliary variable and the study variable affects the benefit gained from auxiliary variable(s) in double sampling. Based on the statistical criteria employed (minimum variance, coefficient of variation and relative efficiency), it was established that the higher the correlation between the study variable and the auxiliary variable(s), the better the estimator.
Abstract: In this paper, we develop mixture regression estimators of the finite population mean using multiple auxiliary variables and attributes in two-phase sampling, and investigate their finite-sample properties in the full, partial and no information cases. An empirical study using natural data compares the performance of the proposed estimators with existing estimators of the finite population mean that use auxiliary variables, attributes, or both. The mixture regression estimator in the full information case using multiple auxiliary variables and attributes is more efficient than the mean per unit, the regression estimator using one auxiliary variable or attribute, the regression estimator using multiple auxiliary variables or attributes, and the mixture regression estimators in both the partial and no information cases in two-phase sampling. The mixture regression estimator in the partial information case is more efficient than the mixture regression estimator in the no information case.
Abstract: In this paper, auxiliary information is used to construct an estimator of the finite population total using nonparametric regression under stratified random sampling. To achieve this, a model-based approach is adopted in which local polynomial regression is used to predict the nonsampled values of the survey variable y. The performance of the proposed estimator is compared with that of some design-based and model-based regression estimators. The simulation experiments show that the resulting estimator exhibits good properties. In general, the nonparametric regression estimators yield good confidence intervals, and the proposed estimator leads to relatively smaller values of RE than the other estimators.
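The model-based prediction step can be sketched as below, assuming a local linear fit (a degree-1 local polynomial) with a Gaussian kernel and a fixed bandwidth `h`. The function names, the kernel and the bandwidth handling are assumptions for illustration, not details taken from the paper. The estimated total in a stratum is the sum of the sampled y values plus the predicted values for the nonsampled units, and the stratum totals are then added up.

```python
import math


def local_linear_fit(x0, xs, ys, h):
    """Local linear estimate of E[y | x = x0] with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    # Intercept of the kernel-weighted least-squares line centered at x0.
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)


def predicted_total(xs_sampled, ys_sampled, xs_nonsampled, h):
    """Sampled y values plus predictions for the nonsampled units (one stratum)."""
    preds = [local_linear_fit(x0, xs_sampled, ys_sampled, h)
             for x0 in xs_nonsampled]
    return sum(ys_sampled) + sum(preds)


def stratified_total(strata, h):
    """Sum of stratum totals; each stratum is (xs_sampled, ys_sampled, xs_nonsampled)."""
    return sum(predicted_total(xs, ys, xn, h) for xs, ys, xn in strata)
```

A local linear fit reproduces an exactly linear relationship, which makes a convenient sanity check: with y = 1 + 2x on the sampled units, the predictions at nonsampled x values fall on the same line.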
Abstract: In this paper, a regression method of estimation is used to derive an estimator of the mean of the survey variable under simple random sampling without replacement in the presence of observational errors. Two covariates are used, and the case where observational errors occur in both the survey variable and the covariates is considered. Observational errors are included because survey data are often not free from errors that occur during observation. These errors can arise from over-reporting, under-reporting, memory failure by respondents or the use of imprecise data collection tools. The mean squared error (MSE) of the resulting estimator is derived to the first degree of approximation. The results of a simulation study show that the modified regression mean estimator under observational errors is more efficient than the mean per unit estimator and some other existing estimators. The proposed estimator can therefore be used to estimate a finite population mean while accounting for observational errors that may occur during a study.
Abstract: In this paper, we propose a class of mixture regression-cum-ratio estimators for estimating the population mean using information on multiple auxiliary variables and attributes simultaneously in single-phase sampling, and analyze the properties of the estimator. An empirical study on a simulated population was carried out to compare the performance of the proposed estimator with existing estimators of the finite population mean. The mixture regression-cum-ratio estimator was found to be more efficient than ratio and regression estimators using one auxiliary variable and attribute, ratio and regression estimators using multiple auxiliary variables and attributes, and regression-cum-ratio estimators using multiple auxiliary variables and attributes in single-phase sampling for a finite population.
Abstract: The proposed techniques investigate the strength of support vector regression (SVR) in cancer prognosis using imaging features. Cancer image features were extracted from patients and recorded as censored data. To employ censored data for prognosis, SVR methods need to be adapted to uncertain targets. The effectiveness of two principal breast features, tumor size and lymph node status, was demonstrated by combining sampling and feature selection methods. In sampling, breast data were stratified according to tumor size and lymph node status. Three types of feature selection were employed: no selection, individual feature selection, and feature subset forward selection. The prognosis results were evaluated in a comparative study using two performance metrics: concordance index (CI) and Brier score (BS). Cox regression was employed for comparison. The support vector regression method (SVCR) performs similarly to Cox regression in three feature selection methods and better than Cox regression in non-feature selection methods, as measured by CI and BS. Feature selection methods can improve the performance of Cox regression as measured by CI. Among all cross-validation results, stratified sampling by tumor size achieves the best regression results for both feature selection and non-feature selection methods. The SVCR results are better than those of Cox regression when evaluated with either CI or BS. The best CI value in the validation results is 0.6845. It corresponds to the best BS value, 0.2065; both were obtained with the combination of SVCR, individual feature selection, and stratified sampling by the number of positive lymph nodes. In addition, we observe that SVCR performs more consistently than Cox regression in all prognosis studies. The feature selection method does not have a significant impact on the metric values, especially on CI. We conclude that combining SVCR, feature selection, and sampling can improve cancer prognosis, but more significant features may further enhance cancer prognosis accuracy.
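For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes a Harrell-style concordance index for right-censored data, with the model output taken to be a predicted survival time, and a plain Brier score that ignores censoring weights. The function and argument names are hypothetical, and the abstract does not state which variants of the metrics were used.

```python
def concordance_index(times, events, predicted):
    """Fraction of comparable pairs whose predicted order matches the observed order.

    times:     observed follow-up times
    events:    1 if the event was observed, 0 if the record is censored
    predicted: predicted survival times (larger means longer survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when unit i had an observed event
            # strictly before the follow-up time of unit j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if predicted[i] < predicted[j]:
                    concordant += 1.0
                elif predicted[i] == predicted[j]:
                    concordant += 0.5  # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score(outcomes, probs):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for o, p in zip(outcomes, probs)) / len(outcomes)
```

A CI of 0.5 corresponds to random ordering and 1.0 to perfect ordering, while a lower Brier score is better.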
Funding: Research supported by AFOSC, USA, under Contract F49620-85-0008, and by NNSFC of China.
Abstract: This paper applies a grouping-adjusting procedure to the data from a median linear regression model and estimates the regression coefficients by the method of weighted least squares. This method simplifies computation and, at the same time, preserves the same asymptotic normal distribution for the estimator as that of the ordinary minimum L_1-norm estimates.
Abstract: Assume that in the nonlinear regression model the independent variable sequence {x_i, i ≥ 1} is a known sequence of constant vectors. This article proposes a condition on {x_i} that can be tested and verified easily. The condition is essential for proving the consistency and asymptotic normality of the estimator.
Funding: Project supported by the National Natural Science Foundation of China.
Abstract: Let Y_i = M(X_i) + e_i, where M(x) = E(Y|X=x) is an unknown real function on B (⊂ R), {(X_i, Y_i)} is a stationary and m(n)-dependent sample from (X, Y), and the residuals {e_i} are independent of {X_i} and have an unknown common density f(x). In [2], a nonparametric estimate f_n(x) of f(x) was proposed on the basis of the residual estimates. In this paper, we further obtain the asymptotic normality and the law of the iterated logarithm of f_n(x) under some suitable conditions. These results, together with those in [2], complete the asymptotic theory for the residual density estimate in nonparametric regression under an m(n)-dependent sample.
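The general construction of a residual-based density estimate can be sketched as follows, assuming a Nadaraya-Watson estimate of M and a Gaussian kernel density estimate built from the estimated residuals. The exact form of f_n(x) in [2] is not given in the abstract, so the kernels, the bandwidths `h_reg` and `h_den`, and the function names are assumptions for illustration only.

```python
import math


def gauss(u):
    """Standard normal density, used as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_regression(x0, xs, ys, h):
    """Nadaraya-Watson estimate of M(x0) = E(Y | X = x0)."""
    w = [gauss((x0 - x) / h) for x in xs]
    return sum(wi * y for wi, y in zip(w, ys)) / sum(w)


def residual_density(e0, xs, ys, h_reg, h_den):
    """Kernel density estimate at e0 built from the estimated residuals."""
    # Estimated residuals: observed Y_i minus the fitted regression value.
    residuals = [y - nw_regression(x, xs, ys, h_reg) for x, y in zip(xs, ys)]
    n = len(residuals)
    return sum(gauss((e0 - r) / h_den) for r in residuals) / (n * h_den)
```

This sketch says nothing about the dependence structure of the sample; the m(n)-dependence in the abstract affects the asymptotic theory, not how the estimate is computed.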