Title: | Multiple UniDimensional unFOLDing |
---|---|
Description: | Nonparametric unfolding item response theory (IRT) model for dichotomous data (see W.H. Van Schuur (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists, and W.J. Post (1992). Nonparametric Unfolding Models: A Latent Structure Approach). The package implements MUDFOLD (Multiple UniDimensional unFOLDing), an iterative item selection algorithm that constructs unfolding scales from dichotomous preferential-choice data without explicitly assuming a parametric form of the item response functions. Scale diagnostics from Post (1992) and estimates for the person locations proposed by Johnson (2006) and Van Schuur (1984) are also available. This model can be seen as the unfolding variant of Mokken's (1971) scaling method. |
Authors: | Spyros Balafas [aut, cre], Wim Krijnen [aut], Wendy Post [ctb], Ernst Wit [aut] |
Maintainer: | Spyros Balafas <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.21 |
Built: | 2025-02-22 03:39:39 UTC |
Source: | https://github.com/cran/mudfold |
This package can be used to find unfolding structures among selected items in tests or questionnaires. Such structures represent the underlying ordering of those items on a latent scale. The main function of this package is called mudfold and fits Van Schuur's scaling method to binary valued preference items. The method is called Multiple UniDimensional unFOLDing (MUDFOLD) and is an item selection algorithm belonging to the class of Nonparametric Item Response Theory (IRT) models.
MUDFOLD is a nonparametric probabilistic model for unidimensional unfolding. It was originally developed by W. Van Schuur (1984) and further extended following ideas by W.J. Post (1992), who derived testable properties for the model fit. The method can be used to analyse the categorical (binary) responses of individuals to a set of questionnaire items presumably generated from a nonmonotonic (unimodal) Item Response Function (IRF). The package incorporates the main function mudfold, which estimates the MUDFOLD scale from binary valued unfolding items. The output of the main function is a list of S3 class "mdf", for which the generic functions print(), summary() and plot() are available. The package also provides the function mudfoldsim, which simulates unfolding scales using an IRF with flexible parametrization.
The data must be given in a binary matrix or data.frame with the n respondents in the rows and the N items in the columns. Row i of the data corresponds to the selections of the i-th individual on the set of N items. Missing values must be coded as NA, and the user can choose whether to apply list-wise deletion or to impute the missing values using logistic regression multiple imputation by chained equations (logreg MICE).
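The expected input layout can be sketched in base R as follows. The item names and values are made up for illustration, and `complete.cases()` is only a stand-in for list-wise deletion; it is not necessarily how the package handles missing values internally.

```r
# Toy data: 5 respondents (rows) by 4 items (columns); names are illustrative.
dat <- data.frame(
  item1 = c(1, 0, 1, NA, 0),
  item2 = c(1, 1, 0, 0, 0),
  item3 = c(0, 1, 1, 1, NA),
  item4 = c(0, 0, 1, 1, 1)
)

# Check that every entry is 0, 1 or NA.
is_binary <- all(unlist(dat) %in% c(0, 1, NA))

# List-wise deletion: keep only respondents without missing responses.
dat_complete <- dat[complete.cases(dat), ]
```

Respondents 4 and 5 each have one missing response, so only the first three rows survive list-wise deletion in this toy example.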
The ultimate goal of MUDFOLD is to determine a unidimensional rank order of a (sub)set of items such that they constitute an appropriate scale for measuring a common latent trait of the respondents. The item order is estimated through a heuristic item selection algorithm, which iteratively tests the fit of each item to the scale using scalability coefficients.
MUDFOLD's H coefficients of scalability are based on Loevinger's coefficient of homogeneity. In MUDFOLD, the H coefficient is a scalability measure that enters several criteria of the item selection algorithm. It can be calculated for triples of items, individual items, and the total scale. Diagnostic statistics are used to assess how well the unfolding scale conforms to the assumptions of unfolding response processes. Uncertainty estimates for the scalability measures and the diagnostic statistics, at both the item and the scale level, are obtained with the nonparametric ordinary bootstrap. A bootstrap estimate of the unfolding scale is also available.
After an unfolding scale is obtained, it can be used to estimate person locations. Two estimators are available in the mudfold package: one proposed by Van Schuur and one derived by Johnson.
For assessing the unfolding properties of the obtained scale based on the MUDFOLD assumptions, scale diagnostics such as the ISO and MAX statistics, as well as diagnostic matrices for visual inspection of the conditional independence and moving maxima assumptions, are available to the user.
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur. (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
M.S. Johnson. (2006). Nonparametric Estimation of Item and Respondent Locations from Unfolding-type Items. Psychometrika.
## Not run: 
# Install the R package mudfold
install.packages("mudfold")
# Load the R package mudfold
library(mudfold)
## End(Not run)
D. Andrich's (1988) scale designed to measure the attitude from a sample of students towards capital punishment. The data set contains the dichotomous responses of 54 students on 8 statements concerning capital punishment.
data(ANDRICH)
A data frame with 54 observations on the following 8 variables.
HIDEOUS
a column vector containing the binary responses on the statement:
"Capital punishment is one of the most hideous practices of our time"
LIFESACRED
a column vector containing the binary responses on the statement:
"The state cannot teach the sacredness of human life by destroying it"
INEFFECTIV
a column vector containing the binary responses on the statement:
"Capital punishment is not an effective deterrent to crime"
DONTBELIEV
a column vector containing the binary responses on the statement:
"I do not believe in capital punishment but i am not sure it is not necessary"
WISHNOTNEC
a column vector containing the binary responses on the statement:
"I think capital punishment is necessary but i wish it were not"
MUSTHAVEIT
a column vector containing the binary responses on the statement:
"Until we find a more civilized way to prevent crime we must have capital punishment"
DETERRENT
a column vector containing the binary responses on the statement:
"Capital punishment is justified because it does act as a deterrent to crime"
CRIMDESERV
a column vector containing the binary responses on the statement:
"Capital punishment gives the criminal what he deserves"
The persons who responded to the statements for the analysis were 54 graduate students taking an introductory course in educational measurement and statistics. They responded simply by agreeing (1) or disagreeing (0) with each statement, with no restrictions placed on how many statements should receive an Agree response.
D. Andrich. (1988). The Application of an Unfolding Model of the PIRT Type to the Measurement of Attitude. Applied psychological measurement 12.1: 33-51.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post. and T.AB. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## Not run: 
data(ANDRICH)
str(ANDRICH)
## End(Not run)
This function calculates the MUDFOLD statistics for data whose columns are assumed to be ranked in the order in which they are provided. The result of as.mudfold is an object of S3 class "mdf", for which the generic functions print, summary, and plot are available.
as.mudfold(data,estimation="rank")
data |
: A binary |
estimation |
: This argument controls the nonparametric estimation method for person locations. By default this argument equals to |
The function as.mudfold calculates MUDFOLD statistics for a given scale. Descriptive statistics, observed errors, expected errors, scalability coefficients, and iso statistic values are calculated for the items and the scale. The user can obtain a summary table for the given scale with the summary function, which is designed for "mdf" class objects.
The function as.mudfold
returns a list with the same components as the mudfold
function except the information that concerns the item selection algorithm. The list contains the following:
CALL |
A list where its components provide information for the function call. |
CHECK |
A list where its components provide information from the data checking step. |
DESCRIPTIVES |
A list with descriptive statistics for the |
MUDFOLD_INFO |
A list with three main components. The first component is called |
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur.(1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
M.S. Johnson. (2006). Nonparametric Estimation of Item and Respondent Locations from Unfolding-type Items. Psychometrika.
## Not run: 
## pick a number for setting the seed
n.seed <- 11
## Simulate an unfolding scale
simulation <- mudfoldsim(N=6, n=100, seed=n.seed)
## get the data
dat <- simulation$dat
## true order
true_order <- simulation$true_ord
## check MUDFOLD statistics for the random simulated rank order
mud_stats1 <- as.mudfold(dat)
# get the summary
summary(mud_stats1)
## check MUDFOLD statistics for the true item rank order
mud_stats2 <- as.mudfold(dat[,true_order])
# get the summary for the true item rank order
summary(mud_stats2)
## End(Not run)
This function calculates the conditional adjacency matrix (CAM) from a binary valued matrix with the responses of n individuals to N items (Post, 1992). The (i,j)th element of the CAM contains the conditional frequency with which a subject from the sample chooses the row item i given that the column item j is chosen. This probability is estimated from the data by dividing the joint relative frequency of choosing both items i and j by the relative frequency of choosing item j. Different orderings of the columns of the input matrix result in different CAM matrices.
CAM(x)
x |
: A binary matrix or data frame containing the responses of |
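The CAM is calculated from the following equation, for items $i \neq j$:

$$\mathrm{CAM}_{ij} = \frac{\hat{P}(X_i = 1,\, X_j = 1)}{\hat{P}(X_j = 1)},$$

where $X_i$ denotes the binary response to item $i$ and $\hat{P}$ denotes a relative frequency in the sample.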
A matrix of class 'cam.mdf', with ncol(x)
rows and ncol(x)
columns with missing values on the diagonal elements when x
is a matrix or data frame. When x
is an object of class "mdf"
the dimension of the output matrix depends on the length of the obtained MUDFOLD scale. Rows and columns of the resulting CAM are ordered in the order of the columns of x
when x
is a matrix. When x
is a fitted MUDFOLD object then the rows and columns of CAM are ordered in the obtained MUDFOLD order.
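The verbal definition of the CAM can be sketched in base R as below. The helper `cam_sketch` and the toy matrix are hypothetical names used only for illustration; the sketch assumes complete binary data and is not the package's `CAM` implementation (for instance, it does not set the 'cam.mdf' class).

```r
# Sketch of a conditional adjacency matrix for complete binary data.
cam_sketch <- function(x) {
  x <- as.matrix(x)
  joint <- crossprod(x) / nrow(x)    # joint[i, j]: relative frequency of choosing both i and j
  marg  <- colMeans(x)               # relative frequency of choosing each item
  cam   <- sweep(joint, 2, marg, "/")  # divide column j by the frequency of item j
  diag(cam) <- NA                    # the diagonal is left undefined
  cam
}

# Toy responses of 4 respondents to 3 items.
x <- cbind(A = c(1, 1, 0, 0),
           B = c(1, 0, 1, 0),
           C = c(0, 1, 1, 1))
cam <- cam_sketch(x)
```

In this toy example, one of the three respondents who chose item C also chose item A, so the entry in row A, column C equals 1/3. An item that nobody chose would produce a division by zero in its column.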
Spyros E. Balafas ([email protected])
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post. and T.AB. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## load the ANDRICH data
data("ANDRICH")
## Calculate the CAM for the ANDRICH scale
CAM_andrch <- CAM(ANDRICH)
## Extract CAM from a fitted mudfold object
mudf_andrich <- mudfold(ANDRICH)
CAM_andrch_mudfold <- CAM(mudf_andrich)
coef
method for S3 class "mdf"
objects.
This function extracts person and/or item parameters obtained after fitting MUDFOLD to binary preferential-choice data.
## S3 method for class 'mdf'
coef(object, type, ...)
object |
: A fitted object of class |
type |
: Argument that controls the type of parameters to be returned. If |
... |
: not in use at the current version of the package. |
A vector when type="persons" or type="items". A list when type="all".
Spyros E. Balafas ([email protected])
W.H. Van Schuur.(1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post. and T.AB. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## load the ANDRICH data
data("ANDRICH")
## fit a MUDFOLD scale to the ANDRICH data
mudf_andrich <- mudfold(ANDRICH)
## obtain the parameters from the fitted object
coef(mudf_andrich)
This function returns diagnostics for a fitted MUDFOLD scale. Specifically, it returns the iso statistic (see ISO), the max statistic (see MAX), the matrix with stars at the maximum of each row, and a test for conditional independence.
diagnostics(x, boot, nlambda, lambda.crit, type, k, which, plot)
x |
: A fitted object of class |
boot |
: logical argument that controls if bootstrap confidence intervals and summary for the H coefficients and the ISO and MAX statistics will be returned. If |
nlambda |
: The number of regularization parameters to be used in |
lambda.crit |
: String that specifies the criterion to be used by cross-validation for choosing the optimal regularization parameter. Available options are "class" (default), "deviance", "auc", "mse", "mae". See the argument |
type |
: The type of bootstrap confidence intervals to be calculated if the argument |
k |
: The dimension of the basis in the thin plate regression spline that is used when testing for IRF unimodality. The default value of |
which |
: Which diagnostic should be returned by the function. Available options are |
plot |
: Logical. Should plots be returned for the diagnostics that can be plotted? Default value is |
A list of length six, where each component is a diagnostic, when which="all". A list of length equal to length(which) when which != "all".
Spyros E. Balafas ([email protected])
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post. and T.AB. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## load the ANDRICH data
data("ANDRICH")
## Fit a MUDFOLD scale to the ANDRICH data
mudf_andrich <- mudfold(ANDRICH)
## Get the diagnostics
diagnostics(mudf_andrich, which = "UM")
European party activists' preferences for two political parties in the European parliament in 1980. A sample of 1786 individuals was asked to pick 2 out of 6 political parties from the European parliament.
data("EURPAR2")
A data frame with 1786 observations (responses) on the following 6 binary valued items.
communists
Communistic political party;
socdemocr
Social Democratic political party;
demprogres
Progressive Democratic political party;
liberals
Liberal Democratic political party;
christians
Christian Democratic political party;
conservat
Conservative political party;
The data were first studied by Van Schuur (1984) and later by W. J. Post (1992).
W.H. Van Schuur.(1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
data(EURPAR2)
str(EURPAR2)
This function calculates the iso statistic based on the conditional adjacency matrix (CAM) of a given scale. The iso statistic was introduced to quantify whether the rows of the CAM show a weakly unimodal pattern (Post, 1992). The iso statistic (ISO) is a measure of the degree of unimodality violation in the rows of the CAM. ISO can be obtained for each item, and the sum of the item values gives the total ISO for the scale.
To obtain the ISO value for an item j, one first locates the maximum in row j of the CAM. The ISO then measures deviations from unimodality to the left and to the right of that maximum. The function takes as input objects of class "cam.mdf" obtained from the function CAM, or objects of class "mdf" obtained from the function mudfold.
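The idea of measuring deviations from unimodality around a row maximum can be illustrated with the simplified base R sketch below. It is an assumption-laden illustration, not the formula of Post (1992) and not the package's `ISO` function: the hypothetical helper `unimodality_violation` drops the undefined diagonal entry and only sums violations between adjacent entries.

```r
# Simplified illustration: total size of adjacent-step violations of a
# single-peaked pattern in one CAM row (assumed definition, see lead-in).
unimodality_violation <- function(row) {
  row <- row[!is.na(row)]        # drop the undefined diagonal entry
  m   <- which.max(row)          # position of the row maximum
  d   <- diff(row)               # step k goes from entry k to entry k + 1
  idx <- seq_along(d)
  left  <- d[idx < m]            # steps leading up to the maximum: should not decrease
  right <- d[idx >= m]           # steps after the maximum: should not increase
  sum(pmax(0, -left)) + sum(pmax(0, right))
}

v_bad  <- unimodality_violation(c(NA, 0.2, 0.5, 0.4, 0.6, 0.1))  # dip before the peak
v_good <- unimodality_violation(c(0.1, 0.4, NA, 0.3, 0.2))       # single-peaked row
```

A single-peaked row yields zero, while the first row is penalised by the size of the drop from 0.5 to 0.4 that occurs before the maximum is reached.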
ISO(x, type)
x |
: A matrix of class 'cam.mdf' obtained from the function |
type |
: This argument controls the type of the statistic that is returned. If |
A vector with the ISO statistic for each item. The sum of the individual ISO statistics over the items yields the ISO statistic for the whole scale.
Spyros E. Balafas ([email protected])
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
## load the ANDRICH data
data("ANDRICH")
## Calculate the CAM for the ANDRICH scale
CAM_andrch <- CAM(ANDRICH)
## Use the CAM to calculate the ISO statistic
## for the ANDRICH scale
ISO(CAM_andrch)
The De Jong-Gierveld loneliness scale consists of eleven ordinal items. Five of these items are positively formulated and six are negatively formulated. Each of the items has three possible response categories.
data(Loneliness)
A data frame with 3987 observations on the following 11 variables.
A
a column vector containing the ordinal responses on the statement:
"There is always someone I can talk to about my day to day problems (+)"
B
a column vector containing the ordinal responses on the statement:
"I miss having a really close friend (-)"
C
a column vector containing the ordinal responses on the statement:
"I experience a general sense of emptiness (-)"
D
a column vector containing the ordinal responses on the statement:
"There are plenty of people I can lean on in case of trouble (+)"
E
a column vector containing the ordinal responses on the statement:
"I miss the pleasure of company of others (-)"
F
a column vector containing the ordinal responses on the statement:
"I find my circle of friends and acquaintances too limited (-)"
G
a column vector containing the ordinal responses on the statement:
"There are many people that I can count on completely (+)"
H
a column vector containing the ordinal responses on the statement:
"There are enough people that I feel close to (+)"
I
a column vector containing the ordinal responses on the statement:
"I miss having people around (-)"
J
a column vector containing the ordinal responses on the statement:
"Often I feel rejected (-)"
K
a column vector containing the ordinal responses on the statement:
"I can call on my friends whenever I need them (+)"
Each item in the scale has three possible levels of response, i.e., "no" (=1), "more or less" (=2), "yes" (=3). The data is a subset of the NESTOR study (see C. P. Knipscheer, J. d. Jong-Gierveld, T. G. van Tilburg, P. A. Dykstra, et al. (1995))
G. J. De Jong and T. van Tilburg (1999). Manual of the loneliness scale. Amsterdam: VU University Amsterdam.
C. P. Knipscheer, J. d. Jong-Gierveld, T. G. van Tilburg, P. A. Dykstra, et al. (1995). Living arrangements and social networks of older adults. Amsterdam: VU University Amsterdam.
J. de Jong-Gierveld and F. Kamphuis (1985). The development of a Rasch-type loneliness scale. Applied Psychological Measurement, 9(3):289-299.
W. J. Post, M. A. van Duijn, and B. van Baarsen (2001). Single-peaked or monotone tracelines? On the choice of an IRT model for scaling data. In Essays on item response theory, pages 391-414. Springer.
## Not run: 
data(Loneliness)
str(Loneliness)
## End(Not run)
This function calculates the max statistic based on the conditional adjacency matrix (CAM) of a given scale. This statistic quantifies violations of the moving maxima property for the item response functions (Post, 1992) and can be calculated for each item and for the whole scale. For each row of the CAM, the max statistic is calculated using both a top-down and a bottom-up method.
Both methods yield the same max statistic value for the scale; however, the number of items with a non-zero max statistic may differ. In this case, the method that yields the smaller number of items with zero max statistic will be preferred.
MAX(X, type)
X |
: A matrix of class 'cam.mdf' obtained from the function |
type |
: This argument controls the type of the statistic that is returned. If |
To obtain the value of the max statistic for each item in a scale with N items in total, we first need to locate the position of the maximum in each row of the CAM. Then the max statistic for the item i is calculated using a top-down method according to which,
and a bottom-up method according to which,
A vector with the MAX statistic for each item. The sum of the individual MAX statistics for each of the items yields the MAX statistic for the whole scale.
Spyros E. Balafas ([email protected])
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
## load the ANDRICH data
data("ANDRICH")
## Calculate the CAM for the ANDRICH scale
CAM_andrch <- CAM(ANDRICH)
## Use the CAM to calculate the MAX statistic
## for each item in the ANDRICH scale
MAX(CAM_andrch)
## and the whole scale
MAX(CAM_andrch, type="scale")
This function fits a unidimensional unfolding scale to the responses of individuals on a set of categorically scored attitudinal items. Fitting is done through Van Schuur's scaling algorithm, which determines whether a set of items are indicators of the same unobserved latent construct, such as preference, attitude, or ideology. Central to this model are the scalability coefficients, which are used to assess the fit of the scale and of the items to the data.
Diagnostic statistics that are used to test the model assumptions are borrowed from the nonparametric unfolding model of Post (1992). Uncertainty estimates for the scalability coefficients and the diagnostic statistics, both for the scale and for the individual items, are obtained using the nonparametric ordinary bootstrap. A bootstrap estimate of the scale is obtained as the most frequently observed scale across the bootstrap iterations.
mudfold( data, estimation, lambda1, lambda2, start.scale, nboot, missings, nmice, seed, mincor, ...)
data |
: A binary matrix or data frame containing the responses of |
estimation |
: This argument controls the nonparametric estimation method for person locations. By default this argument equals to |
lambda1 |
: User specified numerical value that is used as a lower boundary for the scalability criterion of the first step of the item selection algorithm, and in the item scalability criterion at the end of the scale expansion. Default value is |
lambda2 |
: User specified numerical value that controls explicitly the first scalability criterion of the scale expansion. In the default settings |
start.scale |
: An ordered character vector with item names from |
nboot |
: Argument that controls the number of bootstrap iterations. If |
missings |
: Argument that controls how the missing values should be treated. If |
nmice |
: Argument that controls the number of mice imputations (This argument is used only when |
seed |
: Argument that is used for reproducibility of bootstrap results. |
mincor |
: This can be scalar, numeric vector (of size |
... |
: Any additional arguments that are passed to the |
This function incorporates a two-step algorithm that determines an unfolding scale from observed binary data. In the first step of the algorithm, the best minimal scale consisting of three items is determined. In the second step, the minimal scale from the first step is expanded iteratively by adding the best fitting item in each iteration. The first step of the algorithm can be skipped with the argument start.scale, which can be used to manually set an item rank order that will be extended in the second step of the item selection algorithm. The resulting scale consists of the best m fitting items based on scalability criteria (where m ≤ ncol(data)).
In the mudfold function, the user can specify a value that is used as a lower bound in the scalability criteria of the MUDFOLD algorithm. By default, the lower bound for the scalability coefficients is lambda1=0.3. The user can choose a second value that is used as a lower bound only for the second step of the algorithm (by default, lambda2=0). This second parameter is mostly used to relax the first scalability criterion of the second step. Generally, large positive values of lambda1 and lambda2 lead to very strict criteria, while negative values relax these criteria.
Uncertainty estimates of the MUDFOLD statistics can be calculated with the argument nboot
of the mudfold
function. When nboot
is an integer then nboot
bootstrap iterations will run to obtain the variance parameter for each MUDFOLD statistic. Missing values are either list-wise deleted or they are imputed nmice
times when nboot=NULL
and missings="impute"
. If the argument nboot
is not NULL
and missings="impute"
then each resampled dataset in bootstrap iterations is imputed once before we fit a MUDFOLD scale.
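The nonparametric ordinary bootstrap mentioned above can be sketched in base R as follows. This is a generic illustration under stated assumptions: the simulated data, the object names, and the stand-in statistic (the proportion of respondents choosing each item) are hypothetical, whereas the package bootstraps the MUDFOLD statistics themselves.

```r
set.seed(1)

# Toy binary data: n respondents by N items (illustrative only).
n <- 200
N <- 5
dat <- matrix(rbinom(n * N, size = 1, prob = 0.4), nrow = n,
              dimnames = list(NULL, paste0("item", seq_len(N))))

# Stand-in statistic: proportion of respondents choosing each item.
stat <- function(d) colMeans(d)

# Ordinary bootstrap: resample respondents (rows) with replacement.
B <- 500
boot <- replicate(B, stat(dat[sample.int(n, replace = TRUE), , drop = FALSE]))

# Bootstrap standard errors and percentile intervals, one per item.
boot_se <- apply(boot, 1, sd)
boot_ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))
```

Each column of `boot` holds the statistic recomputed on one resampled data set, so row-wise summaries give the uncertainty estimate for each item.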
Moreover, the user can choose between two nonparametric estimation methods to obtain person parameters, which are estimated using the item ranks from the MUDFOLD algorithm. The default setting (i.e., estimation="rank") uses an estimator proposed by Van Schuur (1984) based on item ranks. Alternatively, an estimation method described by Johnson (2006), which uses item quantiles for estimating person parameters, can be used by setting estimation="quantile".
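The options described in this section can be combined in a single call, as in the sketch below. It uses only arguments and values named on this page (lambda1, lambda2, estimation, and the type argument of coef); the object names are illustrative, and the call is guarded so that it only runs when the mudfold package is installed.

```r
if (requireNamespace("mudfold", quietly = TRUE)) {
  library(mudfold)
  data(ANDRICH)

  # Default lower bounds, written out explicitly, with the
  # quantile-based estimator for the person parameters.
  fit_q <- mudfold(ANDRICH, lambda1 = 0.3, lambda2 = 0,
                   estimation = "quantile")

  # Extract person and item parameters from the fitted object.
  person_est <- coef(fit_q, type = "persons")
  item_est   <- coef(fit_q, type = "items")
}
```

Changing lambda1 or lambda2 in this call is a way to explore how strict or relaxed scalability criteria affect which items enter the scale.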
The function mudfold
returns a list of class "mdf"
with the following components:
CALL |
A list where its components provide information for the function call. |
CHECK |
A list where its components provide information from the data checking step. |
DESCRIPTIVES |
A list with descriptive statistics for the |
MUDFOLD_INFO |
A list with three main components. The first component is called |
If bootstrap is applied, then, an additional component is included in the output. This component is called BOOTSTRAP
and is a list that contains the output of nboot
bootstrap iterations.
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur.(1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.
M.S. Johnson. (2006). Nonparametric Estimation of Item and Respondent Locations from Unfolding-type Items. Psychometrika.
## Not run: 
#####################################
#### MUDFOLD method on real data ####
#####################################

###########################################################################
###### MUDFOLD method on ANDRICH data (see Post and Snijders pp.147) ######
###########################################################################
data(ANDRICH)
## fit MUDFOLD on ANDRICH data ##
fit_andr <- mudfold(ANDRICH)
## generic functions for the S3 class .mdf object fit ##
## print.mdf
print(fit_andr)
## summary.mdf
summary(fit_andr)
## plot.mdf
plot(fit_andr)

## fit MUDFOLD on ANDRICH data with bootstrap ##
fit_andr_boot <- mudfold(ANDRICH, nboot=100)
## generic functions for the S3 class .mdf object fit ##
## print.mdf
print(fit_andr_boot)
## summary.mdf
summary(fit_andr_boot, boot=TRUE)
## plot.mdf
plot(fit_andr_boot)

############################################
###### MUDFOLD method on EURPAR2 data ######
############################################
data("EURPAR2")
## fit MUDFOLD on EURPAR2 data ##
fit_eurp <- mudfold(EURPAR2)
## print
print(fit_eurp)
## summary
summary(fit_eurp)
## plot
plot(fit_eurp)

###########################################
###### MUDFOLD method on Plato7 data ######
###########################################
data("Plato7")
## transform to binary data
## using as threshold the mean
## per row of Plato7
dat_plato <- pick(Plato7)
## fit MUDFOLD on Plato7 data ##
fit_plato <- mudfold(dat_plato, nboot=1000)
## print
print(fit_plato)
## summary
summary(fit_plato, boot=TRUE)
## plot
plot(fit_plato, plot.type="scale")
plot(fit_plato, plot.type="IRF")
plot(fit_plato, plot.type="persons")

##########################################
#### MUDFOLD method on simulated data ####
##########################################

### Data with the responses of
### n=3000 on N=20 items
simulation1 <- mudfoldsim(N=20, n=3000, gamma1=2, gamma2=-10, zeros=FALSE, seed=1)
dat_sim1 <- simulation1$dat
## fit MUDFOLD on simulated data ##
fit.sim1 <- mudfold(dat_sim1)
# print
fit.sim1
# summary
summary(fit.sim1)
# plot
plot(fit.sim1)

### Data with the responses of
### n=3000 on N=26 items
simulation2 <- mudfoldsim(N=26, n=3000, gamma1=2, gamma2=-10, zeros=FALSE, seed=1)
dat_sim2 <- simulation2$dat
## fit MUDFOLD on simulated data ##
fit.sim2 <- mudfold(dat_sim2)
# print
fit.sim2
# summary
summary(fit.sim2)
# plot
plot(fit.sim2, plot.type="scale")
plot(fit.sim2, plot.type="IRF")
plot(fit.sim2, plot.type="persons")
## End(Not run)
The mudfoldsim function simulates unfolding data from a unimodal parametric function with a flexible setup. The user can control the number of respondents, the number of items, and the fixed parameters of the Item Response Function (IRF) under which the responses are generated. The user can also allow (or disallow) individuals who endorse no items.
mudfoldsim(N, n, gamma1=5, gamma2=-10, zeros=FALSE, parameters="normal", seed=NULL)
N | This argument specifies the number of items (stimuli). |
n | Argument which allows the user to specify the number of respondents in the simulated data. |
gamma1 | Parameter which is used in the IRF under which the data is generated. Default value is 5. |
gamma2 | Parameter which is used in the IRF under which the data is generated. Default value is -10. |
zeros | Logical argument. If |
parameters | A character string that controls the distribution of the person parameters. If |
seed | An integer to be used in the |
To simulate the response of an individual to an item, each with its own scale parameter, the function uses a unimodal parametric IRF governed by gamma1 and gamma2. The person and item parameters can both be sampled from a standard normal distribution, or the person parameters can be sampled uniformly within the range of the item parameters; the parameters argument controls this choice.
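The simulation step described above can be sketched in base R. This is only an illustration under stated assumptions: the helper name sim_unfold, its uniform_persons switch, and the logistic form plogis(gamma1 + gamma2 * (theta - beta)^2) are hypothetical choices for a unimodal IRF and are not taken from the package source.

```r
# Hypothetical helper, NOT the package's mudfoldsim(): simulate binary
# unfolding data from an ASSUMED unimodal IRF of logistic form.
sim_unfold <- function(N, n, gamma1 = 5, gamma2 = -10,
                       uniform_persons = FALSE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  beta <- sort(rnorm(N))                        # item locations, standard normal
  theta <- if (uniform_persons) {
    runif(n, min = min(beta), max = max(beta))  # persons within the item range
  } else {
    rnorm(n)                                    # persons, standard normal
  }
  # n x N matrix of squared person-item distances
  d2 <- outer(theta, beta, function(t, b) (t - b)^2)
  # assumed IRF: with gamma2 < 0 the probability peaks where theta equals beta
  probs <- plogis(gamma1 + gamma2 * d2)
  dat <- matrix(rbinom(n * N, size = 1, prob = probs), nrow = n, ncol = N)
  colnames(dat) <- paste0("item", seq_len(N))
  list(dat = as.data.frame(dat), probs = probs, beta = beta, theta = theta)
}

sim <- sim_unfold(N = 6, n = 100, seed = 1)
dim(sim$dat)
```

Under this assumed form, gamma1 sets the height of the peak (the maximum probability is plogis(gamma1)) and gamma2 controls how quickly the probability falls off with distance.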
a list with 11 components.
obs_ord | A character vector with the items in the simulated order. |
true_ord | A character vector with the items in the true order in which they constitute an unfolding scale. |
items | An integer corresponding to the number of the simulated items. |
sample | An integer corresponding to the number of the simulated respondents. |
gamma1 | A value that corresponds to the parameter |
gamma2 | A value that corresponds to the parameter |
seed | An integer that corresponds to the seed number that is going to be used in the |
dat | Data frame containing the binary responses of |
probs | A matrix containing the probabilities of positive response from |
item.patameters | The simulated item parameters that have been used for sampling the data. |
subject.parameters | The simulated subject parameters that have been used for sampling the data. |
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## Not run:
## Simulate 5 different scenarios
n.seed <- 10
sim1 <- mudfoldsim(N=6, n=100, gamma1=5, gamma2=-10, zeros=FALSE, seed=n.seed)
sim2 <- mudfoldsim(N=10, n=1000, gamma1=10, gamma2=-100, zeros=FALSE, seed=n.seed)
sim3 <- mudfoldsim(N=15, n=2000, gamma1=50, gamma2=-100, zeros=FALSE, seed=n.seed)
sim4 <- mudfoldsim(N=30, n=2000, gamma1=50, gamma2=-100, zeros=FALSE, seed=n.seed)
sim5 <- mudfoldsim(N=50, n=2000, gamma1=50, gamma2=-100, zeros=FALSE, seed=n.seed)

dat1 <- sim1$dat
dat2 <- sim2$dat
dat3 <- sim3$dat
dat4 <- sim4$dat
dat5 <- sim5$dat

fit1 <- mudfold(dat1)
fit1
fit2 <- mudfold(dat2)
fit2
fit3 <- mudfold(dat3)
fit3
fit4 <- mudfold(dat4)
fit4
fit5 <- mudfold(dat5)
fit5
## End(Not run)
The function pick can be used to transform quantitative or ordinal variables into binary form (i.e., 0, 1). When byItem=FALSE, the underlying idea is that the individual selects the items with the highest preference. This is done through user-provided cut-off values, or by assuming a pick k out of N response process, in which each continuous response vector takes a 1 at its k highest values. Dichotomization can be performed row-wise (default) or column-wise.
pick(data , k=NULL, cutoff=NULL, byItem=FALSE)
data | A matrix or data frame containing the continuous or discrete responses of |
k | An integer ( |
cutoff | The value(s) that will be used as thresholds. The length of this argument should be equal to 1 (the same threshold for all rows (or columns) of |
byItem | Logical argument. If byItem=TRUE, the dichotomization is performed column-wise. With the default byItem=FALSE, the function determines the ones row-wise. |
Binary transformation of continuous variables or of discrete variables with more than two levels. Two different methods are available for the transformation.
The first method uses the argument k in the pick function and assumes a pick k out of N response process. Such response processes occur in surveys and questionnaires in which respondents are asked to pick exactly their k most preferred items. The value of k is an integer between 1 and ncol(data). Given an integer k, the function "picks" the k highest values in each row of data (if byItem=FALSE). The k highest values in each row become 1 and the remaining ncol(data)-k elements are set to 0. If k=ncol(data), the resulting matrix consists only of 1's.
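The pick k out of N rule can be illustrated with a few lines of base R. The helper pick_k_rows is hypothetical and is not the package's pick function; in particular it breaks ties by position (ties.method = "first"), which is an assumption made here for determinism.

```r
# Hypothetical helper, NOT the package's pick(): set the k largest
# entries of each row to 1 and all other entries to 0.
pick_k_rows <- function(x, k) {
  x <- as.matrix(x)
  stopifnot(k >= 1, k <= ncol(x))
  out <- t(apply(x, 1, function(row) {
    # rank 1 = largest value; ties are broken by position (an assumption)
    as.integer(rank(-row, ties.method = "first") <= k)
  }))
  dimnames(out) <- dimnames(x)
  out
}

x <- rbind(c(5, 1, 3, 2),
           c(0.2, 0.9, 0.4, 0.7))
pick_k_rows(x, k = 2)
# row 1 becomes 1 0 1 0, row 2 becomes 0 1 0 1
```

Every row of the result sums to exactly k, which is the defining property of this response process.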
The second method binarizes the data by thresholding. For this method, the user provides the threshold(s) through the cutoff argument of the pick function (default cutoff=NULL). If a single value is given in cutoff, that value is used as the threshold in each row of data (if byItem=FALSE), so that any value in the row greater than or equal to cutoff becomes 1 and 0 otherwise. Alternatively, the user can supply row-specific (or column-specific) cut-off values as a vector with one cut-off per row or column. In this case, each value is compared with the cut-off of its own row (or column): it becomes 1 if it reaches the cut-off and 0 otherwise.
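A minimal base R sketch of the thresholding rule follows. The helper binarize_cutoff is hypothetical (it is not the package's pick function), and it assumes that the "greater than or equal to" comparison applies to vector cut-offs as well as to a single cut-off.

```r
# Hypothetical helper, NOT the package's pick(): threshold a matrix at a
# single cut-off or at one cut-off per row (byItem = FALSE) or column.
binarize_cutoff <- function(x, cutoff, byItem = FALSE) {
  x <- as.matrix(x)
  n_units <- if (byItem) ncol(x) else nrow(x)
  stopifnot(length(cutoff) %in% c(1, n_units))
  cutoff <- rep(cutoff, length.out = n_units)
  margin <- if (byItem) 2 else 1
  # sweep() compares every row (or column) with its own cut-off
  out <- sweep(x, margin, cutoff, FUN = ">=")
  storage.mode(out) <- "integer"
  out
}

x <- rbind(c(1, 2, 3),
           c(3, 1, 2))
binarize_cutoff(x, cutoff = 2)                          # same threshold everywhere
binarize_cutoff(x, cutoff = c(3, 1))                    # one threshold per row
binarize_cutoff(x, cutoff = c(3, 1, 2), byItem = TRUE)  # one threshold per column
```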
The two methods cannot be used simultaneously: only one of the arguments k and cutoff can differ from NULL at a time. If both arguments are NULL (default), a row-specific cut-off is determined automatically for each row of data as the mean of that row. The dichotomization is performed by row of data unless byItem=TRUE.
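The default behaviour, with a cut-off at the mean of each row (or of each column when the dichotomization is done by item), can be mimicked in base R as below. The sketch assumes that a value equal to the mean is coded as 1, in line with the "greater than or equal to" rule stated for user-supplied cut-offs.

```r
x <- rbind(c(1, 2, 3, 6),
           c(4, 4, 1, 1))

# row-wise: each row is compared with its own mean (3.0 and 2.5)
row_cut <- rowMeans(x)
by_row <- sweep(x, 1, row_cut, FUN = ">=") * 1L
by_row
# row 1 becomes 0 0 1 1, row 2 becomes 1 1 0 0

# column-wise: each column is compared with its own mean
col_cut <- colMeans(x)
by_col <- sweep(x, 2, col_cut, FUN = ">=") * 1L
by_col
```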
When the argument k is used, more than k values may qualify to be picked because of ties. In this case, the choice of which items are picked is made after a small amount of noise has been added to each observation of the row or column. This is done with the function jitter.
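The effect of noise-based tie-breaking can be seen in a small base R example. It uses the base function jitter with its default settings; how the package calls jitter internally is not shown in this text, so the call below is an assumption.

```r
set.seed(1)
row <- c(3, 3, 3, 1, 2)   # three values tie for the top k = 2 places
noisy <- jitter(row)      # adds a small amount of uniform noise to each value
picked <- as.integer(rank(-noisy) <= 2)
picked
sum(picked)               # exactly k = 2 items are picked
```

Which two of the three tied items are picked depends on the random noise, so results change with the seed; the two lower values are never picked here because the default noise is small relative to the gaps between distinct values.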
Binary valued (i.e., 0-1) data with the same dimensions as the input.
!!! This function should be used with care. Dichotomization may distort the data structure and lead to potential information loss. In the case of polytomous items, the user is suggested to consider polytomous unfolding models that take into account different levels of measurement. !!!
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
## Not run:
### simulate some data with 3 discrete variables with three levels
### and 1 variable with 4 levels
d1 <- cbind(sample(1:3, 20, replace = TRUE),
            sample(1:3, 20, replace = TRUE, prob = c(0.3, 0.3, 0.4)),
            sample(1:3, 20, replace = TRUE, prob = c(0.2, 0.4, 0.4)),
            sample(1:4, 20, replace = TRUE, prob = c(.1, .3, .4, .2)))

### apply pick on d1 ###
# binarize at the mean of
# each row and column
d1_rowmean <- pick(d1)
d1_colmean <- pick(d1, byItem = TRUE)

# binarize at the cutoff=2
d1_cut <- pick(d1, cutoff = 2, byItem = TRUE)

# binarize at different cutoffs (per row)
# for example at the median of each row
med_cuts <- apply(d1, 1, median)
d1_cuts <- pick(d1, cutoff = med_cuts)

# binarize at different cutoffs (per column)
# for example at the median of each column
med_cuts_col <- apply(d1, 2, median)
d1_cuts_col <- pick(d1, cutoff = med_cuts_col, byItem = TRUE)

# binarize at the k=2 higher values
# per row and column
d1_krow <- pick(d1, k = 2)
d1_kcol <- pick(d1, k = 2, byItem = TRUE)
## End(Not run)
This dataset contains statistical information about seven of Plato's works. The problem underlying this dataset is that the chronological order of Plato's works is unknown. Scholars only know that Republic was his first work, and Laws his last work. For each work, Cox and Brandwood (1959) extracted the last five syllables of each sentence. Each syllable is classified as long or short, which gives 2^5 = 32 types. Consequently, we obtain a percentage distribution across the 32 types for each of the seven works. The dataset has been borrowed from the package smacof (De Leeuw and Mair, 2009).
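The count of 32 syllable types follows from five positions with two possible quantities each. A short base R check, with illustrative labels "short" and "long" that are not necessarily the coding used in the dataset:

```r
# all long/short patterns over the last five syllables of a sentence
patterns <- expand.grid(rep(list(c("short", "long")), 5),
                        stringsAsFactors = FALSE)
nrow(patterns)  # 32
```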
data(Plato7)
Data frame containing syllable percentages of Plato's 7 works.
Cox, D. R. & Brandwood, L. (1959). On a discriminatory problem connected with the work of Plato. Journal of the Royal Statistical Society (Series B), 21, 195-200.
De Leeuw, J. & Mair, P. (2009). Multidimensional Scaling Using Majorization: SMACOF in R. Journal of Statistical Software, 31(3), 1-30. URL http://www.jstatsoft.org/v31/i03/.
## Not run:
data(Plato7)
str(Plato7)
## End(Not run)
plot function for "mdf" class objects.
Generic function for plotting S3 class "mdf" objects. This function plots the rows of the conditional adjacency matrix (CAM), which are nonparametric estimates of the item response functions. The plot is produced using the ggplot function from the package ggplot2.
## S3 method for class 'mdf'
plot(x, select, plot.type, ...)
x | Object of class |
select | In this argument the user can provide a subset of items to be plotted explicitly. If the |
plot.type | Determines the type of plot that is returned. By default, |
... | Other arguments passed on to |
The plot method is used to obtain a graphical representation of the estimated rank order of the items, the item response functions, and the distribution of the person parameters. The rows of the CAM are taken as estimates of the IRFs. To interpolate the missing diagonal elements of the CAM, the na.approx function from the package zoo is used.
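The idea of using CAM rows as IRF estimates can be sketched in base R. Several points here are assumptions rather than statements about the package: the CAM entry in row i and column j is taken to be the proportion of respondents picking item j among those who picked item i, the columns of X are assumed to be in scale order already, and base approx stands in for zoo's na.approx so that the example needs no extra packages. The handling of the first and last diagonal cells (rule = 2, nearest value) is also an illustrative choice.

```r
set.seed(2)
# toy binary data: 200 respondents, 4 items assumed to be in scale order
X <- matrix(rbinom(200 * 4, 1, 0.5), ncol = 4,
            dimnames = list(NULL, c("A", "B", "C", "D")))

joint <- crossprod(X)        # joint[i, j] = number of respondents picking both i and j
cam <- joint / diag(joint)   # row i divided by the number picking item i
diag(cam) <- NA              # an item conditioned on itself carries no information

# fill each missing diagonal cell by linear interpolation along its row
fill_row <- function(p) {
  idx <- seq_along(p)
  ok <- !is.na(p)
  approx(idx[ok], p[ok], xout = idx, rule = 2)$y  # rule = 2: nearest value at the ends
}
irf_hat <- t(apply(cam, 1, fill_row))
round(irf_hat, 2)
```

Each row of irf_hat can then be plotted against the item order to give a rough picture of one item's response function.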
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders (1993). Nonparametric unfolding models for dichotomous data. Methodika.
A. Zeileis and G. Grothendieck. (2005). zoo: S3 Infrastructure for Regular and Irregular Time Series. Journal of Statistical Software, 14(6), 1-27. doi:10.18637/jss.v014.i06
H. Wickham. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.
H. Wickham. (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20. URL http://www.jstatsoft.org/v21/i12/.
## Not run:
data(ANDRICH)
fit <- mudfold(ANDRICH)
plot(fit, plot.type="scale")
plot(fit, plot.type="IRF")
plot(fit, plot.type="persons")
plot(fit, select="DONTBELIEV", plot.type="IRF")
## End(Not run)
print method for "mdf" class objects resulting from the mudfold function.
S3 generic function for printing "mdf" class objects.
## S3 method for class 'mdf'
print(x, ...)
x | Object of class |
... | Further arguments passed on to the |
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## Not run:
data(ANDRICH)
fit <- mudfold(ANDRICH)
fit
print(fit)
## End(Not run)
summary method for S3 class "mdf" objects.
Generic function used to summarize information from "mdf" class objects.
## S3 method for class 'mdf'
summary(object, boot=FALSE, type="perc", ...)
object | Object of class |
boot | This argument applies when the |
type | A string that determines the type of confidence intervals that will be calculated. This argument is passed to the |
... | Other arguments passed on to the function |
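As background for the type argument, the following base R sketch shows what a percentile bootstrap confidence interval looks like for a simple statistic. It assumes that the default type="perc" refers to percentile intervals in the usual bootstrap sense; the data, the statistic (a mean) and the number of replicates B are illustrative and unrelated to how the package computes its scale statistics.

```r
set.seed(3)
x <- rnorm(50, mean = 1)  # toy sample
B <- 1000                 # number of bootstrap replicates (illustrative)

# resample with replacement and recompute the statistic each time
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# percentile interval: empirical 2.5% and 97.5% quantiles of the replicates
ci_perc <- quantile(boot_means, probs = c(0.025, 0.975))
ci_perc
```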
A summary of the MUDFOLD scale that has been calculated with the mudfold function.
The output of summary.mdf() is a list with two main components. The first component is a data.frame with scale statistics and the second is a list with item statistics. If diagnostics=TRUE, a further component with diagnostic matrices is also included in the output. When the bootstrap scale estimate does not agree with the obtained MUDFOLD estimate, a summary of the bootstrap scale is also given in the output.
Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)
Maintainer: Spyros E. Balafas ([email protected])
W.H. Van Schuur (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.
W.J. Post (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.
W.J. Post and T.A.B. Snijders (1993). Nonparametric unfolding models for dichotomous data. Methodika.
## Not run:
data(ANDRICH)
fit <- mudfold(ANDRICH, nboot=100)
summary(fit, boot=TRUE)
summary(fit, boot=FALSE)
## End(Not run)