The study was conducted in two phases, from February 2022 to July 2022, using a multiphase, exploratory sequential mixed-methods instrument development design (Fig. 2). Ethical approval was obtained from Riphah International University (Riphah/IIMC/IRC/22/2001) and Islamabad Medical and Dental College, Pakistan (No. 56/IMDCIIRB-2022). The participants were healthcare professionals (HCPs): medical doctors, dental clinicians, nurses, physiotherapists, speech therapists, and clinical and community pharmacists. Written informed consent was obtained from all participants during the various phases of the study.
Phase 1: Instrument development and qualitative content validation
Table 1 shows the eight domains of digital professionalism that were identified from the GMC social media guidelines. The items were constructed using multiple social media guidelines, as shown in Table 2. The guidelines were searched using PubMed, ERIC, BioMed Central, and Google Scholar. Only full-text, freely accessible guidelines on the online/digital professionalism of HCPs (medical and allied healthcare professionals) were included, while those intended for undergraduate medical, dental, and allied health sciences students were excluded.
Items were written in statement form and matched to response anchors on a 5-point Likert scale. The first version of the instrument was emailed to 15 experts, including HCPs and medical educationists with five years of experience, for the modification, deletion, and addition of items. The experts’ feedback was analysed, and changes were made based on the following criteria: (1) relevance of the item to the construct, (2) ease of understanding, (3) removal of duplicate or ambiguous items, and (4) elimination of spelling and grammatical errors [22].
Phase 2: Instrument validation
Content validity
Content validity was established through (a) a consensus-building modified Delphi technique and (b) the content validity index (CVI). Thirty-five national and international experts were selected based on the following criteria: HCPs who had worked on digital professionalism and/or professionalism, and medical educationists holding a master’s degree or above with more than five years of experience.
Modified Delphi Round 1
The content validation Google forms were emailed to the 35 experts; they included a summary of the project and an informed consent statement. Each domain was defined to facilitate scoring, and a short video explaining the instructions was provided. The experts were requested to rank each item on its importance for measuring the construct on a 5-point Likert scale (very important = 5, important = 4, moderately important = 3, less important = 2, and unimportant = 1). An open-ended question was included at the end of every section of the instrument, and participants were requested to justify extreme ratings.
Data analysis
Data were analysed using SPSS version 26. The median and interquartile range (IQR) were calculated for each item. The criteria for the acceptability of an item in the Delphi rounds were decided beforehand [23] and are listed below (a worked sketch follows the list):
- Agreement of ≥ 75% of the experts on the upper two response options (very important or important)
- A median score of ≥ 4
- An interquartile range of ≤ 1 on the 5-point Likert scale
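As an illustration only (the study itself used SPSS version 26), a minimal Python sketch of how these three pre-specified criteria could be applied to one item's Round 1 ratings is shown below; the function name and toy ratings are hypothetical.

# Illustrative sketch (assumed function and toy data, not the authors' SPSS syntax):
# checks one item against the three pre-specified Round 1 acceptance criteria.
import numpy as np

def item_accepted(ratings):
    """ratings: the 1-5 importance scores given by all experts to a single item."""
    r = np.asarray(ratings)
    pct_top_two = np.mean(r >= 4) * 100            # % choosing very important or important
    median = np.median(r)
    q1, q3 = np.percentile(r, [25, 75])
    iqr = q3 - q1
    return pct_top_two >= 75 and median >= 4 and iqr <= 1

print(item_accepted([5, 4, 5, 4, 4, 3, 5, 4]))     # True only if all three criteria are met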
Modified Delphi Round 2
Forms in Word format, showing the percentage agreement of all participants on very important and important, the median, the IQR, and each expert’s own response from the previous round, were emailed individually to the respondents of Round 1. Stability refers to the consistency of responses and is established if the responses obtained in two successive rounds do not differ significantly from each other [24]. Experts were requested to review their Round 1 responses and to re-rank the items on the same scale if they wished to change them.
Data analysis
Data were analysed using SPSS 26, and stability was assessed through the McNemar change test, using nonparametric chi-square statistics to calculate a p value for each item [25, 26]. The significance level was set at 0.05.
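As a rough illustration of this step (the analysis itself was run in SPSS), the sketch below computes a per-item McNemar p value from paired ratings in two successive rounds, dichotomized into the top-two options versus the rest; the function name and example ratings are assumptions.

# Illustrative sketch, not the authors' script: per-item McNemar change test between
# two Delphi rounds, using paired expert ratings dichotomized at the top-two options.
import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def item_stability_p(round1, round2, cutoff=4):
    r1 = np.asarray(round1) >= cutoff              # top-two options in Round 1
    r2 = np.asarray(round2) >= cutoff              # top-two options in Round 2
    # 2 x 2 table of paired agreement/change between the rounds
    table = np.array([[np.sum(r1 & r2),  np.sum(r1 & ~r2)],
                      [np.sum(~r1 & r2), np.sum(~r1 & ~r2)]])
    return mcnemar(table, exact=True).pvalue       # exact binomial form of the test

p = item_stability_p([5, 4, 5, 3, 4, 5], [5, 4, 4, 4, 4, 5])
print(f"McNemar p = {p:.3f}")                      # p >= 0.05 suggests stable responses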
Modified Delphi Round 3
Google forms were emailed to the respondents of the previous rounds, who were requested to rate each item on a 4-point Likert scale for relevance (highly relevant = 4, quite relevant = 3, somewhat relevant = 2, and not relevant = 1) and on a 3-point Likert scale for clarity (very clear = 3, item needs revision = 2, and not clear = 1).
Data analysis
Ratings of 3 or 4 on the relevance scale were recoded as “1”, and ratings of 1 or 2 were recoded as “0”. The item-level content validity index (I-CVI) was calculated by summing the “1” ratings for each item and dividing by the total number of experts (n = 24) [22]. The average of the I-CVI scores across all items gave the scale-level content validity index (S-CVI) [27, 28]. Items with an I-CVI of ≥ 0.90 were included, those between 0.78 and 0.90 were revised, and items with an I-CVI below 0.78 were removed [22]. The content clarity average (CCA) was also calculated, and items with CCA values above 2.4 (80%) were marked as very clear [22].
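A minimal sketch of these computations is given below, assuming ratings are arranged with one row per expert and one column per item; the toy data and function name are hypothetical and shown only to make the arithmetic concrete.

# Illustrative sketch of the I-CVI, S-CVI, and CCA calculations described above.
import numpy as np

def content_validity(relevance, clarity):
    """relevance: experts x items array of 1-4 ratings; clarity: experts x items array of 1-3 ratings."""
    endorsed = (np.asarray(relevance) >= 3).astype(int)   # 3 or 4 recoded as 1, else 0
    i_cvi = endorsed.mean(axis=0)            # proportion of experts endorsing each item
    s_cvi = i_cvi.mean()                     # average I-CVI across all items
    cca = np.asarray(clarity).mean(axis=0)   # mean clarity per item; > 2.4 counts as very clear
    return i_cvi, s_cvi, cca

rel = [[4, 3, 2], [4, 4, 2], [3, 4, 1]]      # toy data: 3 experts x 3 items
cla = [[3, 3, 2], [3, 2, 2], [3, 3, 1]]
i_cvi, s_cvi, cca = content_validity(rel, cla)
print(i_cvi, round(s_cvi, 2), cca)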
Response process validity
Cognitive pretesting of the instrument was performed through in-person semi-structured interviews with ten participants selected by convenience sampling. Pretesting was performed to identify and resolve any potential issues with the items. Think-aloud and verbal-probing techniques were used with concurrent probes. The researcher took notes during the interviews, which were also audio recorded, with the participants’ consent, for later analysis.
Data analysis
The audiotaped interviews were transcribed and segmented. Analytic memos were created and coded using predefined categories: (1) items with no problems, (2) items with minor problems, and (3) items with major problems [29]. The coding was performed independently by two co-authors to ensure inter-rater reliability, and the principal author then reviewed the coding to resolve any differences.
Pilot testing
Piloting was performed to establish the construct validity and internal consistency of the instrument. Several criteria are used to determine the sample size for pilot testing, such as a subject-to-variable ratio (SVR) of 10:1 [30] and benchmarks for factor analysis in which N ≥ 1000 is excellent, ≥ 500 is good, 100–500 is fair, and < 100 is poor, where N is the number of participants [31]. A larger sample size decreases sampling error and should increase with the number of factors [32]. Thus, a sample size of 550 was used for pilot testing and factor analysis, and participants were emailed Google forms. Reminders were sent on Day 5 and Day 10 through email and WhatsApp to increase the response rate.
Data analysis
Data were analysed in SPSS for descriptive statistics and internal consistency. Construct validity was established through confirmatory factor analysis (CFA) using Analysis of Moment Structures (AMOS) 24.0. Exploratory factor analysis (EFA) was not performed, as there were specific expectations regarding (a) the number of constructs or factors, (b) which items or variables reflect which factors, and (c) whether the factors or constructs were correlated [33]. EFA is performed when the factors are not known or are yet to be determined, whereas CFA is preferred when a strong model, based on past evidence, specifies the number of factors and which items relate to which factors. The GMC guidelines are comprehensive, evidence based, and regularly updated in response to new research and rapidly evolving digital norms and trends. Thus, the domains of digital professionalism from the GMC’s “Doctors’ use of social media” were used, and CFA was conducted to examine the latent structure and item-factor relationships [34].
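The CFA itself was run in AMOS 24.0; purely as an illustration, a roughly equivalent specification in Python using the semopy package might look like the sketch below, where the domain names, item names, and data file are placeholders rather than the instrument's actual structure.

# Illustrative sketch only; the study used AMOS 24.0. Domain and item names are placeholders.
import pandas as pd
from semopy import Model, calc_stats

desc = """
confidentiality =~ item1 + item2 + item3
boundaries      =~ item4 + item5 + item6
"""                                         # each latent domain measured by its own items

data = pd.read_csv("pilot_responses.csv")   # hypothetical file of 1-5 coded responses
model = Model(desc)
model.fit(data)
print(model.inspect())                      # factor loadings and variances
print(calc_stats(model).T)                  # fit indices such as CFI and RMSEA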
None of the items was reverse coded. When entering the data into SPSS, all items were treated as continuous variables, as all used the same Likert scale, with the response options “Always”, “Usually”, “About half the time”, “Seldom”, and “Never” coded 5 to 1, respectively.
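For illustration, assuming the survey export contains the response labels as text, this coding step could be reproduced as in the sketch below; the file and column names are assumptions.

# Illustrative sketch of the response coding described above (no reverse-coded items).
import pandas as pd

coding = {"Always": 5, "Usually": 4, "About half the time": 3, "Seldom": 2, "Never": 1}
df = pd.read_csv("pilot_responses_raw.csv")            # hypothetical raw export
item_cols = [c for c in df.columns if c.startswith("item")]
df[item_cols] = df[item_cols].replace(coding)          # map labels to 5-1 codes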