Documentation for fxl

Visualizing Single-case Design Research Synthesis: SCARF Plots

Written by Shawn P. Gilroy (Last Updated: 2024-08-14)

Tags: Research Synthesis, Single-case Data, Systematic Review

There are long-standing barriers to synthesizing the results of studies using single-case research designs (e.g., statistical issues, data variability issues). In recent years, approaches such as multilevel modeling have lessened the impact of specific challenges (e.g., robustness to certain deviations from statistical assumptions), but numerous challenges still remain. For example, the presence of a quantitative effect (i.e., a statistical difference from baseline contrasts) does not necessarily mean that a functional relationship was demonstrated. Because of this, there remains a disconnect between statistical and clinical/experimental interpretations of single-case design data, and this has significant implications for meta-analyses in education, special education, and behavior analysis.

The Single Case Assessment and Review Framework (SCARF) has emerged in recent years as a sort of 'middle ground' between visual analysis and meta-analysis. Specifically, the focus is on characterizing the strength/presence of a functional relationship rather than ascribing some quantitative effect (i.e., not an effect size as traditionally defined). By adopting this stance, it retains the interpretation present in source studies (i.e., visual analysis) but also allows a means of aggregating and appraising outcomes across a range of related studies (e.g., outcomes from baseline vs. intervention comparisons). Additionally, the SCARF uses simple visuals to inspect these outcomes as a function of relevant study features and characteristics (e.g., internal validity/rigor, degree of maintenance assessed), which circumvents historical challenges associated with statistically summarizing and comparing these data and outcomes.

Visuals from the Single Case Assessment and Review Framework (SCARF)

The SCARF has historically facilitated the generation of three specific figures related to systematically appraising single-case research. These include a visualization of the strength/direction/presence of functional relations as a function of study rigor (i.e., internal validity), the degree to which outcomes achieved during intervention are maintained as a function of the window of time explored (i.e., evaluation of short- and long-term maintenance), and the degree to which outcomes generalized as a function of how generalization was assessed (i.e., near vs. far transfer). As with all parts of SCARF, the specific approach used depends on the questions posed by the analyst, and the visuals included are matched to the relevant questions (e.g., a review may not pose a question related to generalization).

For the sake of simplicity, the most commonly included figure (Functional Relations ~ Rigor/Internal Validity) will be used as the primary example. An example of this figure with hypothetical and simulated data is presented below:

In this figure, the X-axis (i.e., indicators of Internal Validity, from low to high) and Y-axis (i.e., functional relations from clear and undesirable, to unclear, to clear and desirable) represent increasingly desirable reflections of empirical research. Dashed lines illustrating quadrants are presented here to orient readers to the relevant sectors of the plane. For instance, the upper right quadrant most reflects studies that have clear and desirable outcomes (i.e., desired change in behavior) as well as many indicators associated with strong Internal Validity (i.e., rigorous design); the lower left quadrant reflects the opposite. The number of data points represented (i.e., size; larger points = more data), position (i.e., X/Y coordinates), and type of data (e.g., "gray" literature) are all relevant to interpretation as well.

Format of Data for SCARF Visual

As a general default, data for SCARF are specific to an individual research design. This makes good sense, since a functional relationship is determined using an accepted research design, but it introduces some variability depending on how studies are conducted. Should a study answer questions at the participant level using respective Reversal Designs or Multiple Baseline Designs (e.g., across settings for individual participants), each data point would be specific to an individual (i.e., more than one data point per study). In contrast, if researchers used a single Multiple Baseline Design across three participants for a study, just one data point would be included from that study.

In working with R and the fxl package, the simplest and most straightforward representation of this data would look as follows (Note: two frames displayed to distinguish between simulated published and “gray” outcomes):


# Note: This would be the "published" or peer-reviewed data
head(data_frame_example)
##   id  x y
## 1  1  6 2
## 2  2 15 5
## 3  3 12 3
## 4  4 11 1
## 5  5 15 2
## 6  6 15 3

# Note: These data would be the "gray" or unpublished outcomes in the literature
head(data_frame_example_gray)
##   id  x y
## 1 26 15 2
## 2 27 14 3
## 3 28 10 3
## 4 29 14 1
## 5 30  7 3
## 6 31 14 2
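The printed frames above came from simulated data; a comparable structure can be generated as follows. Note that the seed, sampling ranges, and row counts here are illustrative assumptions, so the exact values will not match the output shown above:

```r
# Illustrative simulation of review outcomes in the id/x/y format shown above
# x = count of internal validity indicators (0-15); y = functional relation rating (1-5)
set.seed(1234)

# "Published" or peer-reviewed outcomes
data_frame_example <- data.frame(
  id = 1:25,
  x  = sample(0:15, 25, replace = TRUE),
  y  = sample(1:5,  25, replace = TRUE)
)

# "Gray" or unpublished outcomes, with ids continuing where the published set ends
data_frame_example_gray <- data.frame(
  id = 26:40,
  x  = sample(0:15, 15, replace = TRUE),
  y  = sample(1:5,  15, replace = TRUE)
)

head(data_frame_example)
```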

Presenting Individual Outcomes

As a starting point to graphing outcome data using fxl, we can begin by simply plotting the data as individual-level points as we would have done in prior posts (e.g., changing colors to distinguish 'gray' from published literature with multiple scr_points calls).

A straightforward plotting of the data displayed above reveals the following:


scr_plot(data_frame_example,
         aesthetics = var_map(x = x,
                              y = y),
         mai = c(0.5, 1.75, 0.0, 0.0),
         omi = c(0.25, 0.25, 0.25, 0.25),
         family = "Times New Roman") |>
  # Note: We will add a bit of space at the bottom and rename y-axis levels
  scr_yoverride(c(0.5, 5),
                yticks = 1:5,
                ytickscex = 1.25,
                ytickslabs = c(
                  "Counter-Therapeutic",
                  "Null",
                  "Inconsistent",
                  "Weak",
                  "Strong")) |>
  # Note: Same as y-axis, but for the x-axis
  scr_xoverride(c(-0.5, 15),
                xticks = 0:15,
                xtickscex = 1.25) |>
  scr_xlabel("Indicators of Internal Validity", cex = 1.5, adj = 0.65) |>
  scr_ylabel("Functional Relation", cex = 1.5, adj = 0.6) |>
  # Note: Optional quadrant lines (not really all that common)
  scr_anno_guide_line(lty = 1,
                      coords = list(
                        "1" = list(
                          x0 = 0,  x1 = 15,
                          y0 = 3,  y1 = 3,
                          lty = 2, lwd = 1
                        ),
                        "2" = list(
                          x0 = 7.5,  x1 = 7.5,
                          y0 = 0.75, y1 = 5,
                          lty = 2,   lwd = 1
                        ))) |>
  scr_points(pch = 21, cex = 3, fill = "lightblue") |>
  scr_points(pch = 21, cex = 3, fill = "gray",
             data = data_frame_example_gray) |>
  scr_legend(position = list(x = 0.0, y = 5.25),
             legend = c("Published Literature", "Gray Literature"),
             col = c("black", "black"),
             pt_bg = c("lightblue", "gray"),
             lty = c(0, 0),
             pch = c(21, 21),
             bty = "y",
             pt_cex = 2.25,
             cex = 1.25,
             text_col = "black",
             horiz = FALSE,
             box_lty = 0)

The default representation of these data is problematic: there is clear evidence of overplotting (i.e., multiple points overlapping), which at best only partially communicates the available data to the analyst. Fortunately, there are a few useful approaches for addressing this issue.

First, jittering helps distinguish overlapping edges/types; we can accomplish this by adding a small amount of random noise to prevent perfect overlap. Second, we can address partially overlapping points by adding some transparency to points drawn on the plot, which helps especially when a larger data point would completely overshadow a smaller one. Third, we can simplify plotting and limit the potential for overlapping data points by combining data and reflecting density through point size. That is, larger points = more data at a given point.
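Before wiring these ideas into fxl, the three fixes can be sketched in base R alone. This is an illustrative sketch (the data frame `df` and its values are made up here), not part of the SCARF workflow itself:

```r
# Toy data with deliberate overlap: three studies at the same x/y coordinate
df <- data.frame(x = c(10, 10, 10, 12, 14), y = c(4, 4, 4, 5, 4))

# (3) Collapse to unique x/y pairs and count how much data each represents
counts <- aggregate(list(n = rep(1, nrow(df))),
                    by = df[, c("x", "y")],
                    FUN = sum)

# Size points by the amount of data they summarize, normalized to a max cex of 5
counts$cex <- (counts$n / max(counts$n)) * 5

# (1) Jitter positions slightly and (2) draw with a semi-transparent fill
plot(counts$x + rnorm(nrow(counts), 0, 0.1),
     counts$y + rnorm(nrow(counts), 0, 0.1),
     pch = 21, cex = counts$cex,
     bg = rgb(0.68, 0.85, 0.9, alpha = 0.8),  # transparency via base rgb()
     xlab = "x", ylab = "y")
```

Here the three overlapping rows at (10, 4) collapse into a single large point, while the two singleton rows draw as small points.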

Each of these changes would help an analyst more easily interpret data, but would require a slight change to how points are typically drawn in the package.

Overriding Point Drawing Behavior to Group Data Visually

We can accomplish the visuals common in SCARF by building upon logic covered in earlier posts. Specifically, something similar was done when dynamically drawing bar charts to make colors dependent on specific values. We can do something similar here by working with a "hook" to modify (i.e., override) the default behavior when drawing points. In overriding this behavior, we hope to accomplish a few things.

First, we want to plot unique X/Y coordinate pairs rather than plot every single individual data point. We want to do this in order to simplify the presentation by grouping relevant data together (e.g., by size). Second, we will need to loop through the available data to determine exactly how large an X/Y data point will be (i.e., it will be larger for points with more data). Third, we will need to construct these unique X/Y pairs but also slightly jitter the positions and add a bit of transparency to help with inspecting the full range of data. Lastly, we will need to take all of this derived data and inherit the colors and markers passed from the scr_plot function to make sure the respective styling is maintained (i.e., a gray color for the "gray" literature).

This seems like a lot to accomplish, but it is no more than several lines of code in a simple function. The function we need to prepare and supply to the "styler" argument to override the drawing behavior is presented below:


point_styler <- function(data_frame, ...) {
  # Note: we're capturing all the arguments passed to each call (e.g., color)
  input_list <- list(...)

  # Note: we're getting each distinct X/Y pair in the data
  unique_entries <- unique(data_frame)

  # Note: we'll default the size of *each* pair to one
  unique_entries$size <- 1

  for (row in 1:nrow(unique_entries)) {
    # Note: here, we take each unique row and find how many times it occurs in the relevant data
    current_row <- unique_entries[row, ]

    n_matches <- nrow(data_frame[
      data_frame$X == current_row$X &
        data_frame$Y == current_row$Y,
    ])

    # Note: we link the amount of data found to the size
    unique_entries[row, "size"] <- n_matches
  }

  # Note: cex = how R typically reflects/labels size and expansion
  # Note: we normalize size up to a cex of 5 (so as not to get too big)
  unique_entries$cex <- (unique_entries$size / max(unique_entries$size)) * 5

  # Note: we add some random jitter to x/y points to prevent overlap
  jitter_x <- rnorm(nrow(unique_entries), 0, 0.1)
  jitter_y <- rnorm(nrow(unique_entries), 0, 0.1)

  points(unique_entries$X + jitter_x,
         unique_entries$Y + jitter_y,
         col = "black",
         # Note: we pass along the arguments from each scr_points call
         pch = input_list[["pch"]],
         # Note: alpha() here is from the scales package
         bg  = scales::alpha(input_list[["bg"]], 0.8),
         # Note: we supply the size for each unique point here
         cex = unique_entries$cex)
}
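As an aside on the design: the row-by-row counting loop above is easy to follow, but the same tally can be expressed with a single aggregate() call. The helper name `count_pairs` is introduced here purely for illustration and is not part of fxl:

```r
# Illustrative alternative to the counting loop: tally each unique X/Y pair at once
count_pairs <- function(data_frame) {
  aggregate(list(size = rep(1, nrow(data_frame))),
            by = list(X = data_frame$X, Y = data_frame$Y),
            FUN = sum)
}

# e.g., three rows at (10, 4) collapse to one row with size = 3
example <- data.frame(X = c(10, 10, 10, 12), Y = c(4, 4, 4, 5))
count_pairs(example)
```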

scr_plot(data_frame_example,
         aesthetics = var_map(x = x,
                              y = y),
         mai = c(0.5, 1.75, 0.05, 0.05),
         omi = c(0.25, 0.25, 0.25, 0.25),
         family = "Times New Roman") |>
  scr_yoverride(c(0.5, 5),
                yticks = 1:5,
                ytickscex = 1.25,
                ytickslabs = c(
                  "Counter-Therapeutic",
                  "Null",
                  "Inconsistent",
                  "Weak",
                  "Strong")) |>
  scr_xoverride(c(-0.5, 15),
                xticks = 0:15,
                xtickscex = 1.25) |>
  scr_xlabel("Indicators of Internal Validity", cex = 1.5, adj = 0.65) |>
  scr_ylabel("Functional Relation", cex = 1.5, adj = 0.6) |>
  scr_anno_guide_line(lty = 1,
                      coords = list(
                        "1" = list(
                          x0 = 0,  x1 = 15,
                          y0 = 3,  y1 = 3,
                          lty = 2, lwd = 1
                        ),
                        "2" = list(
                          x0 = 7.5,  x1 = 7.5,
                          y0 = 0.75, y1 = 5,
                          lty = 2,   lwd = 1
                        ))) |>
  # Note: See override for styler
  scr_points(pch = 21, cex = 3, fill = "lightblue",
             styler = point_styler) |>
  # Note: Same here, for the gray literature
  scr_points(pch = 21, cex = 3, fill = "gray",
             styler = point_styler,
             data = data_frame_example_gray) |>
  scr_legend(position = list(x = 0.0, y = 5.25),
             legend = c("Published Literature", "Gray Literature"),
             col = c("black", "black"),
             pt_bg = c("lightblue", "gray"),
             lty = c(0, 0),
             pch = c(21, 21),
             bty = "y",
             pt_cex = 2.25,
             cex = 1.25,
             text_col = "black",
             horiz = FALSE,
             box_lty = 0)

In this finalized example, we can see that we achieved the core goals of representing outcome data in a manner consistent with SCARF visuals. Although this is a simple example with simulated data, the templated code provided here is easily adjusted to incorporate "real-world" review data and can be used to generate publication-quality figures as needed.

Final Notes and Relevant Resources

In closing, it is worth noting that the SCARF is just one of several options for research synthesis with single-case data. Numerous tools and methods are available (e.g., multilevel modeling), though SCARF does appear to be one of the more polished and established approaches available to single-case design researchers. Furthermore, there are options for formal training and mentorship in SCARF as well.

A range of relevant information and resources is provided by the authors of SCARF on their website, and interested readers should consider reviewing these resources as well as pursuing formal training should they wish to include SCARF in their research.