How Can Research Help Design More Effective Youth Programs?

An afterschool program CEO reflects on the risks and rewards of intensive program evaluations


Nonprofits that work with young people are always looking for ways to assess their effectiveness, and randomized controlled trials, which randomly place eligible young people into “treatment” and “control” groups to draw comparisons between them, are generally considered the most rigorous approach. Implementation studies, by contrast, examine how an effort is carried out, pinpointing strengths and weaknesses in operations.

In tandem, randomized controlled trials, or RCTs, and implementation studies can help organizations answer two major questions: What is the impact of our work? What can we do to improve?  

As informative as such studies can be, they are also challenging to pull off and act on. Just ask Lynsey Wood Jeffries, CEO of Washington, D.C.-based Higher Achievement, one of the organizations that took part in Wallace’s now-concluded expanded learning effort. Higher Achievement, which provides academically focused afterschool programs for more than 1,000 middle schoolers in the D.C. metro area, Baltimore and Richmond, Va., has participated in two RCTs, the most recent one accompanied by an implementation study.

The first RCT, which was partially funded by Wallace and ran from 2006 to 2013, showed statistically significant effects for Higher Achievement students—known as “scholars” within the program—on math and reading test scores and in high school placement and family engagement. The second, completed last year (also with some Wallace support), found positive results, too, with the implementation study revealing some program delivery issues to be addressed in order for Higher Achievement to reach its full potential. (Readers can find the research and more information here.) The organization was in the process of making changes when COVID-19 hit and turned everything upside down, but as the pandemic eases, the hope is to use the findings to help pave the path forward.

This is part two of our interview with Jeffries. See the first post on running an afterschool program during a pandemic. This interview has been edited for length and clarity.

Why did you decide to participate in the second RCT, especially having already done one?

There were two main reasons. One is that the first study only focused on what has been our home base in the D.C. metro area. So, it showed statistically significant positive impacts on academics for D.C. and also Alexandria, Virginia. But since that study was conducted, we have expanded to other locations, and our effectiveness hadn't been empirically proven in those places. That was important to understand. A number of programs may be able to show impacts in their home base, but replicating that through all the complications that come with expansion is a next level of efficacy.

Second, it was suggested to us that the way to be most competitive for the major federal i3 grant we ultimately won was to offer an RCT. It's the highest level of evidence and worth the most points on the application.

Were there risks versus rewards that you had to weigh in making the decision to go ahead with the second RCT?

We very carefully considered it because we knew from past experience the strains an RCT puts on the community and the organization.

The reward is that if you win the dollars you can learn a lot and serve more students. Our grant application was about adapting our academic mentoring to help accelerate learning towards Common Core standards. That's something we wouldn’t have been able to do, at least not at the intensity we wanted, without a multi-million-dollar investment.

Were there any results of either the RCT or the implementation study that caught you by surprise?

The positive effect size for report card grades was greater in this second study than it was for test scores in a previous study. And that level of confidence did surprise me, frankly, because I’ve lived and breathed Higher Achievement every day for many years now, and it's been messy. It hasn't just been a simple expansion process. There have been lots of questions along the way, adaptations to local communities, staffing changes, and more. So, to see that positive effect size for our scholars was encouraging.

You mentioned the strain an RCT can put on community relationships and the organization itself. What does that look like?

Only accepting 50 percent of the students you recruit strains community relationships; it strains relationships with families and scholars most importantly but also with schools. It also fatigues the staff, who have to interview twice as many students as we can serve. They get to know the students and their families, knowing that we have to turn away half of them.

Here’s an example of how an RCT can distort perceptions in the community: I'll never forget talking to a middle schooler who had applied for our program but was assigned to the control group. She said, "Oh, yeah, I know Higher Achievement. It's that group that pays you $100 to take a test on a Saturday." [As part of the first RCT] we did pay students to take this test, and so that’s what we were to her.

Additionally, when you’re recruiting for an RCT, you have to cast twice as wide a net [because you need a sufficient number of students in both the treatment and control groups]. Because there was such a push for a larger sample, the interview process for Higher Achievement became pro forma, and our retention rate ended up dipping because the overall level of commitment of the scholars and families recruited for the RCT was lower than it would be otherwise. And both studies showed that we don't have statistically significant effects until scholars get through the second year. So, when scholar retention dips, you're distorting the program.

Did you approach the second RCT differently in terms of recruitment or communications to try to avoid or address that potential for strain?

We were very cognizant of our school relationships the second time. Principals really value the service we provide, which makes it quite hard for them to agree to a study, knowing half the students won’t actually get the benefit of that service. So, we gave each of our principals three to five wild cards for particular students they wanted to be exempt from the lottery process in order to make sure that they got into the program. That hurt our sample size because those students couldn’t be part of the study, but it helped preserve the school relationships. We also deepened training for the staff interviewing potential scholars, which helped a bit with retention.

How did Higher Achievement go about putting the research findings into practice? In order to make changes at the program level, were there also changes that had to be made at the administrative level?

The implementation study was really helpful, and I'm so grateful we were able to bring in $300,000 in additional support from Venture Philanthropy Partners [a D.C.-based philanthropy] to support it. One of the things we took away from the implementation study was that there was more heterogeneity in our program delivery than we desired. We knew that internally, but to read it from these external researchers made us pause, consider the implications, and develop a new approach—Higher Achievement 2.0.

Higher Achievement 2.0 consisted of a refined program model and staffing structure to support it. We shifted our organizational chart pretty dramatically. Previously, program implementation was managed by the local executive directors [with a program director for each city and directors of individual centers within each city reporting to the executive director]. Program research, evaluation and design were under a chief strategy officer, who was not in a direct reporting line with the program implementation. It wasn't seamless, and it led to inconsistencies in program delivery.

The big change we made was to create a new position, a central chief program officer who manages both the R&D department, which we now call the center support team, and the local program directors, with the center directors reporting to those program directors. What that does functionally is lift the local center directors a full step or two or three, depending on the city, up in the organization chart and in the decision-making process [because they no longer report to a local executive director or deputy director]. Everything we're doing as an organization is much closer to the ground.

What were the main changes at the program level as a result of the implementation study?

One of the key takeaways from the implementation research was that our Summer Academy, which was a six-week, 40-hours-a-week program, was important for culture building but the academic instruction wasn’t consistently high quality or driving scholar retention or academic outcomes. That prompted us to take a very different approach to summer and to make afterschool the centerpiece of what we do. The plan was to focus on college-preparatory high school placement and to expand afterschool by seven weeks and go from three to four days a week. That’s a big change in how we operate, which we were just beginning to actualize in January 2020. Then COVID hit, and we had to pivot to a virtual, streamlined program, but now we’re exploring how to go back to a version of Higher Achievement 2.0 post-COVID.

High school placement has always been part of Higher Achievement’s model, but we elevated it to be our anchor indicator, so all the other performance indicators need to lead back to high school readiness and placement. While our direct service ends in eighth grade, we have long-term intended impacts of 100 percent on-time high school graduation and 65 percent post-secondary credential attainment. [Therefore], the biggest lever we can pull is helping our scholars choose a great fit for high school and making sure they’re prepared to get into those schools. Instead of running programs in the summer, we are referring scholars to other strong programs and spending much more time on family engagement in the summer to support high school placement. This starts in fifth grade, with increasingly robust conversations year after year about report cards and test scores and what different high school options can mean for career paths and post-secondary goals. We are building our scholars’ and families’ navigational capital. That discipline is being more uniformly implemented across our sites; it had been very scattered in the past.

The other thing we set out to do, which has been delayed because all our design capacity has been re-routed to virtual learning, is to build out a ninth-grade transition program. We know how important ninth grade is; the research is undeniable. The individual data from our scholars says sometimes it goes smoothly and in other cases it's really rocky. Students who’ve been placed in a competitive high school may shift later because they didn't feel welcome or supported in that school.

What challenges have you faced as you’ve gone about making these big changes? Were there any obstacles in translating the decisions of your leadership team into action?

The biggest obstacle is COVID. We haven't been able to put much of our plan into action in the way intended. The other obstacle we’ve faced is what any change faces: emotional and intellectual ties to the way things have always been done. I was one of the staff members who had a great emotional attachment to our Summer Academy.

There are rituals that have been a part of our Summer Academy that are beloved rites of passage for young people. We are building these rites of passage, college trips and other culture-building aspects of Summer Academy into our Afterschool Academy. That way, we can focus in the summer on intentionally engaging our scholars and families to prepare them for college-preparatory high schools and increase our overall organizational sustainability and effectiveness.

What advice would you give to an organization that’s considering participating in an RCT and implementation study or other major research of this kind?

Proceed with caution. Before undertaking an RCT, review the studies that already exist in the field and learn from those to increase the effectiveness of your program. Let’s not reinvent the wheel here. If you do decide to proceed with an RCT, be really clear on what your model is and is not. And then be prepared to add temporary capacity during the study, particularly for recruitment, program observation and support. It takes a lot of internal and external communication to preserve relationships while also having a valid RCT.

There's a larger field question about equity: who is able to raise the money to actually conduct these very extensive and expensive studies? It tends to be white-led organizations, and philanthropic dollars tend to consolidate to support those proven programs. Too few nonprofits have been proven effective with RCTs, for a host of reasons, including that these studies are cost-prohibitive for most organizations and that they strain community relations. And most RCT-proven models are difficult and expensive to scale.

However, the fact that an organization has not been proven effective with an RCT should not mean that it is prohibited from attracting game-changing investment. If there were a more rigorous way for organizations to truly demonstrate being evidence-based (not just a well-written and research-cited proposal paragraph), perhaps there would be a way to bring more community-based solutions to scale. With that approach, we could begin to solve challenges at the magnitude at which they exist.