On randomised trials of surgical timings for cleft palate repair

Selena Ee-Li Young; Seng Teik Lee; David Machin; Say Beng Tan; Qingshu Lu

doi:10.34239/ajops.v4n2.217

Introduction

Clefts of the lip and palate create problems in feeding, speech, hearing, dental development and facial growth. Despite surgical and multidisciplinary advances in cleft care, associated speech difficulties and facial appearance represent serious barriers to social integration. As a consequence of the multifaceted nature of the condition, numerous decisions about possible clinical, surgical, audiological, orthodontic and speech interventions have to be made at different times, extending from the birth of the child to adolescence and beyond. For certain individuals, the specific intervention chosen at a given time may be non-controversial and firmly evidence-based, whereas at other times there may be options available that have not yet been rigorously evaluated. In general, there is a paucity of evidence from randomised controlled trials (RCTs) for many aspects of the management choices to be made for patients with cleft lip and palate. Indeed Berkowitz argues,¹ in the context of the use of nasoalveolar moulding, for rigorous evaluation, as did Lee² much earlier in the wider setting of cleft lip and palate patients.

Despite the plethora of potential questions arising from the complexities of care in patients with cleft lip and palate, a review by Hardwicke and colleagues³ identified only 62 published RCTs from 1 January 2004 to 31 December 2013. Among the 62 RCTs identified, only 10 concerned surgical techniques, with a median trial size of 86 (range 47–376). The largest of these, with 467 infants, was conducted by Williams and colleagues,⁴ who compared two different lip repairs with two types of palate repair conducted at two different ages in a so-called factorial design. Lu and colleagues⁵ commented on the advantages and difficulties associated with this design. The other RCT that evaluated repair at different ages was that of Ysunza and colleagues⁶ who compared results following surgery at two different ages in 76 infants.

We review information from the RCTs concerning palatal surgery at different ages in infants and discuss some of the logistical and statistical problems associated with the design and conduct of such trials. For the purpose of this article, we focus on speech outcomes with respect to velopharyngeal competence (VPC). Finally, we propose a format for an international multicentre trial of a very flexible design to investigate the appropriate timing of surgery in this context.

Method

A 2017 review by Hardwicke and colleagues³ listed all the RCTs conducted in cleft lip and palate over a 10-year period from 1 January 2004 to 31 December 2013 in English using the Cochrane Central Register of Controlled Trials, MEDLINE® and EMBASE with key words ‘cleft lip’ or ‘cleft palate’. From this review we were able to identify two RCTs that compared different ages at the time of surgery.⁴,⁶ A literature search from 1 January 2014 to 29 February 2020 identified four further RCTs, two of which compared surgical timings.⁷,⁸ A report on surgical timings by Shaffer and colleagues⁹ described a retrospective (hence non-randomised) study of the experience from a single craniofacial clinic and was excluded. A survey by Slator and colleagues of 18 cleft centres in the United Kingdom concluded ‘there remains considerable variation in both the sequence and timing of surgical repair of cleft lip and palate in infancy’ and was also excluded.¹⁰ See Figure 1.

Fig 1. PRISMA flowchart

Results

Clinical outcomes

Hardwicke and colleagues’ review³ identified RCTs comparing times of palatal surgery conducted by Ysunza and colleagues⁶ and Williams and colleagues.⁴ Since that review, the Scandcleft Consortium¹¹ has reported on three surgical trials (Scandcleft 1, 2 and 3) conducted in parallel in 163, 162 and 154 infants respectively. Only Scandcleft 1, reported by Willadsen and colleagues,¹² compared palatal surgery at different ages. In 2019 Yeow and colleagues⁸ compared timings of palatal surgery in 76 infants with isolated cleft palate. Table 1 gives the details of these four RCTs, which investigated surgery-related outcomes following different timings of palatal repair, together with the corresponding velopharyngeal insufficiency (VPI) rates quoted at a variety of ages.

Table 1. Summary of VPI rates from published RCTs comparing palatal surgery conducted at different ages

Lip	Palate	Age at surgery (m)	Initially randomised (n)	VPI present	VPI absent	Total analysed	VPI present (%)
Williams and colleagues⁴ —speech assessed at age four or more years
Spina	von Langenbeck	9–12	51	14	37	51	27.5
Millard	von Langenbeck	9–12	52	15	37	52	28.8
Spina	von Langenbeck	15–18	46	12	34	46	26.1
Millard	von Langenbeck	15–18	54	18	36	54	33.3
Spina	Furlow	9–12	35	2	33	35	5.7
Millard	Furlow	9–12	43	7	36	43	16.3
Spina	Furlow	15–18	48	12	36	48	25.0
Millard	Furlow	15–18	47	11	36	47	23.4
Total			376	91	285	376	24.2
Ysunza and colleagues⁶ —speech assessed at age four years
	San Venero Roselli pharyngoplasty§	6	35	6	29	35	17.1
	San Venero Roselli pharyngoplasty§	12	41	8	33	41	19.5
Total			76	14	62	76	18.4
Willadsen and colleagues⁷ —speech assessed at age five years
	Gothenburg and Vomer flap§§	12	83	13	59	72	18.1
	Gothenburg and Vomer flap§§	36	80	16	55	71	22.5
Total			163	29	114	143	20.3
Yeow and colleagues⁸ —speech assessed at age three years
	VWK	6	18*	1	15	16	6.3
	VWK	12	20**	3	11	14	21.4
	2F-IVV	6	18*	2	14	16	12.5
	2F-IVV	12	20**	1	13	14	7.1
Total			76	7	53	60	11.7

2F-IVV=2-flap palatoplasty with intra-velar veloplasty; RCT=randomised controlled trial; VPI=velopharyngeal insufficiency; VWK=Veau-Wardill-Kilner type palatoplasty. §Details are given by Trigos and colleagues.¹³ §§Details are given by Rautio and colleagues.¹⁴ *One from each group randomised to timing only. **One randomised to type of palatal surgery only.

All patients in Ysunza and colleagues’ trial received San Venero Roselli (SVR) pharyngoplasty and all those in Willadsen and colleagues’ trial received Gothenburg and Vomer flap hard palate closure. In contrast, Williams and colleagues’ trial compared von Langenbeck and Furlow palatoplasties, while Yeow and colleagues’ trial compared Veau-Wardill-Kilner (VWK) palatoplasty with two-flap (2F) palatoplasty in conjunction with intra-velar veloplasty (IVV), denoted 2F-IVV.

All trials compared ‘early’ and ‘late’ age at palatal surgery (which we classify later into four age categories as ‘very early’, ‘early’, ‘late’ and ‘very late’). Two trials compared outcomes following surgery at six months and 12 months,⁶,⁸ and one at 12 months versus 36 months,⁷ while the fourth considered two ranges of timings, nine to 12 months versus 15 to 18 months.⁴ Figure 2 shows the VPI rates by early and late surgery by type of palatal repair for the four RCTs. Apart from 2F-IVV in Yeow and colleagues’ trial,⁸ late repair was associated with higher rates of VPI.

Fig 2. VPI rates reported from the four published RCTs comparing ‘early’ and ‘late’ palatal repair

2F-IVV=2-flap palatoplasty with intra-velar veloplasty; RCT=randomised controlled trial; VPI=velopharyngeal insufficiency; VWK=Veau-Wardill-Kilner type palatoplasty.

The differences in VPI rates between the broad categories ‘early’ and ‘late’ timings, together with the associated 95 per cent confidence intervals (CI), are summarised in Table 2. However, the different ages at which VPC was assessed in the four trials prevent a reliable overall synthesis. Nevertheless, all but one (Furlow technique in Williams and colleagues’ trial)⁴ of the wide CIs in the final column of Table 2 include zero difference (no effect), so there remains much uncertainty with respect to the influence of surgical timing.

Table 2. Differences in VPI rates between types of palatal surgery conducted at ‘early’ and ‘late’ ages within each trial

RCT	Surgical technique	Age at speech assessment (y)	Early VPI (%)	Late VPI (%)	Late–Early (%)	95% CI (%)
Williams and colleagues⁴	Furlow	>4	11.5	24.2	12.7	+1.8 to +24.4
Williams and colleagues⁴	von Langenbeck	>4	28.2	30.0	1.8	-10.5 to +14.2
Ysunza and colleagues⁶	San Venero Roselli pharyngoplasty	4	17.1	19.5	2.4	-14.7 to +20.5
Willadsen and colleagues⁷ and Semb and colleagues¹¹	Gothenburg and Vomer flap	5	18.1	22.5	4.5	-8.7 to +17.7
Yeow and colleagues⁸	VWK	3	6.3	21.4	15.2	-11.5 to +41.3
Yeow and colleagues⁸	2F-IVV	3	12.5	7.1	-5.4	-31.3 to +18.9

2F-IVV=2-flap palatoplasty with intra-velar veloplasty; RCT=randomised controlled trial; VP=velopharyngeal; VPI=velopharyngeal insufficiency; VWK=Veau-Wardill-Kilner type palatoplasty.
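
The review does not state which method was used for the confidence intervals in Table 2. As a minimal sketch, the simple Wald interval for a difference between two independent proportions can be computed from the counts in Table 1. The function name `wald_ci_diff` is illustrative only. This sketch approximately reproduces the Willadsen and colleagues row, but it need not match every row, since other interval methods give slightly different limits in small samples.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci_diff(x_early, n_early, x_late, n_late, level=0.95):
    """Late-minus-early difference in VPI proportions with a Wald interval.

    Assumption: a plain Wald interval; the source does not name its method.
    """
    p_early = x_early / n_early
    p_late = x_late / n_late
    diff = p_late - p_early
    # Standard error of a difference between two independent proportions.
    se = sqrt(p_early * (1 - p_early) / n_early + p_late * (1 - p_late) / n_late)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return diff, diff - z * se, diff + z * se


# Counts from Table 1, Willadsen and colleagues: 13/72 (12 m) and 16/71 (36 m).
diff, lower, upper = wald_ci_diff(13, 72, 16, 71)
print(f"{100 * diff:.1f}% ({100 * lower:.1f}% to {100 * upper:.1f}%)")
```

With groups as small as those of Yeow and colleagues (14 to 16 infants per arm), the Wald approximation is known to be unreliable, which is one reason the limits in Table 2 should be read as indicative.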

Timing (age at palatal surgery)

As we have indicated, the four RCTs used different definitions for ‘early’ and ‘late’ palatal surgery. In practice, despite a specification of six months and 12 months in Yeow and colleagues’ trial,⁸ Figure 3 shows that the variation at actual age of surgery was quite considerable within each timing group. Thus, the median age of ‘very early’ surgery was close to six months (6.13 m; range 5.32–7.59 m), whereas for ‘late’ the median was 11.43 months (range 10.11–12.87 m) with 82 per cent of 33 infants receiving scheduled surgery before the protocol stipulation of age 12 months.

Fig 3. Age of infants (months) at the time of surgery by randomised allocation* (*unpublished data from Yeow and colleagues⁸)

2F-IVV06=two-flap palatoplasty with intra-velar veloplasty at six months; 2F-IVV12=two-flap palatoplasty with intra-velar veloplasty at 12 months; VWK06=Veau-Wardill-Kilner type palatoplasty at six months; VWK12=Veau-Wardill-Kilner type palatoplasty at 12 months.

Williams and colleagues state that ‘Palatal repairs were performed between the ninth and 30th month, with a mean age of 12.85 m (SD=3.3)’. The summary data suggest a skewed distribution towards the lower age at surgery, implying that relatively few children were actually operated at, or close to, 30 months. However, Ysunza and colleagues give no indication of departure from six months (very early) and 12 months (late) scheduling,⁶ and neither do Rautio and colleagues with respect to 12 months (late) and 36 months (very late) in Scandcleft.¹⁴ Figure 4 suggests that the VPI rate does not rise as the age at palatal surgery increases. We note again that the age when VPI assessments were made differs between the four RCTs concerned.

Fig 4. Reported VPI rates categorised into four age-at-palatal-surgery groups.

VPI=velopharyngeal insufficiency.

Randomisation

In a clinical trial the usual method is to allocate equal numbers of patients at random to the respective alternatives, ensuring balance in the patients recruited by the end of the trial. In general, equal numbers in each group are statistically the most efficient. Only Yeow and colleagues give details of their randomisation process and, although the trial closed prematurely, the numbers in the four groups are close to equal.⁸ This appears to be the case for the two groups in Rautio and colleagues’ trial,¹⁴ less so for Ysunza and colleagues’ trial⁶ and far from the case for Williams and colleagues’ trial.⁴ No explanation is provided for the large disparity (ranging from 35 to 54 in Table 1) between infant numbers in the eight groups of Williams and colleagues’ trial.⁴ As Rautio and colleagues¹⁴ used dice to generate the randomisation (and opening of sealed envelopes to reveal the allocation), the closeness of the numbers in each group (83 and 80) seems fortuitous.

Current standards would tend to proscribe the use of dice for generating the randomisation list and the use of sealed envelopes for implementation. As stated by Suresh, ‘it is better to use […] computer programming to do the randomization’¹⁵ particularly for large trials and those of a complex design. Additionally, some regulatory bodies overseeing RCTs insist that the randomisation sequence must be reproducible.
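
As an illustration of a computer-generated and reproducible randomisation list, the following is a minimal sketch of permuted-block randomisation with a fixed seed. None of the reviewed trials is stated to have used this scheme. The function name, the seed value and the block structure are assumptions made for the example, and the arm labels are borrowed from the Figure 3 legend.

```python
import random


def permuted_block_list(arms, n_blocks, block_multiple=1, seed=2020):
    """Build a reproducible allocation list from randomly permuted blocks.

    Assumption: equal allocation across arms; the seed value is arbitrary.
    """
    rng = random.Random(seed)  # fixed seed so the list can be regenerated for audit
    allocation = []
    for _ in range(n_blocks):
        block = list(arms) * block_multiple  # each arm appears equally often per block
        rng.shuffle(block)
        allocation.extend(block)
    return allocation


# Hypothetical four-arm timing-by-technique design with 76 infants (19 blocks of 4).
arms = ["VWK06", "VWK12", "2F-IVV06", "2F-IVV12"]
schedule = permuted_block_list(arms, n_blocks=19)
print(schedule[:8])
```

Because every block contains each arm once, group sizes can never differ by more than one block's worth at any point in recruitment, and rerunning the function with the same seed regenerates the identical list. In practice the list would be held centrally so that recruiting clinicians cannot foresee the next allocation.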

Although numbers are small, randomising the infants at six months of age to all four groups in Yeow and colleagues’ trial had one notable consequence. All 36 infants allocated to palatal surgery at six months received their surgery close to that time, whereas seven (17.5%) of the 40 allocated to the ‘late’ 12-month group withdrew from the trial before the scheduled surgery could take place. Furthermore, as Figure 3 indicates, palatal surgery was conducted as early as 10 months of age in this 12-month group. It cannot be deduced whether comparable delays and losses occurred in the other three RCTs.

Trial size

The period covered by this review extends over 20 years, during which 782 infants were recruited to trials addressing the role of age at palatal surgery with respect to VPC at a later age. Of these patients, results from 655 (84%) have been reported. Nevertheless, it appears that no firm inferences can be drawn from the resulting data, so a truly evidence-based conclusion remains elusive. If we take the results from the four RCTs considered for VPI with ‘early’ as opposed to ‘late’ palatal surgery, Table 3 gives the corresponding sample sizes required for a randomised parallel group trial if these findings were to be anticipated in future trials.

Table 3. Sample sizes required, assuming two-sided test size 5 per cent and power 80 per cent for confirmatory trials of two palatal surgery timings using the outcomes from the four RCTs as planning values

RCT	Actual trial recruitment (VPI assessed)	Early VPI (%)	Late VPI (%)	Observed effect size, Late–Early (%)	Odds ratio (OR)	Confirmatory trial size
Ysunza and colleagues⁶	76 (76)	17.1	19.5	2.4	0.8515	8,200
Williams and colleagues⁴*	467 (376)	21.0	27.2	6.2	0.7115	1,500
Willadsen and colleagues⁷ and Semb and colleagues¹¹	163 (143)	18.1	22.5	4.4	0.7612	2,600
Yeow and colleagues⁸*	76 (60)	9.4	14.3	4.9	0.6218	1,400

RCT=randomised controlled trial; VPI=velopharyngeal insufficiency. *VPI calculated from both surgical techniques of Table 2 combined.

Critically, the size of trial depends on the anticipated effect size (the larger the effect size, the smaller the trial) and on the prevalence of VPI (the binary variable).¹⁶ Although the effect size from Yeow and colleagues’ trial⁸ is smaller than that of Williams and colleagues’ trial,⁴ the possible confirmatory trial is smaller as the VPI rate for ‘early’ is lower (9.4% as compared to 21%). It is clear from Table 3 that the trial sizes are too large to permit the possibility of any of these confirmatory trials being conducted.
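
The dependence on both effect size and prevalence can be made explicit with a standard approximation for comparing two proportions. The review does not state which formula from reference 16 was used, so the expression below is an assumption offered as a sketch. Here \pi_E and \pi_L are the anticipated VPI proportions for early and late surgery, \bar{\pi} is their mean, \delta is their difference and m is the number of infants per group.

```latex
m = \frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{\pi}(1-\bar{\pi})}
        + z_{1-\beta}\sqrt{\pi_E(1-\pi_E) + \pi_L(1-\pi_L)} \right]^{2}}{\delta^{2}},
\qquad \delta = \pi_L - \pi_E, \quad \bar{\pi} = \tfrac{1}{2}(\pi_E + \pi_L)

% Two-sided alpha = 0.05 and power 0.80: z_{0.975} = 1.960, z_{0.80} = 0.842
% Yeow planning values:     \pi_E = 0.094, \pi_L = 0.143  =>  m \approx 682,  2m \approx 1360
% Williams planning values: \pi_E = 0.210, \pi_L = 0.272  =>  m \approx 746,  2m \approx 1490
```

Under this approximation the variance terms \pi(1-\pi) are much smaller at the lower VPI rates of Yeow and colleagues, which more than offsets their smaller \delta. The totals agree with the rounded entries of Table 3 to within a few per cent.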

However, Williams and colleagues’ trial⁴ indicates that VPC is assessed on an 11 point scale graded from 0 to 10 with scores ranging from three to 10 interpreted as being indicative of VPI. Alternatively, Lohmander and colleagues¹⁷ have suggested a seven point categorical scale which they subsequently collapsed to a three point classification in their Table 4. In general, if a trial endpoint is defined as an ordered categorical variable, the corresponding sample sizes tend to be smaller than if a binary endpoint is concerned.

Table 4. Sample sizes required, assuming two-sided test size 5 per cent and power 80 per cent for possible trial of two infant age-at-palatal surgery options using VPI measures on a binary (A) and a categorical three point (B) scale

A: Binary (two point) scale
Surgery	Late	Early	Early	Early
Planning: (Late–early)		0.05	0.075	0.10
Planning: odds ratio (OR)		0.75	0.64	0.53
VPI proportion
Incompetent	0.25	0.2	0.175	0.15
Marginal/competent	0.75	0.8	0.825	0.85
Trial size		2188	986	500
B: Categorical (three point) scale
Surgery	Late	Early	Early	Early
Planning: odds ratio (OR)		0.75	0.64	0.53
VPI proportion
Incompetent	0.25	0.200	0.175	0.150
Marginal	0.35	0.329	0.314	0.293
Competent	0.40	0.471	0.510	0.557
Trial size		1314	552	276

VPI=velopharyngeal insufficiency

Lohmander and colleagues¹⁷ reported the results of VPC rates categorised on a three point scale as ‘incompetent’ (VPI) (23.9%), ‘marginally incompetent’ (34.8%), and ‘competent’ (41.3%) in 339 five-year-old children with repaired cleft palate. We assume an RCT is planned with the aim of reducing VPI levels in infants who have ‘late’ surgery (25%) to a lower rate with early surgery. Then, with a binary endpoint with two-sided test size 5 per cent and power 80 per cent for different (reduced) rates for early (20.0%, 17.5% and 15.0% VPI), the corresponding sample sizes are given in Table 4. These calculations suggest the RCT will range in size from 500 to more than 2000 patients depending on the planning assumption made.
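
The binary calculations can be sketched in a few lines. The formula below is a commonly used approximation for two proportions and is an assumption, since the review cites reference 16 without stating the expression used. The function name `binary_trial_size` is illustrative. Under this assumption the sketch returns 2188 and 500 for early VPI rates of 20% and 15%, as in Table 4A. For 17.5% it returns 932 rather than the tabulated 986, so a slightly different planning value or method may have been used for that column.

```python
from math import ceil, sqrt
from statistics import NormalDist


def binary_trial_size(p_late, p_early, alpha=0.05, power=0.80):
    """Total size (two equal groups) to detect p_late versus p_early.

    Assumption: pooled-variance normal approximation, 1:1 allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_late + p_early) / 2
    # Pooled variance under the null, separate variances under the alternative.
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p_late * (1 - p_late) + p_early * (1 - p_early))
    per_group = ceil((null_term + alt_term) ** 2 / (p_late - p_early) ** 2)
    return 2 * per_group


# Late-surgery VPI rate of 25% against the three early-surgery planning values.
for p_early in (0.20, 0.175, 0.15):
    print(p_early, binary_trial_size(0.25, p_early))
```

Halving the anticipated difference from 10 to 5 percentage points more than quadruples the required trial size, which is why the planning assumption dominates the feasibility of any such trial.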

However, using the sample size methods¹⁶ concerning a three point categorical scale variable, Table 4 shows for VPI that the proportions of ‘incompetent’, ‘marginal’ and ‘competent’ (25%, 35% and 40% respectively) for ‘late’ surgery, with an assumed planning OR=0.75, would potentially improve (to 20%, 33% and 47%) with ‘early’. What is more, the required sample size is roughly 55 to 60 per cent of that for the corresponding binary endpoint (for example, 1314 rather than 2188 for OR=0.75). This implies that regarding VPC as a categorical rather than a binary variable would reduce the size of any future trial considerably.
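
The categorical calculation can be sketched as follows. The review does not name the formula behind Table 4B. The sketch assumes Whitehead's sample size formula for ordered categorical data under proportional odds, which reproduces the tabulated sizes of 1314, 552 and 276. The function names are illustrative only.

```python
from math import ceil, log
from statistics import NormalDist


def shift_by_odds_ratio(late_probs, odds_ratio):
    """Early-surgery category probabilities under a proportional-odds shift."""
    early, cumulative, previous = [], 0.0, 0.0
    for p in late_probs[:-1]:
        cumulative += p
        # Apply the same odds ratio to every cumulative split of the scale.
        odds = odds_ratio * cumulative / (1 - cumulative)
        shifted = odds / (1 + odds)
        early.append(shifted - previous)
        previous = shifted
    early.append(1 - previous)
    return early


def ordinal_trial_size(late_probs, odds_ratio, alpha=0.05, power=0.80):
    """Total size (two equal groups), assuming Whitehead's formula."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    early_probs = shift_by_odds_ratio(late_probs, odds_ratio)
    # Sum of cubed category proportions averaged over the two groups.
    mean_cubed = sum(((a + b) / 2) ** 3 for a, b in zip(late_probs, early_probs))
    per_group = ceil(6 * z**2 / (log(odds_ratio) ** 2 * (1 - mean_cubed)))
    return 2 * per_group


late = (0.25, 0.35, 0.40)  # incompetent, marginal, competent with late surgery
for odds_ratio in (0.75, 0.64, 0.53):
    early = [round(p, 3) for p in shift_by_odds_ratio(late, odds_ratio)]
    print(odds_ratio, early, ordinal_trial_size(late, odds_ratio))
```

The gain over the binary endpoint comes from the term involving the cubed category proportions: the more evenly the infants are spread across the categories, the smaller that sum and the smaller the trial. The saving relies on the proportional-odds assumption holding reasonably well across the scale.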

Proposed structure of a collaborative RCT with a pragmatic design followed by a prospective individual patient data meta-analysis

On the basis that the issue of the timing of infant age at palatal surgery remains an open question, it is clear that further trials are required to answer this question, although it is important not to underestimate the challenges that this presents. Indeed, the International Confederation for Cleft Lip and Palate and Related Craniofacial Anomalies Task Force recommended ‘that a prospective international controlled trial is conducted’.¹⁸

In the knowledge that there are many individual centres and collaborative groups capable of recruiting substantial numbers of infants with cleft palate/lip anomalies, we propose a pragmatic way forward.

Rather than a single trial, we propose that groups capable of recruiting, for example, 50 eligible infants within a time frame of two years each conduct their own RCT, with a view to a future international collaboration to organise a prospective individual patient data meta-analysis of these many trials. The possible framework of such a collaboration is summarised in Table 5, which allows individual centres to make specific choices concerning the eligibility of the infants to be included and the surgical options, but with some provisos imposed by the overall design such as an agreed endpoint and how it is to be assessed.

Table 5. Proposed structure of a pragmatic trial to compare surgical timings with VPC as the primary concern and assessed by means of an ordered categorical variable

Surgical timing–early versus delayed
To avoid some of the problems associated with the ‘late’ classification, rather than having two fixed surgical time options, the proposal is to randomise to ‘early’ or ‘delayed’ surgery. Early is defined as current practice within the centre concerned. The timing of delayed surgery is left to the responsible clinical team (including the parents) to decide, but the delay should be as long as possible after the early timing and not beyond (for example) one year of age.
Eligibility
Each group is to make its own choice but it is likely to include the ranges of non-syndromic cleft infants such as those covered by the four trials considered in this review. If lip repair is also required, early palate surgery would be conducted as soon as is practical after lip surgery.
Surgical techniques
Each group is to make its own choice of the surgical technique(s) to use. This could be, for example, a randomised option between two surgical approaches.
Clinical endpoint
VPC assessed using a standard approach (such as by Lohmander and colleagues¹⁷) but recording and reporting of the individual variable scores rather than merely the transfer values suggested.
VPC to be assessed as close as possible (± four weeks) to the birthdays at three and five years.
Minimal data to be recorded and retained by the group
Design features	On-study	Follow-up
Centre name and contact	Infant birthdate	VPC at age three years
Surgical option(s)	Cleft lip (if relevant)	VPC at age five years
General eligibility	Date of lip surgery (if relevant)
	Cleft palate
	Date of randomisation
	Surgical timing (early or delayed)
	Date of palatal surgery
Data exchange
To monitor overall progress of the multi-group trial, groups would send their anonymised data annually (at a fixed date) to be checked for completeness by the coordinating centre, which would need to be identified.
Meta-analysis
Although the minimum duration of recruitment to this trial may be set at six years, interim analysis of VPC at age three years could be published after (for example) five years from commencement of the international collaboration, provided sufficient data have been accumulated.

VPC=velopharyngeal competence

The proposal envisages that the ongoing timing of primary surgery (TOPS) for cleft palate trial by Shaw and colleagues would eventually form part of the prospective meta-analysis.¹⁹ Their trial relates to non-syndromic isolated cleft palate participants who have received the Sommerlad surgical technique either at six or 12 months.²⁰ The main outcome variable is VPC at five years but also includes an assessment at three years. Their trial is closed for recruitment with the final follow-up assessments due in July 2020.

Although this article has been written in the context of the unresolved question with regard to surgical timings, the general structure of the proposal allows for a similar approach to be adapted to accommodate other unanswered aspects of cleft management which require RCTs to be conducted. As Bekisz and colleagues conclude, following their review of RCTs in cleft and craniofacial surgery, ‘Our community should consider methods by which more RCTs can be performed’.²¹

Conclusion

The objective of this article was to review the available evidence from RCTs concerning the age at which palatal repair is best conducted in infants. Our review suggests no firm conclusions can yet be drawn with respect to the rates of later VPC. As a consequence, we outline the structure of a pragmatic RCT as a basis for further investigation of the optimal age at surgery (or other relevant research questions).

Conflict of interest

The authors have no conflicts of interest to disclose.

Financial declaration

The authors received no financial support for the research, authorship, and/or publication of this article.

Revised: 23 May 2022