1
Electrochemical multi-tagging of cysteinyl peptides during microspray mass spectrometry: numerical simulation of consecutive reactions in a microchannel

2
On-line electrogeneration of mass tags in a microspray emitter is used to quantify the number of cysteine groups in a given peptide.

3
A finite-element simulation of the multi-step process yields the relative distribution and concentration of tags, untagged and tagged species in the microchannel before the spray event.

4
The work focuses on the tagging of cysteine moieties in peptides or proteins by electrogenerated quinone mass probes.

5
The main chemical parameters determining the kinetics of the labelling are assessed and discussed considering the microfluidic aspects of the process.

6
The control of the tagging extent allows the simultaneous MS analysis of both the unmodified and modified peptide(s).

7
The number of cysteine groups corresponds to the number of characteristic mass shifts observed from the unmodified peptide.

8
The present theoretical work establishes the range of optimum conditions for the determination of the number of cysteine groups in peptides containing up to five cysteine groups.

Introduction

9
In proteomics, the identities of most proteins represented in sequence databases can be determined by correlating mass spectrometric data with databases.1

10
To narrow down possible matching candidates, specific searching constraints are needed.

11
The mass mapping or fingerprinting of peptides derived from proteolytic digestion of a protein provides the basic constraint.2–7

12
However, a sufficient coverage of the protein (i.e. the determination of a sufficient number of proteolytic peptides from a particular protein) is required to unambiguously identify a protein.

13
Tandem mass spectrometric (MS/MS) analysis of peptides in mixtures is the most common and restrictive constraint used in addition to mass mapping.8,9

14
In most cases a collision-induced dissociation (CID) spectrum from a single peptide through electrospray (ESI) MS/MS is then sufficient to conclusively identify a protein.

15
In this technique, peptide ions are sequentially selected for MS/MS from a mixture.10,11

16
However, in the case of a complex mixture, the generation of CID spectra for all of the components fails because of time limitations.

17
Automated analysis is thus routinely programmed to give priority to peptides having the highest ion current.

18
Therefore, when the mixture is complex, many detected ions with low intensities are not fragmented, thus reducing the dynamic range of the method.

19
Methods that improve the identification procedure are therefore of great value.

20
High accuracy in the mass determinations is one way of constraining the database search.12,13

21
The molecular weight of the protein, the isoelectric point of the protein or tryptic peptides,14 and the presence of rare amino acids such as cysteine, methionine, or tryptophan in the peptide sequence provide powerful information to enhance the matching.

22
Counting the cysteine groups in a peptide using the isotopic signature of a specific tag for thiol12,15 or the differential analysis of unlabelled and labelled cysteinyl samples16 can be used to significantly improve protein identification by database searching.

23
Recently, the electrochemically induced tagging of peptides by probes electrogenerated at the microband electrode of a microspray emitter17 has enabled the on-line counting of cysteine units in peptides18 to improve the identification of model proteins.19

24
The tagging reaction happens in the microspray emitter which behaves as an electrolytic-flow cell reactor and is quasi-instantaneous.

25
As a post-column treatment, the technique would provide powerful information on cysteine content, notably for low intensity peptides that would not be selected for MS/MS.

26
Here, the cysteine content determination relies on the control of the extent of the reaction to ensure that a minimum amount of the unmodified peptide together with a minimum amount of the fully tagged peptide is produced.

27
The number of cysteine units is determined by mass spectrometry as the number of mass shift(s) from the unmodified peptide.

28
Because the starting peptide should not be totally converted to guarantee the success of the analysis, understanding the kinetics of the flow reaction is relevant to control the process and to generalize the technique.

29
To numerically simulate the on-line electrochemical tagging of peptides, multi-stage chemical reactions should be considered in a fluid flow.20,21

30
Validated by comparison to previous works,22,23 a finite-element convection–diffusion reaction model for an electrochemical (EC) mechanism24 has been further developed for the consecutive tagging by markers electrogenerated at the bottom of a microchannel (2 cm in length).

31
Numerical simulations are carried out for the addition of one to five tags (multi-step consecutive reactions) to determine the major chemical and kinetic parameters involved in the microfluidic process.

32
The present theoretical work establishes the range of optimal conditions to achieve the counting of cysteine units in peptides, in order to apply the counting/identification technique to complex protein mixtures.

Results and discussion

Microspray characteristics

33
The MS experimental set-up comprised the emitter shown in Fig. 1, which behaves as an electrolytic-flow cell reactor.

34
The analyte is mixed with the electro-active probe prior to pressure-driven infusion through a capillary tube to the microspray emitter.

35
The probes studied are 1,4-hydroquinone and methoxycarbonyl-1,4-hydroquinone (HQ in general).

36
The overall tagging process can proceed via an electrochemical–chemical–electrochemical (ECE) mechanism (Fig. 2) where:

37
(i) The first electrochemical reaction is the oxidation of HQ to 1,4-benzoquinone or methoxycarbonyl-1,4-benzoquinone (BQ in general).

38
(ii) The chemical reaction is the 1,4-Michael addition of the thiol functional group of the cysteine of a protein or peptide (P) on the BQ ring that yields the products (PQi), where i represents the number of cysteine groups being tagged.

39
(iii) The final electrochemical reaction involves the oxidation of the adduct, but under the present flow conditions this second electrochemical oxidation has no time to occur.

40
It implies that only one thiol addition on the BQ core can happen.

41
Therefore, a simple EC mechanism will be considered in the present study.

Numerical model

Model

42
The present model is developed for species containing up to five cysteine groups.

43
It addresses the convection–diffusion reaction of the eight species i considered here (HQ, BQ, P, PQ1–5 for the hydroquinone, the benzoquinone, the protein or peptide, and the five possible successive degrees of the tagged protein or peptide, respectively).

44
This transient model (eqn (1)) is applied in a steady state regime to a 2D cross section of the geometry (see section 2.2 and Fig. 1):

$$\frac{\partial c_i}{\partial t} + \nabla \cdot \left(-D_i \nabla c_i + \mathbf{v}\, c_i\right) = R_i \qquad (1)$$

where $c_i$ is the concentration of the species i, $D_i$ its diffusion coefficient, $\mathbf{v}$ is the fluid velocity vector and $R_i$ is the rate of generation or consumption of i (Fig. 2).

45
The ∇ symbol is used to simplify the notation.

46
In the Appendix, the global forms of these local equations are described for each species, using the Galerkin formulation (finite-element method).

47
The tagging conditions and numerical parameters are given in Table 1 and in the computational section (Flux-Expert® software).

Assumptions

48
(i) The electrochemical oxidation of HQ is assumed to be limited by diffusion (fast electrode reaction) and HQ is assumed to be the only species oxidised at the electrode.

49
(ii) The solution is assumed to be sufficiently dilute and isothermal so that the viscosity and the density of the fluid can be considered to be unmodified by concentration or temperature variations.

50
The diffusion coefficients of the species are also treated as uniform in the entire study domain.

51
(iii) The channel walls are considered to be smooth and any migration effects due to the applied potential are neglected.

52
(iv) The width d of the channel is assumed to be much larger than its height 2h so that the velocity gradient in the third dimension can be neglected (2D Cartesian assumption to overcome numerical limitations).

53
(v) The fluid is assumed to be Newtonian and its velocity is described according to a Poiseuille profile (laminar flow conditions, Re = 0.035).

54
(vi) The reactivity of cysteine is taken to be equal at every site of the biomolecule (equal to that of the cysteine amino acid) and any side reactions are neglected.

55
(vii) The numerical simulations of the tagging reaction are considered only in the channel, thereby neglecting the reactions in the Taylor cone or in the ESI plume.

56
Previous experimental MS mono-tagging of the synthetic peptide AIKCTK carried out with microchannel emitters of variable channel length has clearly shown that the channel contribution is predominant for a channel length of 2 cm with the present flow rate.25

57
(viii) The flow rate of the fluid is a fixed parameter.

58
The simulations correspond to a flow rate of 250 nL min−1 (i.e. a mean flow velocity = 4 × 10−3 m s−1 before scaling).
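The flow figures quoted in assumptions (v) and (viii) can be cross-checked with a short calculation. The sketch below is only a plug-flow estimate, not the finite-element model: it uses the flow rate, mean velocity and 2 cm channel length given in the text, and a parallel-plate Poiseuille profile, which is an assumption consistent with the 2D Cartesian simplification. The function and variable names are illustrative.

```python
# Plug-flow estimates from values quoted in the text (illustrative names).
FLOW_RATE_NL_MIN = 250.0   # nL/min, from the text
MEAN_VELOCITY = 4e-3       # m/s, from the text (before scaling)
CHANNEL_LENGTH = 2e-2      # m, from the text

# 1 nL = 1e-12 m^3, so convert nL/min to m^3/s.
flow_rate_m3_s = FLOW_RATE_NL_MIN * 1e-12 / 60.0

# Cross-section implied by the quoted flow rate and mean velocity.
implied_cross_section_m2 = flow_rate_m3_s / MEAN_VELOCITY

# Mean residence time of the fluid in the channel.
residence_time_s = CHANNEL_LENGTH / MEAN_VELOCITY


def poiseuille_velocity(y, h, v_mean):
    """Axial velocity between parallel plates located at y = 0 and y = 2h.

    Assumed parallel-plate profile: zero at the walls, 1.5 * v_mean at y = h.
    """
    return 1.5 * v_mean * (1.0 - ((y - h) / h) ** 2)


if __name__ == "__main__":
    print(f"implied cross-section: {implied_cross_section_m2 * 1e12:.0f} um^2")
    print(f"residence time: {residence_time_s:.2f} s")
```

The residence time obtained this way is the same 5 s that the text later associates with the 2 cm channel length.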

Kinetics of multi-tagging

59
The parameters playing a key role in the tagging final efficiency are investigated using the finite-element model (see Computational methods).

60
The multi-tagging process in a laminar flow is evaluated in terms of tagging extent TE (= (Σn [PQn])/[P]0 = ([P]0 – [P])/[P]0, i.e. the consumption of the protein or peptide P).

61
The addition rate constants are assumed to be equal to those corresponding to the addition of l-cysteine on 1,4-benzoquinone and methoxycarbonyl-1,4-benzoquinone (210 and 5000 M−1 s−1, respectively, in methanol–water–acetic acid 50 : 49 : 1).26,27

62
For simplification, the diffusion coefficient of all the quinone probes is taken to be equal to 3.5 × 10−10 m2 s−1, which in fact corresponds to the diffusion coefficient of methoxycarbonyl-1,4-hydroquinone in methanol–water–acetic acid 50 : 49 : 1.28

63
The mean diffusion coefficient of the target biomolecule P was chosen to be 1 × 10−10 m2 s−1.29

64
As the target biomolecule concentration is uniform at the inlet, its diffusion coefficient does not affect the final adduct amount (the diffusion coefficient of BQ plays an important role as the BQ generated at the electrode diffuses from the bottom of the channel and along the flow).24

65
The initial species concentrations are taken in accordance with previous experimental works that showed valuable analyses of proteins and peptides (Table 1).19,26

Mono-tagging

66
Simulations were first run for a species containing a single thiol group.

67
In Fig. 3, a comparison of species distributions along the channel in its central portion (y = h) is proposed for the two probes considered for a single-cysteine target.

68
The formation of adduct PQ1 becomes efficient with methoxycarbonyl-1,4-hydroquinone (k = 5000 M−1 s−1), inducing subsequent consumption of both the biomolecule and the electrogenerated BQ.

Multi-tagging

69
The same kinetic comparison was made for a biomolecule with five cysteine units (each cysteine site is considered to have the same labelling rate constant).

70
For k = 210 M−1 s−1 (Fig. 4a), the first adduct PQ1 is the only species produced in a reasonable amount.

71
The production is higher than in the case of a single-cysteine target (15.3 μM instead of 5.0 μM at the outlet of the channel) since the target here possesses five cysteine groups.

72
In fact, the probability for the probe to react with a cysteine in a target possessing five cysteine groups is five-fold higher than the probability to react with cysteine in a single-cysteine target.

73
The first step of the consecutive tagging reactions presents an apparent rate constant k1 = 5k since the rate law is here formulated as a function of the biomolecule concentration [P] (Fig. 2).

74
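Therefore, the rate of the first step is multiplied by five relative to the single-cysteine case:

$$R_{\mathrm{P}} = -k_1 [\mathrm{P}][\mathrm{BQ}] = -5k\,[\mathrm{P}][\mathrm{BQ}]$$

Part of the first adduct PQ1 reacts with BQ to give the successive adducts, the production of which is limited to PQ2 for the present k value.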

75
For k = 5000 M−1 s−1 (Fig. 4b), almost all of the BQ generated at the electrode is consumed at the end of the channel.

76
The production of all adducts is enhanced, apart from the first, whose concentration decreases beyond a distance of 5 mm from the electrode (i.e. 1.25 s of reaction) to feed the following additions.

77
The second adduct PQ2 is the most favoured species at the end of the channel (15 μM).

78
The fifth adduct PQ5 is also observable in small amounts.

79
Simulations with a rate constant of k = 20 000 M−1 s−1 were performed as shown in Fig. 4c.

80
The reaction is quite fast and the thermodynamic equilibrium is reached in about 2.5 s (i.e. 1 cm from the electrode).

81
The electrogenerated BQ, which is correlated to the HQ initial concentration, is in deficit with respect to cysteine units.

82
It implies that the species P, PQ1, PQ2, PQ3, PQ4 and PQ5 remain unchanged for x > 1 cm (no more BQ to react).

83
This result is not only the effect of the addition rate constant but also a consequence of the multi-tagging that amplifies this tendency.

84
Fig. 5 reports the influence of the HQ initial concentration on the TE (= (Σn [PQn])/[P]0 = ([P]0 – [P])/[P]0) of a species containing five cysteine units.

85
When the BQ is in excess ([HQ]0 ≥ 6 mM), a total conversion of P is obtained for k values above 2500 M−1 s−1.

86
On the other hand, for initial HQ concentrations of 1 and 2.5 mM, the electrochemically-produced BQ is in deficit with respect to the cysteine moieties and, for high k values, BQ is found to be totally consumed before the end of the channel when the protein is still present (see Fig. 4c).

87
The fast kinetics limit the consumption of P because the subsequent tagging steps rapidly consume the available BQ.
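The consecutive scheme discussed in this subsection can be illustrated with a much simpler well-mixed (batch) calculation. The sketch below is not the finite-element convection–diffusion model of the paper: it assumes a uniform initial BQ concentration, ignores flow and diffusion, and gives each step the statistical rate constant (N − n)k for a species that already carries n tags, so that the first step has k1 = Nk. The concentrations P0 and BQ0 are hypothetical values chosen only for illustration, and all function names are assumptions.

```python
def simulate_batch_tagging(n_cys, k, p0, bq0, t_end, dt=1e-3):
    """Return [P, PQ1, ..., PQn, BQ] (in M) after t_end seconds.

    Well-mixed approximation integrated with a classical RK4 scheme.
    k is the per-cysteine addition rate constant in M^-1 s^-1.
    """

    def rhs(conc):
        bq = conc[-1]
        deriv = [0.0] * len(conc)
        for n in range(n_cys):
            # A species carrying n tags has (n_cys - n) free cysteines left.
            rate = (n_cys - n) * k * conc[n] * bq
            deriv[n] -= rate       # PQn consumed (n = 0 is P)
            deriv[n + 1] += rate   # PQ(n+1) produced
            deriv[-1] -= rate      # one BQ consumed per added tag
        return deriv

    def shifted(conc, slope, step):
        return [c + step * s for c, s in zip(conc, slope)]

    conc = [p0] + [0.0] * n_cys + [bq0]
    for _ in range(int(round(t_end / dt))):
        s1 = rhs(conc)
        s2 = rhs(shifted(conc, s1, 0.5 * dt))
        s3 = rhs(shifted(conc, s2, 0.5 * dt))
        s4 = rhs(shifted(conc, s3, dt))
        conc = [c + dt / 6.0 * (a + 2.0 * b + 2.0 * d + e)
                for c, a, b, d, e in zip(conc, s1, s2, s3, s4)]
    return conc


def tagging_extent(conc, p0):
    """TE = ([P]0 - [P]) / [P]0, as defined in the text."""
    return (p0 - conc[0]) / p0


# Hypothetical concentrations (not taken from Table 1).
P0 = 50e-6    # M
BQ0 = 100e-6  # M, assumed uniform; in the paper BQ is generated at the electrode

final_five = simulate_batch_tagging(5, 5000.0, P0, BQ0, 5.0)
final_one = simulate_batch_tagging(1, 5000.0, P0, BQ0, 5.0)

if __name__ == "__main__":
    labels = ["P"] + [f"PQ{n}" for n in range(1, 6)] + ["BQ"]
    for name, value in zip(labels, final_five):
        print(f"{name}: {value * 1e6:.2f} uM")
    print("TE (five cysteines):", round(tagging_extent(final_five, P0), 3))
    print("TE (one cysteine):", round(tagging_extent(final_one, P0), 3))
```

Because BQ is supplied uniformly here rather than diffusing from the electrode, the numbers are not comparable with Fig. 4; the sketch only shows how the statistical factors and the BQ deficit shape the adduct distribution.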

Optimization of the multi-tagging of peptides

88
The initial concentration of HQ plays a key role in the final tagging degree since it controls the production of the BQ markers (in the present assumption of diffusion control of the current).

89
In Fig. 6a and b, the evolution of the species at the end of the channel is given according to [HQ]0 for biomolecules containing three and five cysteine groups, respectively.

90
As expected, the higher the [HQ]0, the higher the consumption of P and the production of the completely tagged species (i.e. PQ3 and PQ5 for peptides containing three and five cysteine groups, respectively).

91
In the previous section, a target with five cysteine groups was used for emphasizing kinetics and a better understanding of the multi-stage process.

92
A peptide with three cysteine groups serves as a reference since such peptides are more probable in proteomic analysis.

93
To enable the MS cysteine counting in peptides in future experiments, we have decided to impose the following criteria for the simulation: both the proportion of P (= 100 × [P]/[P]0) and of the completely tagged species PQn (= 100 × [PQn]/[P]0) should be above 10% in order to be detectable by MS.

94
In Fig. 6, the corresponding working domains mark these conditions.

95
For a three cysteine peptide (Fig. 6a) the two conditions are compatible (1.755 mM < [HQ]0 < 2.275 mM), but the two working domains do not intersect for a five cysteine peptide (Fig. 6b).

96
The initial concentration of HQ drives the tagging rate and thereby the proportion of the species at the end of the channel.

97
When the [HQ]0 is too high, there is not enough P left and when it is too low, there is not enough PQn produced.
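The detectability criterion stated above can be written as a small check applied to outlet concentrations. This is a minimal sketch with hypothetical names; the only inputs taken from the text are the definitions of the two proportions and the 10% threshold.

```python
def in_working_domain(p, pq_full, p0, threshold_percent=10.0):
    """True if both P and the fully tagged species PQn exceed the threshold.

    p        : outlet concentration of the unmodified peptide P
    pq_full  : outlet concentration of the completely tagged species PQn
    p0       : inlet concentration of P (same units as p and pq_full)
    """
    proportion_p = 100.0 * p / p0
    proportion_full = 100.0 * pq_full / p0
    return proportion_p > threshold_percent and proportion_full > threshold_percent
```

Passing `threshold_percent=5.0` corresponds to the relaxed 5% criterion considered later for the five-stage reaction.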

98
As shown in Fig. 7, tagging reaction times t in the channel could thus be chosen to make the two conditions expressed above compatible.

99
The 10% working conditions are represented in Fig. 7a for the three cysteine peptide.

100
The domain shows that high concentrations of HQ imply working over small times to ensure that P is not totally consumed.

101
Nevertheless, a wide range of concentrations is then compatible with these short times: 0.6 s of reaction time allows working between 10 and 20 mM of HQ.

102
When working with a longer reaction time, the concentration possibilities are narrower but the analysis can be done at many times (for instance, the analysis is possible from 2 to more than 5 s for an initial concentration of 2 mM).

103
In other words, when the apparent kinetics become slower, the fixed conditions are satisfied for wider time ranges (the scale is widespread due to the second order kinetic law).

104
For the five-stage tagging (Fig. 7b), the conditions are almost never compatible except at high concentrations, and for very restrictive time ranges (for [HQ]0 = 25 mM, the reaction time must be fixed at 0.3 s).

105
When the criteria level is reduced to 5%, a much wider working domain appears for the five-stage reaction (Fig. 7c).

106
The technique appears applicable to peptides containing five or fewer cysteine groups, since the co-existence of P and PQn is no longer likely when more cysteine groups are present, as BQ is stored in the intermediate species.

107
For successful cysteine counting, the initial concentration of P is also a key feature to consider, all the more so as [P]0 is not really controlled by the experimenter.

108
Indeed, in proteomic analysis, the amounts of tryptic peptides derive from the protein concentrations, which are quite variable from one to another in a complex mixture.

109
In the simulated data of Fig. 8, for a given reaction time t = 5 s (i.e. Lch = 2 cm), concentration [P]0 varies from 0.1 to 75 μM.

110
The initial concentrations [HQ]0 that provide 10% of P and PQ3 according to each initial concentration [P]0 are reported.

111
Between the two curves, the hatched domain indicates that both conditions are satisfied.

112
The reference point (◆) shows that [HQ]0 = 2.5 mM, which was used in the previous section, provides good analysis of three-cysteine peptides from 55 to 100 μM.

113
Defining the ratio Δ[HQ]0/[HQ]0 for the 10% working domain, 0.51 and 0.31 are obtained for [P]0 = 0.1 and 75 μM, respectively, showing that the working interval slightly decreases with the concentration.

114
The nature of the curve proves that the ratio [HQ]0/[P]0 cannot be used as a term to predict the tagging extent in general.

115
This is further confirmed by several numerical studies at different concentrations while keeping the ratio [HQ]0/[P]0 constant (see supplementary information Fig. S1).

Computational methods

Numerical technique

116
The finite-element formulation was generated on the numerical software Flux-Expert® (Astek Rhône-Alpes, Grenoble, France), in a 2-D Cartesian form.

117
It was operated on a Dell PC with 2 GB RAM (Red Hat Linux).

118
The mesh size adopted for the channel was δ = 5 μm and it was reduced (i.e. to 0.4 μm) at the electrode extremities in order to take into account the edge effects.

119
The resulting mesh Péclet number $Pe = \bar{v}\,\delta/D$ (where $\bar{v}$ is the mean flow velocity) was found to be 50 and 14 for P and BQ, respectively.

120
The error induced by a 7 μm channel mesh (compared to a 4 μm channel) was checked and found to be 0.01 and 0.05% for BQ and P respectively at the end of the channel (see Fig. S2 in the ESI).

121
It confirms the mesh Péclet number limit of 100 determined in the literature.30
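The quoted mesh Péclet numbers can be reproduced with a one-line calculation. The sketch below assumes that the mesh Péclet number is defined as mean velocity × mesh size / diffusion coefficient and that the velocity used is the scaled mean velocity of 1 × 10−3 m s−1 given in the Scaling section; both are inferences from the quoted values rather than statements from the text.

```python
def mesh_peclet(mean_velocity, mesh_size, diffusion_coefficient):
    """Mesh Peclet number, assumed here to be v * delta / D (dimensionless)."""
    return mean_velocity * mesh_size / diffusion_coefficient


V_SCALED = 1e-3   # m/s, scaled mean velocity (assumed to be the one used)
DELTA = 5e-6      # m, channel mesh size from the text
D_P = 1e-10       # m^2/s, diffusion coefficient of P from the text
D_BQ = 3.5e-10    # m^2/s, diffusion coefficient of the quinone probes

pe_p = mesh_peclet(V_SCALED, DELTA, D_P)
pe_bq = mesh_peclet(V_SCALED, DELTA, D_BQ)

if __name__ == "__main__":
    print(f"Pe(P) = {pe_p:.1f}, Pe(BQ) = {pe_bq:.1f}")
```

Both values stay below the literature limit of 100 mentioned above.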

122
The total mesh number is 30 000, leading to 240 000 degrees of freedom for the eight unknowns in the five-tagging case.

123
The conservation of species flux (BQ and P) was verified.

Scaling

124
Because of the 2 cm channel experimental length, scaling was necessary to have an acceptable mesh Péclet number while keeping a meshing grid and matrix size that could be numerically treated.

125
A 0.5 cm long geometry was therefore used, but the channel height (2h) and the electrode length were kept the same (the electrode length remains quasi-negligible compared to the channel length, i.e. the ratio is 1 : 50 instead of 1 : 200 for the experimental conditions).

126
The flow rate was divided by four to ensure the same residence time in the channel and the same transversal diffusion time.

127
The incoming HQ concentration was adapted to have the same flux ratio between the incoming protein or peptide P and the BQ generated at the electrode.

128
With the Levich equation,31,32 one can express a proportionality relationship between the flux rate of produced BQ at the electrode, $N_{\mathrm{BQ}}$, and the flow rate $F_V = 2hd\,\bar{v}$ (where $\bar{v}$ is the mean flow velocity, and 2h and d are the height and width of the channel):

$$N_{\mathrm{BQ}} \propto [\mathrm{HQ}]_0\, F_V^{1/3} \qquad \text{(Relation 1)}$$

From Relation 1, if the flow rate is divided by four, the flux of BQ is divided by $4^{1/3} \approx 1.6$.

129
It has been checked by simulation that the ratio of the flux of BQ at the electrode for the normal case ($\bar{v}$ = 4 × 10−3 m s−1) over the scaling by 4 ($\bar{v}$ = 1 × 10−3 m s−1) is 1.589 (≈1.6).

130
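Moreover, the flux of the protein $N_{\mathrm{P}}$ coming into the channel is given by

$$N_{\mathrm{P}} = [\mathrm{P}]_0\, F_V$$

Whatever the flow rate scaling, the ratio of these fluxes must be conserved:

$$\frac{N_{\mathrm{BQ}}}{N_{\mathrm{P}}} \propto \frac{[\mathrm{HQ}]_0}{[\mathrm{P}]_0}\, F_V^{-2/3} = \text{constant}$$

To conserve the ratio $N_{\mathrm{BQ}}/N_{\mathrm{P}}$ if the flow rate is multiplied by 4, the initial concentration [HQ]0 must be multiplied by $4^{2/3}$.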

131
The mean concentration [BQ] remains unchanged by this operation as

$$[\mathrm{BQ}]_{\mathrm{mean}} = \frac{N_{\mathrm{BQ}}}{F_V} \propto [\mathrm{HQ}]_0\, F_V^{-2/3}$$

The scaling was fully validated by the numerical comparisons of the concentration of BQ and PQ3 along the channel for a scaling by 2 (channel length of 10 mm) and 4 (channel length of 5 mm), which appear in Figs. S3 and S4 in the ESI.

132
Numerical simulation studies were pursued with a scaling of 4.
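The scaling argument of this section reduces to a few power laws, which can be checked numerically. The sketch below assumes only the Levich-type proportionality between the BQ flux and [HQ]0 × (flow rate)^(1/3) and the protein flux [P]0 × flow rate; all quantities are expressed relative to the unscaled case, and the function names are illustrative.

```python
def relative_bq_flux(hq_factor, flow_factor):
    """BQ flux relative to the unscaled case: N_BQ ~ [HQ]0 * F_V**(1/3)."""
    return hq_factor * flow_factor ** (1.0 / 3.0)


def relative_p_flux(flow_factor):
    """Protein flux relative to the unscaled case: N_P = [P]0 * F_V."""
    return flow_factor


def hq_factor_conserving_flux_ratio(flow_factor):
    """Factor applied to [HQ]0 so that N_BQ / N_P is unchanged."""
    return flow_factor ** (2.0 / 3.0)


# Scaling used in the paper: flow rate divided by four.
FLOW_FACTOR = 0.25
bq_flux_drop = 1.0 / relative_bq_flux(1.0, FLOW_FACTOR)   # equals 4**(1/3)
hq_factor = hq_factor_conserving_flux_ratio(FLOW_FACTOR)

# With the adapted [HQ]0, both the flux ratio and the mean [BQ] are conserved.
flux_ratio = relative_bq_flux(hq_factor, FLOW_FACTOR) / relative_p_flux(FLOW_FACTOR)
mean_bq = relative_bq_flux(hq_factor, FLOW_FACTOR) / FLOW_FACTOR

if __name__ == "__main__":
    print(f"BQ flux divided by {bq_flux_drop:.3f} at fixed [HQ]0")
    print(f"[HQ]0 multiplied by {hq_factor:.3f} to conserve N_BQ/N_P")
```

The first printed factor can be compared with the value of 1.589 obtained by simulation in the text.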

Conclusions

133
The influence of the tagging rate constant as well as the impact of the probe and target concentrations have been simulated for single-step and five-step tagging reactions in a microchannel.

134
The finite-element model has shown how strongly the tagging efficiency derives from the number of consecutive tagging reactions (i.e. the number of cysteine moieties in the protein or peptide determining the apparent kinetics of each consecutive reaction).

135
The non-suppression of the unmodified molecule signal is essential to guarantee the success of the on-line MS counting of cysteine moieties.

136
On the other hand, the residence time and the probe concentration should be sufficient to achieve all the tagging degrees.

137
Numerical simulations have been used to determine the optimal conditions for this.

138
It has been shown that the cysteine counting is possible for up to five units.

139
When the molecule possesses more cysteine residues, the co-existence of both the untagged and fully tagged molecule is no longer likely, whatever the residence time.

140
To overcome this, a recognition signal could be introduced into the tag.

141
Some mass tags presenting characteristic isotopic patterns could thus be employed.