Dr. Eric Bechhoefer, Chief Engineer/CEO GPMS
Last week, a very bright engineering leader at a major helicopter OEM asked, “How can you determine the vibration tolerance of a component without the manufacturer’s data?” Put another way, how do we tell ‘good’ from ‘bad’ and set our thresholds? Because this topic has come up before, I wanted to post my reply. Be warned: this is a bit technical.
This question really has four answers because, like many things, it depends on circumstance. First, when the OEM provides a limit, such as on the main rotor, tail rotor, or short shaft, we use their guidance. We can put the vibration limit into our config and trigger alerts when measured levels exceed it.
But OEMs don’t typically have limits for most components in the aircraft. (Instead, the OEM establishes maintenance practices and TBO based on the spectrum of usage that they design the aircraft around. This is the process used for safe life design.) Therefore, we effectively need to determine those thresholds for ourselves.
Happily, in the field of condition monitoring, this is an old problem with fifty plus years of research to inform us. What follows is a discussion of how we at GPMS approach each of the dynamic components on an aircraft — shafts, bearings and gears. Woven into the narrative is our common approach across component types and aircraft. This can be summarized as follows:
1. We look at physical measurements, e.g., the size of a bearing, the number of teeth in a gear, etc. These can come from OEMs or through our own investigations;
2. Based on these physical measurements, we use established methods from the field of vibration-based condition monitoring to compute what nominal looks like;
3. As we get data, we then ask when the component is “no longer good,” i.e., when it has sufficiently departed from nominal to pass our thresholds. This last step is critical: we take a random sample of 50 data points from all CIs from as many separate aircraft of the same make and model as possible. Instead of simply using standard deviations from nominal, we take note of the actual distribution of the Condition Indicator (CI) data (Rayleigh vs. Gaussian) and set the thresholds using a probability of false alarm (PFA) of 10^-6.
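To make step 3 concrete, here is a minimal sketch of a distribution-aware threshold, assuming the CI is Rayleigh distributed when the component is nominal. The function names, the synthetic data, and the scale value are hypothetical illustrations, not the production GPMS implementation.

```python
import math
import random
from statistics import NormalDist

PFA = 1e-6  # probability of false alarm quoted in the text


def rayleigh_scale_mle(samples):
    """Maximum-likelihood estimate of the Rayleigh scale parameter sigma."""
    return math.sqrt(sum(x * x for x in samples) / (2 * len(samples)))


def rayleigh_threshold(sigma, pfa):
    """Value a Rayleigh(sigma) CI exceeds with probability `pfa`.

    Inverts the Rayleigh survival function exp(-x^2 / (2 sigma^2)) = pfa.
    """
    return sigma * math.sqrt(-2.0 * math.log(pfa))


def gaussian_threshold(samples, pfa):
    """Mean plus z standard deviations; only appropriate if the CI is Gaussian."""
    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (len(samples) - 1))
    return mean + NormalDist().inv_cdf(1.0 - pfa) * std


# Hypothetical nominal data: 50 CI values drawn from a Rayleigh distribution.
rng = random.Random(0)
true_sigma = 0.01
nominal = [true_sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
           for _ in range(50)]

sigma_hat = rayleigh_scale_mle(nominal)
threshold = rayleigh_threshold(sigma_hat, PFA)
naive = gaussian_threshold(nominal, PFA)
print(f"Rayleigh threshold: {threshold:.4f}, Gaussian-style threshold: {naive:.4f}")
```

For an exactly Rayleigh CI, the Rayleigh quantile at a PFA of 10^-6 sits at about 5.26 times the scale parameter, while the "mean plus z standard deviations" rule lands near 4.4 times the scale parameter. Treating Rayleigh data as Gaussian therefore sets the threshold too low and produces more false alarms than intended.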
In the case of the shaft, for which we have physical features such as shaft order one (SO1) in inches per second (IPS), we typically set limits based on user experience that are tighter than the OEM limits. For example, the threshold for SO1 on the Bell 407 main rotor is 1 IPS. As a maintainer, you know this is a HUGE amount of vibration. One of our customers requested to set the threshold for warning at 0.25 IPS, and alarm at 0.38 IPS. Because we can always provide an adjustment after a flight, and the operator trusts our adjustment, this customer will typically make an adjustment at the end of the day when the vibration gets close to 0.25 IPS. They schedule no test flight. As a result, their fleet’s average SO1 is about 0.11 IPS, whereas the average for the Bell 407s we monitor as a whole is 0.4 IPS.
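As a small illustration of how such limits might be expressed in a configuration, the sketch below encodes the customer's warning and alarm levels and classifies a measured SO1 value. The `ShaftLimits` class and `classify` function are hypothetical names for illustration only and do not reflect the actual Foresight configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaftLimits:
    """Warning and alarm limits for a shaft order magnitude, in IPS."""
    warning_ips: float
    alarm_ips: float


def classify(so1_ips, limits):
    """Return the alert state for a measured SO1 magnitude."""
    if so1_ips >= limits.alarm_ips:
        return "alarm"
    if so1_ips >= limits.warning_ips:
        return "warning"
    return "nominal"


# Customer-requested main rotor limits from the example above.
main_rotor = ShaftLimits(warning_ips=0.25, alarm_ips=0.38)

for measured in (0.11, 0.27, 0.40):
    print(f"SO1 = {measured:.2f} IPS -> {classify(measured, main_rotor)}")
```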
We treat internal shafts (within the gearbox) differently from external shafts (gearbox input shaft, tail rotor drive shaft). For internal shafts, we look at SO1 and SO2 (the second shaft harmonic, which is sensitive to a bent shaft) and use the within- and between-aircraft variance to establish a threshold based on a 1 in a million probability of false alarm. Typically, this process results in SO1 thresholds on the order of 0.02 IPS (which is pretty smooth). For external shafts, we use SO1, SO2, and SO3 (the third harmonic, which is sensitive to coupling failures). We typically set an IEC limit for SO1 of 0.25 IPS, and set SO2 and SO3 statistically, again based on the within- and between-aircraft variance.
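One simple way to picture the within- and between-aircraft variance is a naive variance-components estimate: average the scatter seen on each aircraft, then measure how far the per-aircraft means sit from each other. The sketch below is an illustration under that assumption; the tail numbers and CI values are made up, and the estimator is deliberately basic rather than the method used in production.

```python
import statistics


def variance_components(ci_by_aircraft):
    """Naive split of CI scatter into within- and between-aircraft variance."""
    groups = list(ci_by_aircraft.values())
    # Within: average of the sample variances measured on each aircraft.
    within = statistics.mean([statistics.variance(g) for g in groups])
    # Between: sample variance of the per-aircraft mean values.
    between = statistics.variance([statistics.mean(g) for g in groups])
    return within, between


# Hypothetical SO1 readings (IPS) for an internal shaft on two aircraft.
readings = {
    "aircraft_A": [0.010, 0.012, 0.011],
    "aircraft_B": [0.020, 0.018, 0.019],
}

within, between = variance_components(readings)
total = within + between
print(f"within = {within:.2e}, between = {between:.2e}, total = {total:.2e}")
```

In this made-up data the aircraft-to-aircraft differences dominate, which is why a threshold built only from one aircraft's scatter would be too tight for the fleet.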
The process is mathematically robust and defines a hypothesis test. In contrast, most HUMS manufacturers set thresholds by asking the question “When is it bad?”, which is very hard to determine. We ask the subtly different question “When is it no longer good?” This construction of the question allows us to define a hypothesis test, and I have developed a theory to support it. Our system uses Condition Indicators (CIs) to establish evidence that the component is no longer good. The false alarm rate when we recommend a warning is small, about 1 in 100 million. You can be very well assured that if we say the component is no longer good, it isn’t.
The process for bearings is somewhat different because there is no OEM guidance. Because the vibration signals of a faulted bearing are small compared to shaft order and gear mesh, detecting a fault at the bearing rate frequencies using Fourier analysis is difficult. Detecting a fault at the baseband frequencies of the bearing rate is “stage 1” fault detection. Bearing faults detected using these types of analyses are late-stage; that is, the bearing can be close to catastrophic failure. At the very least, a bearing in this state is generating metal, which can cause damage to other components within the gearbox.
Ultrasonic emission can detect bearing inner and outer race roughness (a “stage 3” fault). Still, the remaining useful life of a bearing at this stage is relatively long compared to the overall life of the bearing. Bearing envelope analysis (BEA) can typically detect bearing faults 100s of hours before it is appropriate to do maintenance.
BEA is based on demodulation of the high-frequency resonance associated with bearing element impacts. For rolling element bearings, an impact is produced when a rolling element strikes a local fault on the inner or outer race, or when a fault on a rolling element strikes the inner or outer race. These impacts modulate a signal at the associated bearing pass frequencies, such as Cage Pass Frequency (CPF), Ball Pass Frequency Outer Race (BPFO), Ball Pass Frequency Inner Race (BPFI), and Ball Fault Frequency (BFF). Figure 1 shows an outer race fault where the BPFO is approximately 80 Hz. Note that the modulation period, T1, is approximately 0.0125 seconds (i.e., 1/80 Hz). The time T2, the period of the resonance, is approximately 1.12e-4 seconds, which corresponds to a resonance of about 9000 Hz. Note that the time domain representation is the superposition of many resonances of the bearing itself.
![]()
Figure 1 Example Outer Race Fault
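The bearing pass frequencies named above follow from the bearing geometry and shaft speed. As a sketch, the standard textbook relations for a bearing with a stationary outer race can be coded as below. The geometry and shaft rate are hypothetical values chosen so that the BPFO lands near the 80 Hz of the example, and BFF is taken here as twice the ball spin frequency, which is an assumption about how the term is defined.

```python
import math


def bearing_rates(shaft_hz, n_balls, ball_dia, pitch_dia, contact_deg=0.0):
    """Classical bearing fault frequencies (Hz) for a stationary outer race."""
    ratio = (ball_dia / pitch_dia) * math.cos(math.radians(contact_deg))
    return {
        "CPF": 0.5 * shaft_hz * (1.0 - ratio),             # cage pass frequency
        "BPFO": 0.5 * n_balls * shaft_hz * (1.0 - ratio),  # outer race
        "BPFI": 0.5 * n_balls * shaft_hz * (1.0 + ratio),  # inner race
        # Ball fault frequency: the defect strikes both races once per ball spin.
        "BFF": (pitch_dia / ball_dia) * shaft_hz * (1.0 - ratio ** 2),
    }


# Hypothetical geometry (mm) and shaft rate, for illustration only.
rates = bearing_rates(shaft_hz=22.3, n_balls=9, ball_dia=7.94, pitch_dia=39.04)
for name, hz in rates.items():
    print(f"{name}: {hz:.1f} Hz")
```

A useful sanity check on any such table is that BPFO and BPFI always sum to the number of rolling elements times the shaft rate.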
Mathematically, the modulation is described as:
$$\cos(a)\,\cos(b) = \tfrac{1}{2}\left[\cos(a + b) + \cos(a - b)\right] \qquad (1)$$
This is amplitude modulation of the bearing rate (a) with the high-frequency carrier signal (the resonant frequency (b)). This causes sidebands in the spectrum surrounding the resonant frequency. It is sometimes difficult to distinguish the exact frequency of the resonance. It is usually not known a priori and cannot be determined easily without a faulted component. However, demodulation techniques typically do not need to know the exact frequency. One method for the BEA involves multiplying the vibration signal by a sinusoid at the resonant frequency (for example, 9 kHz). The result is then low-pass filtered to remove the high-frequency image and decimated, and the power spectral density is estimated. (Eq 2)
![]()
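The multiply, low-pass, decimate, and spectrum steps can be sketched end to end on a synthetic signal. Everything below is an illustrative assumption: the sample rate, ring-down time constant, noise level, and the crude boxcar low-pass filter are chosen for brevity, and the final step evaluates single DFT bins of the envelope rather than a full power spectral density estimate.

```python
import cmath
import math
import random

FS = 40_000      # sample rate in Hz (assumed)
BPFO = 80.0      # fault rate in Hz, as in the outer race example
F_RES = 9_000.0  # resonance, also used as the demodulation frequency
DECIM = 20       # decimation factor, giving a 2 kHz envelope sample rate


def synthetic_fault_signal(n_samples, rng):
    """Decaying 9 kHz ring-down restarted at every impact, plus noise."""
    period = round(FS / BPFO)  # samples between impacts
    tau = 1e-3                 # assumed ring-down time constant in seconds
    out = []
    for n in range(n_samples):
        dt = (n % period) / FS  # time since the most recent impact
        ring = math.exp(-dt / tau) * math.sin(2 * math.pi * F_RES * dt)
        out.append(ring + rng.gauss(0.0, 0.05))
    return out


def envelope(signal):
    """Shift the resonance to baseband, boxcar low-pass, decimate, take magnitude."""
    mixed = [x * cmath.exp(-2j * math.pi * F_RES * n / FS)
             for n, x in enumerate(signal)]
    blocks = [mixed[i:i + DECIM]
              for i in range(0, len(mixed) - DECIM + 1, DECIM)]
    return [abs(sum(b) / DECIM) for b in blocks]


def spectral_amplitude(env, freq_hz):
    """Single-bin DFT magnitude of the mean-removed envelope at `freq_hz`."""
    fs_env = FS / DECIM
    mean = sum(env) / len(env)
    acc = sum((e - mean) * cmath.exp(-2j * math.pi * freq_hz * k / fs_env)
              for k, e in enumerate(env))
    return abs(acc) / len(env)


rng = random.Random(1)
env = envelope(synthetic_fault_signal(FS // 4, rng))  # 0.25 s of data
at_fault_rate = spectral_amplitude(env, BPFO)
off_rate = spectral_amplitude(env, 100.0)
print(f"envelope amplitude at 80 Hz: {at_fault_rate:.4f}, "
      f"at 100 Hz: {off_rate:.4f}")
```

The envelope spectrum shows a clear line at the 80 Hz fault rate even though the raw signal's energy sits near 9 kHz, which is the point of demodulating before looking for the bearing rates.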
The bearing components have many vibration modes, which will correspondingly generate resonance at various frequencies throughout the spectrum. The selection of the frequency range used to demodulate the bearing rate signal (e.g., the window center frequency) should take into account some issues: First, the gearbox spectrum contains several high-energy frequencies from shaft and gear harmonics, which would mask analysis at lower bearing frequencies. Second, there are several accelerometers with natural resonance at frequencies that are similar to the bearing modes. Using a higher frequency window close to the accelerometer resonance can amplify the bearing fault signal, increasing the probability of fault detection.
BEA should be performed at frequencies higher than the shaft and gear mesh frequencies. This ensures that the demodulated bearing frequencies are not masked by the other rotating sources, such as shaft and gear mesh, which are present at CPF, BPFO, BPFI, and BFF frequencies. Shaft order amplitudes of 0.1 G’s and gear mesh amplitudes of 10s of G’s are typical. Damaged bearing amplitudes, by comparison, are 0.003 G’s.
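To put those amplitudes in perspective, the ratios can be expressed in decibels. Taking 10 G as a stand-in for “10s of G’s” (an assumption for illustration), a damaged bearing signal sits roughly 30 dB below the shaft order and 70 dB below the gear mesh:

$$20\log_{10}\!\left(\frac{0.1}{0.003}\right) \approx 30.5\ \text{dB}, \qquad 20\log_{10}\!\left(\frac{10}{0.003}\right) \approx 70.5\ \text{dB}$$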
Note that because we perform the envelope at a frequency that is higher than those associated with the gearbox shafts and gears, the signal associated with healthy components is Gaussian. It can be proved that the magnitude spectrum of a Gaussian signal is Rayleigh distributed. The square root of the sum of four squared Rayleigh CIs (representing the cage, ball, inner and outer race spectral energy) can then be shown to be Nakagami distributed. Given that we can calculate the within- and between-aircraft variance, we can calculate the inverse cumulative distribution, and hence the threshold based on the probability of false alarm (again, set at 1 in a million, or “six 9s” reliability).
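As a sketch of that last step, the threshold can be found by numerically inverting the distribution of the fused indicator. The code below assumes four independent Rayleigh CIs sharing one scale parameter `sigma`, in which case the square root of the sum of their squares is Nakagami distributed with shape 4 and its survival function reduces to a finite sum. The function names are hypothetical, and the within- and between-aircraft variance handling described above is not modeled.

```python
import math


def hi_survival(x, sigma, n_ci=4):
    """P(HI > x), where HI = sqrt(sum of n_ci squared Rayleigh(sigma) CIs).

    The sum of squares is Gamma distributed with integer shape n_ci, so the
    tail probability is a finite sum (a Nakagami tail with shape n_ci).
    """
    y = x * x / (2.0 * sigma * sigma)
    return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(n_ci))


def hi_threshold(sigma, pfa=1e-6, n_ci=4):
    """Invert the survival function by bisection to get the PFA threshold."""
    lo, hi = 0.0, sigma
    while hi_survival(hi, sigma, n_ci) > pfa:  # grow until the root is bracketed
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if hi_survival(mid, sigma, n_ci) > pfa:
            lo = mid
        else:
            hi = mid
    return hi


sigma = 1.0  # normalized CI scale, for illustration
fused = hi_threshold(sigma)
single = hi_threshold(sigma, n_ci=1)
print(f"threshold for 4 fused CIs: {fused:.3f} sigma, "
      f"for a single Rayleigh CI: {single:.3f} sigma")
```

With `n_ci=1` the result collapses to the plain Rayleigh quantile, which is a convenient check that the inversion behaves as expected.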
Gears are a slightly different process, as the phenomenology is different. Gears are complex and have several different failure modes (at least six). All analysis for gears is based on the time-synchronous average (TSA). For a shaft, we use a tachometer as a key phasor to resample the data for that shaft, making it synchronous with the angular position of that shaft. This filters out vibration associated with different shafts, gears, and bearings within the gearbox. Then we operate on the TSA to extract features that are sensitive to gear faults.
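A minimal sketch of the TSA idea follows: each revolution, delimited by tachometer pulses, is resampled to the same number of points and the revolutions are averaged, so content that is not synchronous with the shaft averages toward zero. The linear interpolation, the synthetic signal, and all parameter values are simplifying assumptions for illustration.

```python
import math
import random


def time_synchronous_average(signal, tach_times, points_per_rev):
    """Resample every revolution to a fixed number of points and average them.

    `tach_times` holds the (possibly fractional) sample index of each
    once-per-revolution tachometer pulse.
    """
    tsa = [0.0] * points_per_rev
    n_revs = len(tach_times) - 1
    for start, end in zip(tach_times[:-1], tach_times[1:]):
        for p in range(points_per_rev):
            pos = start + (end - start) * p / points_per_rev
            i = int(pos)
            frac = pos - i
            nxt = min(i + 1, len(signal) - 1)
            # Linear interpolation between neighbouring samples.
            tsa[p] += (1.0 - frac) * signal[i] + frac * signal[nxt]
    return [v / n_revs for v in tsa]


# Synthetic example: a 20-tooth gear mesh tone locked to the shaft, plus a
# tone from another (non-synchronous) source and random noise.
rng = random.Random(3)
REV_LEN = 1003.7  # samples per revolution (assumed constant speed)
N_REVS = 50
POINTS = 256
tach = [k * REV_LEN for k in range(N_REVS + 1)]
n_samples = int(N_REVS * REV_LEN) + 2
signal = [math.sin(2 * math.pi * 20 * n / REV_LEN)
          + math.sin(2 * math.pi * n / 137.3)
          + rng.gauss(0.0, 0.1)
          for n in range(n_samples)]

tsa = time_synchronous_average(signal, tach, POINTS)
worst = max(abs(tsa[p] - math.sin(2 * math.pi * 20 * p / POINTS))
            for p in range(POINTS))
print(f"largest deviation of the TSA from the pure mesh tone: {worst:.3f}")
```

After averaging 50 revolutions, the TSA closely tracks the 20-per-revolution mesh tone, while the non-synchronous tone and the noise are largely suppressed.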
![]()
For example, if we remove vibration associated with the gear mesh and SO1, SO2, and SO3 of the shaft, what is left would be random noise. The kurtosis would be close to 3 (Gaussian). If there is a fault (see above, due to a chipped tooth), the signal is no longer Gaussian, the statistics change, and we can threshold on this. For gear analysis, we generate 18 condition indicators based on the residual, narrowband analysis, the energy operator, and amplitude modulation and frequency modulation analysis. No one condition indicator works for every gear fault. We again fuse a number of condition indicators into a health indicator.
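The kurtosis argument is easy to demonstrate numerically. In the sketch below, the "healthy" residual is plain Gaussian noise and the "faulted" residual adds a short impulsive burst standing in for one damaged tooth. The burst size and location are hypothetical values picked for illustration.

```python
import random


def kurtosis(values):
    """Fourth standardized moment; close to 3 for Gaussian data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2)


rng = random.Random(2)

# Healthy residual: only random noise remains once mesh and shaft orders
# have been removed from the TSA.
healthy = [rng.gauss(0.0, 1.0) for _ in range(4096)]

# Faulted residual: the same noise plus a short burst once per revolution.
faulted = list(healthy)
for i in range(2000, 2010):
    faulted[i] += 6.0

print(f"healthy kurtosis: {kurtosis(healthy):.2f}")
print(f"faulted kurtosis: {kurtosis(faulted):.2f}")
```

Even though the burst touches only 10 of 4096 points, it moves the kurtosis well away from 3, which is why residual kurtosis is a useful gear CI.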
The process is similar to bearing thresholding, but a transform is applied to the gear CIs to make them more “Rayleigh”-like, again using the sum of squared Rayleigh CIs and taking advantage of the inverse cumulative Nakagami distribution to find a threshold.
We have run many gearboxes to failure in testing; we may have the largest set of fault data in the world. What we find is that the threshold-setting process is conservative. While we recommend maintenance at a Health Indicator (HI) of 1 (physical damage is visible), the bearing will seize around an HI of 50 to 100. While it may take 150 to 300 hours to go from 0.5 to 1, it may take only 50 hours to go to HI 10. We have never run a gear to failure, typically stopping testing at 3 to 5.
***
As you can see, there is a lot that goes into developing reliable thresholds per component type. This methodology has been reviewed and validated by no fewer than five rotorcraft OEMs and passed peer review at technical societies such as the Prognostics and Health Management Society. I’ve got a large body of published work you can find here: https://www.researchgate.net/profile/Eric_Bechhoefer
Let me cover other questions that typically come up on this topic.
Do you verify your model by reviewing removed components? Yes. When we have finds we always request an OEM review of the removed part to confirm it is no longer in spec. To date, ALL of our finds have been confirmed as ‘no longer nominal’ in the test lab thereafter.
How long does it take to establish thresholds? Does your system learn and do the thresholds improve over time? Because we begin with a physics-informed model, thresholds are set soon after we begin to collect data. This is not to say that we don’t adjust thresholds as we learn more over time. As we gather more data we do make adjustments. But the underlying methodology is fairly constant. And because our methodology is physics based, we do not depend on large data sets, extensive training, and LLM AI models. These are impractical in this arena and have actually been proven less reliable than traditional physics-informed models. See https://www.gpms-vt.com/blog/big-data-vs-physics-based-fault-detection-gpmss-approach-to-machine-condition-monitoring/ for more information.
Can validation on one platform reliably carry over to another? Yes. As we like to say, “a bearing is a bearing” whether it’s on a Bell or a Russian MI. While the physical measurements differ on different platforms, our methodology takes these variations into account. So long as we have the physical data we need, whether obtained from the OEM or not, we can set reliable thresholds. We are understandably proud of our OEM endorsements, but sometimes the lack of an endorsement hinges on non-technical factors. For instance, an OEM once validated us on one platform but not another, even though the system and software building blocks are virtually the same. Protection for an internal HUMS product drove the endorsement decision.
Of course, the most gratifying indicator of success is the feedback we get from maintainers when they remove a part we’ve identified as out of spec. When they report that their own inspection validated the tool’s detection capabilities, we know we’re helping them avoid unplanned downtime and/or collateral damage to their gearbox. Ultimately, we hope the information Foresight provides helps to improve the safety and efficiency of operations.
