9.3 Pre-Study Power Calculation

Summary

The PBAT power-calculation capabilities are a software implementation of the analytical approaches to power calculation for family-based association tests (FBATs) developed by Dr. Christoph Lange ([Lange 2002a, Lange 2002b, Lange 2002c]). The power of FBATs and population-based association tests can be assessed for a wide variety of study designs and options:

  • Dichotomous/binary and continuous traits for family designs.
  • Dichotomous/binary and quantitative traits for population designs.
  • Computation of power for a given sample size.
  • Computation of required sample size for a given power and significance level.
  • Missing parental information.
  • Multiple offspring per family.
  • Combinations of different family-types.
  • Different genetic models.
  • Different ascertainment conditions for the first and second proband.
  • Marker and disease locus are not identical.
  • Combination of different family-types and different ascertainment conditions.
  • Verification of all power calculations by Monte-Carlo simulations.
Using Pre-Study Power Calculation

To perform power calculation analysis, select Tools > Pre-Study Power Calculation from the project navigator.

Power Calculation Types

Four types of power calculations are supported:

  • Family-based using a binary trait
  • Family-based using a continuous trait
  • Population-based using case/control status
  • Population-based using a quantitative trait

Parameters for these types of power calculation are organized within four tabs, three of which are used at any given time:

  • Methods
  • Family Design
  • Genetic Model
  • Computational

See the subsection below for the type of power calculation you wish to perform. See A Glossary of Terms Used in Genetic Analysis for definitions of terms. Once you have set the parameters for the type of power calculation you wish to perform, click Run to begin.

A progress dialog will track the progress of the calculations. Pressing Cancel will stop the calculation of power.

When the power calculations have finished, a text viewer will appear. This viewer will be associated with a new Navigator Node.

Methods Tab (all designs)

The Methods tab (see Figure 56) contains options for setting the type of calculation to be performed, the significance level, and the computation method used to calculate the power.


[Picture]

Figure 56: Methods Tab for Power Calculations for Binary Traits

Type of Computation

Select the calculation type that matches the study design under consideration. Other fields and/or tabs will be made accessible depending on the selection.

Statistical Parameters

The Significance Level is the probability of wrongly rejecting the null hypothesis when it is in fact true (the probability of a Type I error). A smaller Significance Level guards against false positives, but for a fixed sample size it also lowers the power. The default is 0.01.
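
The trade-off between significance level and power can be illustrated with a small sketch. It is not PBAT's calculation: it assumes a generic two-sided z-test, and the function name `z_test_power` and the parameter `effect_in_se` (the true effect expressed in standard errors) are hypothetical names chosen for this example.

```python
from statistics import NormalDist


def z_test_power(effect_in_se: float, alpha: float) -> float:
    """Power of a two-sided z-test when the true effect lies
    `effect_in_se` standard errors away from the null value."""
    z = NormalDist()
    crit = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    # Probability that the test statistic falls in either rejection region.
    upper = 1.0 - z.cdf(crit - effect_in_se)
    lower = z.cdf(-crit - effect_in_se)
    return upper + lower


# Same effect size, increasingly strict significance levels.
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha={alpha}: power={z_test_power(3.0, alpha):.3f}")
```

For a fixed effect size, the printed power drops as the significance level is made stricter, which is why the level is usually chosen together with the sample size rather than simply minimized.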

Computation Method – Family-Based Studies

Three computation methods apply to family-based studies, and may be used with either binary or continuous traits. PBAT will use the selected method to compute the power of the FBAT statistic. These methods are:

  • Numerical Integration: When the power of the FBAT statistic is computed based on numerical integration, the numerical precision is 0.01. This method can take several minutes depending on the complexity of the study design and the computer speed.
  • Approximation: The analytical power of the FBAT statistic will be computed based on a second-order Taylor expansion. The precision is good for sample sizes of at least 100 families, and it is the fastest approach. This method is described in [Knapp 1999, Lange 2002a].
  • Simulation: The power will be estimated based on one million Monte-Carlo simulations and can take up to several minutes.
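
The idea behind the Simulation method can be sketched in a few lines. This is a deliberately simplified illustration, not PBAT's algorithm: it assumes a TDT-style statistic computed from independent informative transmissions, and the function name `simulate_tdt_power` and the parameter `p_transmit` (the probability that a heterozygous parent transmits allele A to an affected offspring) are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_tdt_power(n_transmissions: int, p_transmit: float,
                       alpha: float, n_sims: int = 2000, seed: int = 1) -> float:
    """Estimate power as the fraction of simulated data sets in which a
    TDT-style chi-square statistic (1 df) exceeds its critical value."""
    rng = random.Random(seed)
    # A chi-square(1 df) critical value is the square of the two-sided z value.
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    rejections = 0
    for _ in range(n_sims):
        # b = transmissions of allele A, c = transmissions of the other allele.
        b = sum(rng.random() < p_transmit for _ in range(n_transmissions))
        c = n_transmissions - b
        stat = (b - c) ** 2 / (b + c)
        if stat > crit:
            rejections += 1
    return rejections / n_sims


print("null (p_transmit=0.50):", simulate_tdt_power(200, 0.50, alpha=0.01))
print("alt  (p_transmit=0.65):", simulate_tdt_power(200, 0.65, alpha=0.01))
```

Under the null (no transmission distortion) the estimated rejection rate should stay close to the significance level, while under the alternative it approximates the power. The precision of the estimate improves with the number of simulated data sets, which is why a large simulation count takes longer to run.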

Two computation methods apply to population-based studies, and may be used with either case/control or quantitative trait studies of unrelated individuals. These methods are:

  • Compute Power For Given Sample Size: Create a table of predicted powers based on sample size information.
  • Compute Required Sample Size For Given Power and Significance Level: Create a table of sample sizes required to achieve a given power under various tests.
Family Design Tab – Binary Traits

The Family Design tab for a binary trait contains options for specifying the family types to be included in the calculations. These options include:

  • Number of families
  • Number of offspring per family
  • Number of missing parents
  • Whether additional offspring are phenotyped
  • Ascertainment conditions for the probands (these may be set to Unaffected, Affected, or N/A).

Each of these options can be specified for a given family design, and multiple family designs can be included in a set of calculations (e.g. one set of calculations could be run with 100 families with 2 offspring and 1 missing parent, and 50 families with 3 offspring and 0 missing parents, etc.). Usually more families will increase the power of the study while missing parents will decrease the power of the study. See Figure 57.


[Picture]

Figure 57: Family Design Tab for Power Calculations for Binary Traits

To add a family type, enter the appropriate values for the options contained in the Change Family Design group, and click the Add Design button. The family design will appear in the list of Family Designs Currently Used, and will be included in the calculation of power.

Similarly, to remove a family type from the calculations, highlight the corresponding entry in the list of included family designs by clicking on it, then click the Remove Design button.

Family Design Tab – Continuous Traits

The Family Design tab for a continuous trait contains options for specifying the family types to be included in the calculations. These options include:

  • Number of families
  • Number of offspring per family
  • Number of missing parents
  • Whether additional offspring are phenotyped
  • Ascertainment conditions for the probands

Each of these options can be specified for a given family design, and multiple family designs can be included in a set of calculations (e.g. one set of calculations could be run with 100 families with 2 offspring and 1 missing parent, and 50 families with 3 offspring and 0 missing parents, etc.). Usually more families will increase the power of the study while missing parents will decrease the power of the study.

Any of ten ascertainment conditions may be specified. The numbers in the ascertainment conditions refer to sampling conditions for the phenotypes of the first and second probands. These are specified by the corresponding probabilities of the phenotypic distributions of the traits.

Ascertainment condition 1 is predefined and may not be changed; it is always equivalent to a total population sample.

Ascertainment conditions 2 through 10 may be specified. For example, suppose ascertainment condition 2 is set as follows (see Figure 58):

  • Proband 1 Lower: 0.0
  • Proband 1 Upper: 0.25
  • Proband 2 Lower: 0.85
  • Proband 2 Upper: 1.0

For this condition, the trait of the first proband must be in the lower 25% of the phenotypic distribution, while the trait of the second proband must be in the upper 15% of the phenotypic distribution.
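
To make the example concrete, the sketch below converts the Lower/Upper probabilities of an ascertainment condition into cut-offs on the trait scale. It assumes, purely for illustration, that the trait follows a standard normal distribution in the population; the function name `quantile_window_to_trait_bounds` is hypothetical and not part of the software.

```python
import math
from statistics import NormalDist


def quantile_window_to_trait_bounds(lower: float, upper: float) -> tuple[float, float]:
    """Translate a (lower, upper) probability window into trait-value bounds,
    assuming a standard normal phenotypic distribution."""
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("require 0 <= lower < upper <= 1")
    z = NormalDist()
    # inv_cdf is only defined on the open interval (0, 1), so the
    # end points are mapped to -infinity and +infinity by hand.
    lo = -math.inf if lower == 0.0 else z.inv_cdf(lower)
    hi = math.inf if upper == 1.0 else z.inv_cdf(upper)
    return lo, hi


# Ascertainment condition 2 from the example above.
print("Proband 1:", quantile_window_to_trait_bounds(0.0, 0.25))
print("Proband 2:", quantile_window_to_trait_bounds(0.85, 1.0))
```

Under this assumption the first proband must have a trait value below roughly -0.67 standard deviations, and the second proband a value above roughly 1.04 standard deviations.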


[Picture]

Figure 58: Family Design Tab for Power Calculations for Continuous Traits

To add a family type, enter the appropriate values for the options contained in the Change Family Design group, and click the Add Design button. The family design will appear in the list of Family Designs Currently Used, and will be included in the calculation of power. The currently highlighted ascertainment condition will be associated with the new family design.

Similarly, to remove a family type from the calculations, highlight the corresponding entry in the list of included family designs by clicking on it, then click the Remove Design button.

Genetic Model Tab – Family-design Binary Trait

Under the Genetic Model tab you can specify the genetic model underlying the power calculations. Note that the disease gene is specified by allele “A”.

The basis for defining the genetic model may be specified (within the Specify Basis box) as follows (see Figure 59):

  • MOI, p, K, AF: Mode of inheritance (MOI), allele frequency (p), disease prevalence (K), attributable fraction (AF)
  • Penetrance Values and Allele Frequency
  • MOI, p, K, Odds Ratio
  • MOI, p, K, Allelic Odds Ratio

In addition, two modes exist for power output.

  • Enter zero for the allele frequency increment to allow entering parameters related to the disease gene not being the same as the marker gene. One power value will be output.
  • Enter a non-zero allele frequency increment. The calculations will take place as if the disease gene is the same as the marker gene. Power values will be output for allele frequencies starting with the “allele frequency for the disease gene” and incrementing by the allele frequency increment value.

Selecting the basis and selecting an allele frequency increment of zero vs. non-zero selects the parameters that are used to define the genetic model.

NOTE:

  • The mode of inheritance is the manner in which a particular genetic trait or disorder is passed from one generation to the next. Examples of MOI are autosomal dominant, autosomal recessive, X-linked dominant, X-linked recessive, multifactorial and mitochondrial, etc.
  • Penetrance indicates the likelihood that a given gene will actually result in the disease.
  • Odds ratio is a way of comparing whether the probability of a certain event is the same for two groups.
  • Attributable fraction is the proportion of disease occurrence that could potentially be eliminated if the exposure were prevented.

[Picture]

Figure 59: Genetic Model Tab for Power Calculations for Binary Traits

The parameters for the respective bases are as follows:

  • MOI, p, K, AF: with this basis selected, the following parameters are available for specification:
    • Allele Frequency (Marker Gene): allele frequency of the marker gene. (Enter if entering zero for the allele frequency increment.)
    • Allele Frequency (Disease Gene): allele frequency of the disease gene.
    • Genetic Attributable Fraction: the proportion of the disease occurrence that would potentially be eliminated if the disease gene were not present.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations. Enter zero to allow an offset to be specified.
    • P(Disease allele A | Marker allele A) [0;1]: the conditional probability of observing the disease allele A given the presence of marker allele A. (Enter if entering zero for the allele frequency increment.)
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Offset: The offset to use for the FBAT statistic. (Enter if entering zero for the allele frequency increment. Otherwise, the offset will be automatically set to the population mean.)
    • Population Prevalence: the percentage of the population estimated to have the particular disease at a specific time.
    • Disease Locus = Marker Locus: indicates whether the disease locus is equal to the marker locus. (Enter if entering zero for the allele frequency increment.)
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Penetrance for AA (genotype)
    • Penetrance for AB (genotype)
    • Penetrance for BB (genotype)
    • Relative Risk RR1 (relative risk for carrying one disease allele)
    • Relative Risk RR2 (relative risk for carrying two disease alleles)
    • Odds Ratio OR1 (odds ratio for carrying one disease allele)
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
    • D’ between the marker gene and the disease gene (when zero is entered for the allele frequency increment)
    • Offset (= the population mean) (for a non-zero allele frequency increment)
  • Penetrance Values and Allele Frequency: with this basis selected, the following parameters are available for specification:
    • Penetrance for AA: penetrance fraction for the AA genotype
    • Penetrance for AB: penetrance fraction for the AB genotype
    • Penetrance for BB: penetrance fraction for the BB genotype
    • Allele Frequency (Marker Gene): allele frequency of the marker gene. (Enter if entering zero for the allele frequency increment.)
    • Allele Frequency (Disease Gene): allele frequency of the disease gene.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations. Enter zero to allow an offset to be specified.
    • P(Disease allele A | Marker allele A) [0;1]: the conditional probability of observing the disease allele A given the presence of marker allele A. (Enter if entering zero for the allele frequency increment.)
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Offset: The offset to use for the FBAT statistic. (Enter if entering zero for the allele frequency increment.)
    • Disease Locus = Marker Locus: indicates whether the disease locus is equal to the marker locus. (Enter if entering zero for the allele frequency increment.)
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Population prevalence of the disease
    • Genetic attributable fraction of the gene
    • Relative Risk RR1 (relative risk for carrying one disease allele)
    • Relative Risk RR2 (relative risk for carrying two disease alleles)
    • Odds Ratio OR1 (odds ratio for carrying one disease allele)
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
    • D’ between the marker gene and the disease gene (when zero is entered for the allele frequency increment)
    • Offset (= the population mean) (for a non-zero allele frequency increment)
  • MOI, p, K, Odds Ratio: with this basis selected, the following parameters are available for specification:
    • Allele Frequency (Marker Gene): allele frequency of the marker gene. (Enter if entering zero for the allele frequency increment.)
    • Allele Frequency (Disease Gene): allele frequency of the disease gene.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations. Enter zero to allow an offset to be specified.
    • P(Disease allele A | Marker allele A) [0;1]: the conditional probability of observing the disease allele A given the presence of marker allele A. (Enter if entering zero for the allele frequency increment.)
    • Odds Ratio: Odds ratio for carrying one disease allele. (OR1)
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Offset: The offset to use for the FBAT statistic. (Enter if entering zero for the allele frequency increment.)
    • Population Prevalence: the percentage of the population estimated to have the particular disease at a specific time.
    • Disease Locus = Marker Locus: indicates whether the disease locus is equal to the marker locus. (Enter if entering zero for the allele frequency increment.)
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Genetic attributable fraction of the gene
    • Penetrance for AA (genotype)
    • Penetrance for AB (genotype)
    • Penetrance for BB (genotype)
    • Relative Risk RR1 (relative risk for carrying one disease allele)
    • Relative Risk RR2 (relative risk for carrying two disease alleles)
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
    • Allelic Odds Ratio (odds ratio for an allele being a disease allele)
    • D’ between the marker gene and the disease gene (when zero is entered for the allele frequency increment)
    • Offset (= the population mean) (for a non-zero allele frequency increment)
  • MOI, p, K, Allelic Odds Ratio: with this basis selected, the following parameters are available for specification:
    • Allele Frequency (Marker Gene): allele frequency of the marker gene. (Enter if entering zero for the allele frequency increment.)
    • Allele Frequency (Disease Gene): allele frequency of the disease gene.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations. Enter zero to allow an offset to be specified.
    • P(Disease allele A | Marker allele A) [0;1]: the conditional probability of observing the disease allele A given the presence of marker allele A. (Enter if entering zero for the allele frequency increment.)
    • Allelic Odds Ratio: Odds ratio for an allele being a disease allele.
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Offset: The offset to use for the FBAT statistic. (Enter if entering zero for the allele frequency increment.)
    • Population Prevalence: the percentage of the population estimated to have the particular disease at a specific time.
    • Disease Locus = Marker Locus: indicates whether the disease locus is equal to the marker locus. (Enter if entering zero for the allele frequency increment.)
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Genetic attributable fraction of the gene
    • Penetrance for AA (genotype)
    • Penetrance for AB (genotype)
    • Penetrance for BB (genotype)
    • Relative Risk RR1 (relative risk for carrying one disease allele)
    • Relative Risk RR2 (relative risk for carrying two disease alleles)
    • Odds Ratio OR1 (odds ratio for carrying one disease allele)
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
    • D’ between the marker gene and the disease gene (when zero is entered for the allele frequency increment)
    • Offset (= the population mean) (for a non-zero allele frequency increment)
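
The relationship between the penetrance inputs and the calculated parameters listed above can be sketched with standard textbook definitions. The sketch assumes Hardy-Weinberg genotype proportions and may not match the software's internal formulas exactly; the function name `derived_from_penetrances` and its argument names are hypothetical.

```python
def derived_from_penetrances(p: float, f_aa: float, f_ab: float, f_bb: float) -> dict:
    """Derive prevalence, attributable fraction, relative risks and odds
    ratios from the disease-allele frequency `p` and the three penetrances,
    assuming Hardy-Weinberg proportions (an assumption of this sketch)."""
    q = 1.0 - p
    # Prevalence K: penetrances averaged over genotype frequencies.
    prevalence = p * p * f_aa + 2.0 * p * q * f_ab + q * q * f_bb

    def odds(x: float) -> float:
        return x / (1.0 - x)

    return {
        "prevalence": prevalence,
        # Share of cases that would vanish if everyone had the BB risk.
        "attributable_fraction": (prevalence - f_bb) / prevalence,
        "RR1": f_ab / f_bb,              # one disease allele vs. none
        "RR2": f_aa / f_bb,              # two disease alleles vs. none
        "OR1": odds(f_ab) / odds(f_bb),
        "OR2": odds(f_aa) / odds(f_bb),
    }


for name, value in derived_from_penetrances(0.2, 0.20, 0.10, 0.05).items():
    print(f"{name}: {value:.4f}")
```

With an allele frequency of 0.2 and penetrances 0.20, 0.10 and 0.05 for AA, AB and BB, this gives a prevalence of 0.072, relative risks of 2 and 4, and odds ratios of about 2.11 and 4.75.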
Genetic Model Tab – Family-design Continuous Trait

Under the Genetic Model Tab, you can specify the genetic model underlying the power calculations. Note, the disease gene is specified by allele “A”. See Figure 60.


[Picture]

Figure 60: Genetic Model Tab for Power Calculations for Continuous Traits

Two modes exist for power output.

  • Enter zero for the allele frequency increment to allow entering parameters related to the disease gene not being the same as the marker gene. One power value will be output.
  • Enter a non-zero allele frequency increment. The calculations will take place as if the disease gene is the same as the marker gene. Power values will be output for allele frequencies starting with the “allele frequency for the disease gene” and incrementing by the allele frequency increment value.

Selecting an allele frequency increment of zero vs. non-zero selects the parameters that are used to define the genetic model for family-based calculations with continuous traits.

The following parameters may be used to specify a model for family-based calculations with continuous traits:

  • Allele Frequency (Marker Gene): allele frequency of the marker gene. (Enter if entering zero for the allele frequency increment.)
  • Allele Frequency (Disease Gene): allele frequency of the disease gene.
  • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations. Enter zero to allow an offset to be specified.
  • P(Disease allele A | Marker allele A) [0;1]: the conditional probability of observing the disease allele A given the presence of marker allele A. (Enter if entering zero for the allele frequency increment.)
  • Heritability: A measure of the degree to which the variance in the distribution of a phenotype is due to genetic causes.
  • Model (mode of inheritance)
    • Additive
    • Dominant
    • Recessive
  • Offset: The offset to use for the FBAT statistic. (Enter if entering zero for the allele frequency increment.)
  • Disease Locus = Marker Locus: indicates whether the disease locus is equal to the marker locus. (Enter if entering zero for the allele frequency increment.)

The following parameters will be calculated according to the inputs to the above parameters:

  • Total population mean
  • D’ between the marker gene and the disease gene (when zero is entered for the allele frequency increment)
  • Offset (for a non-zero allele frequency increment)
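
The calculated D’ can be related to the inputs with the standard definition of Lewontin's D’. The sketch below is an illustration under that textbook definition and is not taken from the software; the function name `d_prime` and its argument names are hypothetical.

```python
def d_prime(p_marker: float, p_disease: float, p_disease_given_marker: float) -> float:
    """Signed Lewontin's D' between marker allele A and disease allele A.

    p_marker: frequency of marker allele A
    p_disease: frequency of disease allele A
    p_disease_given_marker: P(Disease allele A | Marker allele A)
    """
    # Frequency of the haplotype carrying both A alleles.
    hap = p_marker * p_disease_given_marker
    # Raw disequilibrium: observed minus expected under independence.
    d = hap - p_marker * p_disease
    if d >= 0:
        d_max = min(p_marker * (1 - p_disease), (1 - p_marker) * p_disease)
    else:
        d_max = min(p_marker * p_disease, (1 - p_marker) * (1 - p_disease))
    return 0.0 if d_max == 0 else d / d_max


print(d_prime(0.3, 0.2, 0.5))   # partial disequilibrium
print(d_prime(0.2, 0.2, 1.0))   # marker and disease alleles always co-occur
```

When the conditional probability equals the disease allele frequency, the two loci are independent and D’ is zero; when the marker allele always carries the disease allele and both have the same frequency, D’ reaches one.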
Genetic Model Tab – Population-design Case/Control Trait

Under the Genetic Model tab you can specify the genetic model underlying the power calculations. Note, the disease gene is specified by allele “A”.

The basis for defining the genetic model may be specified (within the Specify Basis box) as follows (see Figure 61):

  • MOI, p, K, Odds Ratio
  • MOI, p, K, Allelic Odds Ratio

Selecting the basis selects the parameters that are used to define the genetic model.


[Picture]

Figure 61: Genetic Model Tab for Power Calculations for Case/Control Traits

The parameters for the respective bases are as follows:

  • MOI, p, K, Odds Ratio: with this basis selected, the following parameters are available for specification:
    • Min allele frequency of the disease allele: allele frequencies for the disease allele are calculated starting from this point based on the Allele Frequency Increment.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations.
    • Odds Ratio OR1 (AB versus BB): Odds ratio for carrying one disease allele.
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Population Prevalence: the percentage of the population estimated to have the particular disease at a specific time.
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
    • Allelic Odds Ratio (odds ratio for an allele being a disease allele)
  • MOI, p, K, Allelic Odds Ratio: with this basis selected, the following parameters are available for specification:
    • Min allele frequency of the disease allele: allele frequencies for the disease allele are calculated starting from this point based on the Allele Frequency Increment.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations.
    • Allelic Odds Ratio: Odds ratio for an allele being a disease allele.
    • Model (mode of inheritance)
      • Additive
      • Multi (Multifactorial)
      • Dominant
      • Recessive
    • Population Prevalence: the percentage of the population estimated to have the particular disease at a specific time.
  • The following parameters will be calculated according to the inputs to the above parameters:
    • Odds Ratio OR1 (odds ratio for carrying one disease allele)
    • Odds Ratio OR2 (odds ratio for carrying two disease alleles)
Genetic Model Tab – Population-design Quantitative Trait

Under the Genetic Model tab you can specify the genetic model underlying the power calculations. Note, the disease gene is specified by allele “A”.

The basis for defining the genetic model (automatically specified within the Specify Basis box) is always (see Figure 62):

  • MOI, p, K, Heritability

Under this basis, certain parameters that are used to define the genetic model are selected.


[Picture]

Figure 62: Genetic Model Tab for Power Calculations for Quantitative Traits

The parameters for this basis are as follows:

  • MOI, p, K, Heritability: with this basis selected, the following parameters are available for specification:
    • Min allele frequency of the disease allele: allele frequencies for the disease allele are calculated starting from this point based on the Allele Frequency Increment.
    • Allele Frequency Increment: increment in allele frequency per iteration of the power calculations.
    • Heritability: A measure of the degree to which the variance in the distribution of a phenotype is due to genetic causes.
    • Model (mode of inheritance)
      • Additive
      • Dominant
      • Recessive
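
One way to see how heritability, allele frequency and mode of inheritance jointly fix the effect size is the sketch below. It assumes a trait standardized to unit variance, Hardy-Weinberg proportions, and a single locus that explains exactly the stated heritability; these are assumptions of the example, not documented behavior of the software, and the function name `effect_per_unit` is hypothetical.

```python
import math


def effect_per_unit(h2: float, p: float, model: str = "additive") -> float:
    """Effect size (in trait standard deviations) per unit of the genotype
    score, chosen so that the locus explains a fraction `h2` of the variance."""
    q = 1.0 - p
    if model == "additive":
        var_x = 2.0 * p * q               # score = number of A alleles
    elif model == "dominant":
        carrier = 1.0 - q * q             # score = 1 if at least one A
        var_x = carrier * (1.0 - carrier)
    elif model == "recessive":
        hom = p * p                       # score = 1 only for AA
        var_x = hom * (1.0 - hom)
    else:
        raise ValueError(f"unknown model: {model}")
    # h2 = effect^2 * Var(score) when the total trait variance is 1.
    return math.sqrt(h2 / var_x)


for model in ("additive", "dominant", "recessive"):
    print(model, round(effect_per_unit(0.01, 0.2, model), 4))
```

For the same heritability, a recessive model with a rare allele requires a much larger per-genotype effect than an additive model, because far fewer individuals carry the AA genotype.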
Computational Tab – Population-Based Case/Control Trait

The options on this tab depend on whether you are computing the power for a given sample size or the sample size required for a given power and significance level.

Compute Power For Given Sample Size


[Picture]

Figure 63: Computational Tab for Power Calculations for Case/Control Traits

The options available for the Computational Tab when computing the power based on the sample size information are (see Figure 63):

  • Power and Sample Size Computational Parameters:
    • Number of Simulations
  • Case/Control Computational Parameters
    • Number of Cases
    • Number of Controls
  • Specify GWQ and QC parameters: (GWQ = Genome Wide Quality, QC = Quality Control)
    • Number of genotyped markers
    • Average call rate for common homozygous genotype
    • Average call rate for heterozygous genotype
    • Average call rate for rare homozygous genotype

Compute Required Sample Size For Given Power and Significance Level


[Picture]

Figure 64: Computational Tab for Sample Size Calculations for Case/Control Traits

The options available for the Computational Tab when computing the sample size based on required power and significance level are (see Figure 64):

  • Power and Sample Size Computational Parameters:
    • Number of Simulations
    • Achieved power for sample size calculations
  • Case/Control Computational Parameters
    • Ratio: cases vs. controls
  • Specify GWQ and QC parameters: (GWQ = Genome Wide Quality, QC = Quality Control)
    • Number of genotyped markers
    • Average call rate for common homozygous genotype
    • Average call rate for heterozygous genotype
    • Average call rate for rare homozygous genotype
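
For a rough sense of how the required number of cases depends on the target power, the significance level and the case/control ratio, the sketch below uses a textbook normal approximation for comparing allele frequencies between two groups. It is not the simulation-based method used by the software: the function names, the parameter `controls_per_case`, and the use of the population allele frequency as the control allele frequency are all assumptions of this example, and call rates and the number of genotyped markers are ignored.

```python
import math
from statistics import NormalDist


def case_allele_freq(p_control: float, allelic_or: float) -> float:
    """Allele frequency in cases implied by the control frequency and allelic OR."""
    return allelic_or * p_control / (1.0 + p_control * (allelic_or - 1.0))


def required_cases(p_control: float, allelic_or: float, alpha: float,
                   power: float, controls_per_case: float = 1.0) -> int:
    """Approximate number of cases for a two-sided allelic test
    (two-proportion normal approximation with unequal allocation)."""
    z = NormalDist()
    p1 = case_allele_freq(p_control, allelic_or)
    p2 = p_control
    r = controls_per_case
    p_bar = (p1 + r * p2) / (1.0 + r)          # pooled allele frequency
    z_a = z.inv_cdf(1.0 - alpha / 2.0)
    z_b = z.inv_cdf(power)
    numerator = (z_a * math.sqrt((1.0 + 1.0 / r) * p_bar * (1.0 - p_bar))
                 + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2) / r)) ** 2
    case_alleles = numerator / (p1 - p2) ** 2
    # Each individual contributes two alleles.
    return math.ceil(case_alleles / 2.0)


print(required_cases(0.2, 1.5, alpha=0.01, power=0.8))
print(required_cases(0.2, 1.5, alpha=0.01, power=0.8, controls_per_case=2.0))
```

Recruiting more controls per case reduces the number of cases needed, and a larger odds ratio reduces it further; the simulation-based results reported by the software may differ from this approximation.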
Computational Tab – Population-Based Quantitative Trait

The options on this tab depend on whether you are computing the power for a given sample size or the sample size required for a given power and significance level.

Compute Power For Given Sample Size


[Picture]

Figure 65: Computational Tab for Power Calculations for Quantitative Traits

The options available for the Computational Tab when computing the power based on the sample size information are (see Figure 65):

  • Power and Sample Size Computational Parameters:
    • Number of Simulations
  • Quantitative Computational Parameters
    • Number of probands
  • Specify GWQ and QC parameters: (GWQ = Genome Wide Quality, QC = Quality Control)
    • Average call rate for common homozygous genotype
    • Average call rate for heterozygous genotype
    • Average call rate for rare homozygous genotype

Compute Required Sample Size For Given Power and Significance Level


[Picture]

Figure 66: Computational Tab for Sample Size Calculations for Quantitative Traits

The options available for the Computational Tab when computing the sample size based on required power and significance level are (see Figure 66):

  • Power and Sample Size Computational Parameters:
    • Number of Simulations
    • Achieved power for sample size calculations
  • Specify GWQ and QC parameters: (GWQ = Genome Wide Quality, QC = Quality Control)
    • Average call rate for common homozygous genotype
    • Average call rate for heterozygous genotype
    • Average call rate for rare homozygous genotype
PBAT Pre-Study Power Calculation Results

The results for the Pre-Study Power Calculations are displayed in a text viewer (see Figure 67). The output displayed depends on the type of calculation performed.


[Picture]

Figure 67: Text Viewer for PBAT Pre-Study Power Calculation Results