MILES TO READ B4 I SLEEP: array cgh

Chromosome microarray analysis: A soothing guide

First published: 24 March 2018

By now, few cytogenetics laboratories in Australia routinely provide an old‐fashioned chromosomal ‘karyotype’ analysis; the microscopic examination of chromosomes from cultured leukocytes is time‐consuming and requires a high level of expertise of the reader. Automated DNA‐based microarray analysis is now the standard first‐line investigation of genetic material in humans, allowing rapid, precise quantification of chromosomes. If chromosomes are the visible library of our genome, with a karyotype, we could at best detect the loss or addition of an entire bookcase. With DNA‐based arrays, we can often tell if a single book is missing, and sometimes a chapter.

Paediatricians depend on chromosome analysis in their investigation of intellectual problems, growth failure or organ malformations and have learned to incorporate microarrays into this over the last decade. While clinical genetics services are available to help, many paediatricians and obstetricians already feel comfortable interpreting these, and this will increase with use. However, some use them less often and are less comfortable, and we are all in the same boat when faced with results of debatable significance or the (daily) identification of completely unique anomalies.

This short discussion not only provides some guidelines for paediatricians to use in their interpretation of the smaller microarray abnormalities they come across but also indicates some pitfalls to avoid. It is not about technicalities but concepts, and it is meant to reassure users that we are only beginning to unravel the mysteries of the human genome, and it is still perfectly acceptable (and often safer) to say ‘I don't know’.

Karyotypes Versus Arrays

It took until the 1960s to be sure of how many chromosomes humans had in each cell and to number each pair from 1, the largest, to 22, the smallest (OK it isn't, but they apparently thought it was back then), as well as the X and Y chromosomes. By the year 2000, labs could culture, stain and examine all 46 so accurately that over 500 separate bands could be identified under a microscope. This is a ‘karyotype’, a snapshot of the genetic library shelves caught on camera. We had also reached a stage where we knew (mostly) which bands you could live without, double up on or twist around without seeming to come to any harm. We had decades of reports of the effects on individuals with just one or two bands missing or duplicated to refer to if we needed. Pretty sophisticated.

Now, we mainly use microarrays instead. Basically (and here I ask molecular geneticists to breathe deeply and slowly), microarray technology uses DNA extracted from all the chromosomes, pops it on a slide covered with fluorescent probes and, instead of the human eye, uses a machine that detects fluorescence to check if all the bits are there. Instead of stripes or bands on the chromosomes, it checks if manufactured probes have detected each chromosome target. First, thousands of big clunky probes called bacterial artificial chromosomes (BACs) were used, followed by tens of thousands of smaller probes called oligonucleotides and now hundreds of thousands of single‐nucleotide polymorphisms (SNPs). This means we are able to detect bits missing or added on to the chromosomes thousands of times smaller than the 500 stripes seen under the microscope, and each new type of array, BAC, Oligo and now SNP, generates more information than the last.

Hunter Genetics Unit in Newcastle was one of the first clinical services to offer array chromosome analysis in Australia in 2007 thanks to Kerry Fagan, and we had a rapid induction into the vast holes in our knowledge then and since. Far from being seasoned experts in the field, we now routinely warn patients that the test may leave us with more questions than answers.

What Can Arrays Detect?

The machine can only tell if a probe finds its target area on each chromosome pair. If all the targets attach twice, the result is normal. If a bit is missing, the machine will record it. If there is a bit duplicated, the machine will record it. These are both examples of ‘copy number variants’ (CNVs). The size of the CNV the particular array can detect depends on the probes used. Older arrays using BACs may have missed some tiny deletions or duplications, so it may be worth repeating the test now. The report often states the resolution of the array used, for example, ‘to a resolution of 100 kilobases (kb)’. To give an idea of how detailed this is, the average gene measures about 15 kb, so such an array could still miss a handful of genes.
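For readers who like to see the logic written down, here is a minimal sketch of the idea above. It assumes each probe has already been turned into a whole-number copy estimate (real arrays work from noisy fluorescence ratios, which this ignores), and the function names, probe spacing and `min_probes` cut-off are made up for illustration, not taken from any laboratory pipeline.

```python
from itertools import groupby

AVERAGE_GENE_KB = 15  # average gene size quoted in the text


def call_cnvs(probes, min_probes=3):
    """Find runs of neighbouring probes whose copy number is not 2.

    probes: list of (position_kb, copy_number), sorted by position
            along a single chromosome (hypothetical input format).
    min_probes: hypothetical cut-off; shorter runs are ignored as noise.
    Returns a list of (start_kb, end_kb, kind).
    """
    calls = []
    # groupby collects consecutive probes that share the same copy number
    for copies, run in groupby(probes, key=lambda p: p[1]):
        run = list(run)
        if copies == 2 or len(run) < min_probes:
            continue
        kind = "deletion" if copies < 2 else "duplication"
        calls.append((run[0][0], run[-1][0], kind))
    return calls


def genes_that_fit_below_resolution(resolution_kb, gene_kb=AVERAGE_GENE_KB):
    """Rough count of average-sized genes that fit inside one resolution unit."""
    return resolution_kb / gene_kb


# Illustrative data only: one probe every 50 kb
copy_numbers = [2, 2, 1, 1, 1, 2, 2, 3, 3, 3, 3, 2]
probes = [(i * 50, c) for i, c in enumerate(copy_numbers)]
cnvs = call_cnvs(probes)
missed = genes_that_fit_below_resolution(100)
```

With these made-up numbers, the sketch reports one deletion and one duplication, and the arithmetic shows why an array with 100 kb resolution could overlook a stretch holding six or seven average-sized genes.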

What Can Arrays Not Detect?

The machine doesn't actually see the chromosomes and cannot tell if there is a rearrangement of the bookshelves.

No net loss of chromosomal material occurs in the ‘balanced translocations’ found in about 1 in 600 of the population, so the array result would be normal. However, the individual carrying the balanced translocation may be at considerable risk of having a child with a serious chromosomal anomaly.

The machine can't tell where an extra bit or duplication is inserted.

We often assume that a duplication is in ‘tandem’ with the chromosome it came from, that is, doubled up side by side. This is not always the case. The duplicated segment may have broken free and become embedded somewhere distant. It may even be inserted into the middle of an important gene, destroying its function. There may be an extra ‘supernumerary’ chromosome containing the duplicated segment.

The array only looks at bits of chromosomes, not the spelling of individual genes.

This is analogous to being able to detect a whole shelf or even a couple of books missing in the genetic library, but not checking that all the pages are present in the right order, never mind the spelling on every line. To detect a point mutation in a specific gene, we need to sequence that gene and check every letter of the code against a reference. Arrays are sophisticated but not designed for that. A normal array does not exclude a genetic cause for the problem.

Have I Found an Answer? Common Pitfalls Assigning Causality to Array Results

The smaller the anomaly, the less likely it is to be pathogenic.

Sorry, not necessarily. We have seen children grow and thrive with large chunks of chromosome 13 missing, while a minuscule deletion on chromosome 15 caused profound intellectual disability. Some chromosomes have more densely packed genes than others. It depends how many and exactly which genes are in the area affected, and very likely other effects on chromosome replication and function we still cannot completely predict.

Microduplications are much less likely to be pathogenic than microdeletions (after all, we have two of each gene already, so what harm in three?).

Generally true but major exceptions exist:
- Again, pathogenicity depends on the particular genes duplicated. Some tolerate this, some don't. One of the tiniest duplications we can detect with array includes one gene, MECP2, with devastating consequences.
- The edge of a duplication may cut through a gene and disrupt its function, which can only be detected if the position of the genes in the area is examined carefully.
- As described before, the duplication may have strayed from its home and become inserted into a gene elsewhere, causing loss of function of that gene. Originally, all laboratories checked for this by following up every anomaly found with microscopy using detailed fluorescent probes to locate the extra segment, but due to time and funding constraints, few now do this.

Test the parents. If one of them has the same result, it is not pathogenic.

Nope. Parental tests are just part of the picture. If the anomaly is new in the patient (de novo variant), it is more likely to be pathogenic but not always, and there are plenty of ways an inherited CNV may still be pathogenic:
- A microdeletion may be inherited from one parent and a mutation in a gene within the same area from the other parent, uncovering a recessively inherited disease.
- Our examination of parents is notoriously slack – there is too much else to do in the consultation, and they may not volunteer the information that they are illiterate or had speech therapy until high school. Many successful adults are significantly autistic.

Look up the literature. If it is called a ‘syndrome’, then it is pathogenic.

Sorry again. We live in a competitive world in genetics just like the rest of you. Just because someone has written up a couple of cases and called it a syndrome doesn't mean you can assume the microdeletion or duplication is the cause of the problems you were testing for. Dozens of recurrent microdeletions and duplications are being identified, which may have marginal or contributory effects on a very common phenotype such as mild intellectual disability, autism or epilepsy, and the evidence simply isn't in yet to make a call on most of them. The case series are riddled with ascertainment bias and lack epidemiological proof of causation, and even the attempts at large database series are hampered by lack of comprehensive knowledge of what is ‘normal’. As in all areas of medicine, we are looking at smaller and smaller ‘effect sizes’ these days in genetics. This requires rigorous comparison with vast numbers of normal controls. Don't believe everything you read. It may be useful, but it may also be prudent to keep looking for another cause.

Geneticists use international databases to tell them if an array anomaly is pathogenic.

Sort of true. Where we used to look up books of reports on chromosomal anomalies, we now use international databases like DECIPHER and other genome browsers to search for similar cases and check which genes are located in the area of the microdeletion or duplication. The end result is usually a list of genes or things that look like genes in the area of the deletion, but even in 2017, the function of most of our 22 000 genes is still unknown:
- Even when we have information about a gene, we have to use judgement and a bit of guesswork to determine pathogenicity of a variation in its copy number. A disease caused by a point mutation or spelling mistake in the gene is not necessarily caused by complete loss of one copy of the same gene. A spelling mistake in the DNA code might translate to a protein product with a nasty kink in it; this can be much more disruptive than just producing half the amount of protein with all the curves nature intended.
- Sometimes there are no genes in the area, but the structure of non‐coding DNA outside the genes is also very important and may have effects on genes far from the area studied.
- Similar or comparable cases are rarely found as more and more unique variations are identified. We are constantly looking over the edges of our knowledge and scanning the void beyond.
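The ‘which genes are in the area’ step can be sketched roughly as follows. This is only an illustration: the gene names and coordinates are invented, the function is hypothetical, and real genome browsers work from curated annotation rather than a hand-typed dictionary. It also flags the case mentioned earlier, where the edge of a CNV lands inside a gene.

```python
def classify_genes(cnv_start, cnv_end, genes):
    """Sort genes by how they sit relative to a CNV on one chromosome.

    cnv_start, cnv_end: CNV boundaries in kb (hypothetical coordinates).
    genes: dict mapping gene name -> (start_kb, end_kb).
    Returns a dict mapping each overlapping gene to 'fully inside' or
    'partial overlap'; genes outside the CNV are left out.
    """
    result = {}
    for name, (g_start, g_end) in genes.items():
        if g_end < cnv_start or g_start > cnv_end:
            continue  # gene lies entirely outside the CNV
        if cnv_start <= g_start and g_end <= cnv_end:
            result[name] = "fully inside"
        else:
            # at least one CNV breakpoint falls within the gene
            result[name] = "partial overlap"
    return result


# Invented example: a CNV from 1000 kb to 2000 kb and four made-up genes
example_genes = {
    "GENE_A": (900, 1050),   # straddles the left edge
    "GENE_B": (1200, 1300),  # sits wholly within the CNV
    "GENE_C": (2500, 2600),  # well outside
    "GENE_D": (1950, 2100),  # straddles the right edge
}
overlaps = classify_genes(1000, 2000, example_genes)
```

A list like this is only the starting point. Whether losing or gaining any of those genes matters is the judgement call described above, and the sketch says nothing about non‐coding DNA.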

What the ****** is ‘Loss of Heterozygosity’, ‘LOH’ or ‘Long Stretches of Apparent Homozygosity’ on Those SNP Array Results?

If you have not come across this, do not worry, you soon will. SNP arrays are the latest technology to be used for chromosome analysis (still not gene sequencing, just checking all the books are on the shelves) and add a new dimension to the detail we detect. SNPs are common single‐letter variations in the DNA code, so as well as telling us if their target is present in the right dose, we also find out which letter is present in that spot, A or T, C or G. One SNP tells us nothing much, but a few thousand SNPs form a pretty characteristic signature on each chromosome. As we obtain one of each pair of chromosomes from a different, usually unrelated parent, the pattern of these SNP markers on each chromosome should reflect that difference (=heterozygosity). If the SNP array detects a pattern of markers with more similarity (=homozygosity) than expected between pairs of chromosomes, it will be reported using the terms above; ‘loss of heterozygosity’ says much the same as ‘increased homozygosity’:

- This might mean that the parents are closely related, which may be known or unsuspected. The latter is more likely where widespread disruption of family structures has occurred as in the Australian Aboriginal population, where generations have lost knowledge of their family of origin. Treat this information with care and respect.
- It may even show enough similarity to suggest incest. Laboratories follow guidelines to ensure they don't jump to this conclusion too readily, and they will usually say more than just ‘area of apparent LOH’ on the report if they think it likely.
- Finding an area like this is a cue to look for a gene involved in a recessively inherited disorder, which is much more likely to be found in the area of SNP similarity, so it can greatly help diagnosis.
- A single chromosome pair may also be identical as, for one cytogenetic reason or another, two copies of that chromosome have accidentally been inherited from the same parent, known as uniparental disomy (UPD). This can be a big clue to diagnosis as well; uniparental disomy 7 of maternal origin is well known to cause Russell–Silver syndrome.
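To make ‘long stretches of apparent homozygosity’ a little more concrete, here is a toy sketch of how such a stretch could be spotted in a list of SNP results. It assumes each SNP arrives as a two-letter genotype such as 'AG'. The function name and the `min_snps` cut-off are invented for illustration, and real laboratories use far longer stretches and their own reporting thresholds. The sketch also ignores copy number, so a deleted region, which shows only one letter at every SNP, would look the same here.

```python
def homozygous_runs(snps, min_snps=5):
    """Find stretches of consecutive SNPs that are all homozygous.

    snps: list of (position_kb, genotype), sorted by position along one
          chromosome; genotype is a two-letter string such as 'AG'.
    min_snps: hypothetical minimum run length worth reporting.
    Returns a list of (start_kb, end_kb, number_of_snps).
    """
    runs = []
    current = []  # positions in the run being built
    for position, genotype in snps:
        if genotype[0] == genotype[1]:
            current.append(position)  # same letter twice: homozygous
        else:
            # a heterozygous SNP ends the current run
            if len(current) >= min_snps:
                runs.append((current[0], current[-1], len(current)))
            current = []
    if len(current) >= min_snps:  # run reaching the end of the data
        runs.append((current[0], current[-1], len(current)))
    return runs


# Invented genotypes, one SNP every 10 kb
genotypes = ["AG", "AA", "CT", "GG", "TT", "AA", "CC", "GG", "TT", "AG", "CC"]
snps = [(i * 10, g) for i, g in enumerate(genotypes)]
loh_regions = homozygous_runs(snps)
```

In this made-up example, the only stretch long enough to report runs from 30 kb to 80 kb. Deciding whether a real finding reflects related parents, UPD or nothing much at all is exactly the interpretation the points above are about.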

As usual, some good, some less good. If you stumble into this and are not sure how to interpret it or what to say, ask for help. Do not let your jaw drop in front of the parents because the only word you can remember from this whole article is ‘incest’, and please call your local genetics unit before involving community services.

Does this help? I hope so. Interpreting uncommon microarray results in general medicine can be a bit like resuscitation technique for clinical geneticists: not something we feel entirely comfortable with. Relax and call for help. Like any specialty area, those of us who work in the field become more comfortable with how little we know because at least we understand why.


Friday, 30 March 2018
