aCGH-Smooth is a tool for automatic breakpoint identification and smoothing of array-CGH data.
aCGH-Smooth employs a "smoothing" algorithm that tries to adjust the observed array-CGH values so that they represent the copy number of the most common tumor cells. When the observed relative copy numbers (or their logarithms) for the clones are ordered by location on the genome, the values form "clouds" with different means, presumably reflecting different copy number levels; a change of level represents a breakpoint. The problem is formalized as model fitting: the search is for the model that most likely fits the data. A model specifies a number of breakpoints, the position of each breakpoint, and the parameters of the copy number distribution for each resulting segment. The algorithm estimates the true parameters of the model from the observed array-CGH values.
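To make the idea of a model concrete, here is a minimal Python sketch of what smoothing looks like once the breakpoints are known. It assumes that each segment is summarized by its mean, which is an illustration and not necessarily how aCGH-Smooth computes its smoothed values. The function name `smooth` and the convention that a breakpoint is the index at which a new segment starts are hypothetical.

```python
from statistics import mean


def smooth(values, breakpoints):
    """Replace every value by the mean of the segment it belongs to.

    values      -- ratios (or log ratios) ordered by genome location
    breakpoints -- indices at which a new segment starts (assumed convention)
    """
    bounds = [0] + sorted(breakpoints) + [len(values)]
    smoothed = []
    for start, end in zip(bounds, bounds[1:]):
        segment = values[start:end]
        if not segment:
            continue  # skip empty segments caused by duplicate breakpoints
        level = mean(segment)  # one level per "cloud" (assumption)
        smoothed.extend([level] * len(segment))
    return smoothed


# Made-up data: two clouds with a breakpoint at index 3.
ratios = [0.0, 0.2, -0.2, 1.0, 1.2, 0.8]
print(smooth(ratios, [3]))
```

Finding the breakpoints themselves is the hard part; the sketch only shows how a fitted model turns noisy values into levels.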
aCGH-Smooth assumes that the data are generated by a Gaussian process and uses the maximum likelihood criterion to measure the goodness of a partition, adjusted with a penalty term that accounts for model dimension. For a given N, a local search procedure searches for the most probable partition of the data using N breakpoints. This procedure is incorporated into a genetic algorithm that evolves a population of partitions whose numbers of breakpoints may differ and may vary during execution.
aCGH-Smooth is written in Visual C++. It has a user-friendly interface that visualizes the results, highlights the obtained smoothing, and lets the user influence the smoothing and the number of breakpoints by setting suitable parameter values.
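The scoring idea can be sketched in Python as a Gaussian log-likelihood per segment minus a penalty that grows with the number of breakpoints. This is a sketch under stated assumptions, not the criterion implemented in aCGH-Smooth: the exact penalty term is described in [1], and the names `penalty` and `min_var`, their default values, and the linear form of the penalty are all hypothetical.

```python
import math


def segment_log_likelihood(segment, min_var=0.01):
    """Gaussian log-likelihood of one segment at its fitted mean and variance.

    min_var is a hypothetical variance floor; without it, very short
    segments would get an arbitrarily high likelihood.
    """
    n = len(segment)
    mu = sum(segment) / n
    ss = sum((x - mu) ** 2 for x in segment)
    var = max(ss / n, min_var)
    return -0.5 * n * math.log(2 * math.pi * var) - ss / (2 * var)


def penalized_score(values, breakpoints, penalty=2.0):
    """Log-likelihood of a partition minus a penalty per breakpoint.

    breakpoints are distinct indices in 1..len(values)-1 at which a new
    segment starts (assumed convention). Higher scores are better.
    """
    bounds = [0] + sorted(breakpoints) + [len(values)]
    loglik = sum(
        segment_log_likelihood(values[a:b]) for a, b in zip(bounds, bounds[1:])
    )
    return loglik - penalty * len(breakpoints)


# Made-up data with one obvious change of level at index 3.
values = [0.0, 0.1, -0.1, 1.0, 1.1, 0.9]
for candidate in ([], [3], [1, 3]):
    print(candidate, round(penalized_score(values, candidate), 3))
```

A search procedure, whether local search or a genetic algorithm, would compare candidate partitions by such a score; the penalty is what stops it from adding a breakpoint for every small fluctuation.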
This help assumes that you are familiar with [1].
aCGH-Smooth requires Microsoft Windows and Excel 97 or newer installed on your system.
aCGH-Smooth uses Excel files as input. It can read files produced by the "UCSF Spot" software [2]. The first row should contain the column headers. At a minimum, the following headers should be present:
aCGH-Smooth will look for these column headers in the first row of the first sheet of the Excel file. It will look in the first row until it finds a column with an empty header.
aCGH-Smooth will ignore every row in which any of these columns has no value. It will stop reading when it reaches a row in which none of these columns has a value.
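The reading rules above can be sketched in Python as follows. This is an illustration, not the program's actual code: the sheet is represented as a plain list of rows, the function name `read_rows` is hypothetical, and the header names `HeaderA` and `HeaderB` are placeholders standing in for the required headers. Whether header matching is case-sensitive is not stated here, so the sketch simply compares exact strings.

```python
def read_rows(sheet, required):
    """Apply the header and row rules to a sheet given as a list of rows."""

    def empty(cell):
        return cell is None or str(cell).strip() == ""

    # Scan the first row until the first empty header.
    columns = {}
    for index, header in enumerate(sheet[0]):
        if empty(header):
            break
        columns[str(header).strip()] = index

    missing = [name for name in required if name not in columns]
    if missing:
        raise ValueError(f"missing column headers: {missing}")

    records = []
    for row in sheet[1:]:
        cells = [
            row[columns[name]] if columns[name] < len(row) else None
            for name in required
        ]
        if all(empty(cell) for cell in cells):
            break  # no required column has a value: stop reading
        if any(empty(cell) for cell in cells):
            continue  # incomplete row: ignore it
        records.append(dict(zip(required, cells)))
    return records


# Placeholder headers and made-up values.
sheet = [
    ["HeaderA", "HeaderB", "", "NotSeen"],
    [1, 0.5],
    [2, None],     # ignored: HeaderB has no value
    [3, 0.7],
    [None, None],  # reading stops here
    [4, 0.9],
]
print(read_rows(sheet, ["HeaderA", "HeaderB"]))
```

One consequence of the header rule is that a column placed after an empty header cell is never found, as with `NotSeen` in the example.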
There are a few things to note:
Opening files can be done in several ways. You can use “open” and “open all” in the file menu or use the icons in the toolbar.
“Open” allows you to select an Excel file. “Open all” allows you to select a directory in which there are some Excel files.
Saving files can be done in several ways. You can use “save” and “save all” in the file menu or use the icons in the toolbar.
A file can only be saved after smoothing. The dialog window that appears lets you select two options. “Save new ratio” adds an extra column “NewRat” to the Excel file, containing the newly computed smooth ratio values. “Save type” adds an extra column “Type” to the Excel file, containing the type of each clone: “loss” (0), “normal” (1), “gain” (2) or “amplification” (3).
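If you process the saved “Type” column in your own scripts, the four codes can be mapped back to their names. The Python sketch below assumes that you have already read the column into a list by whatever means you prefer; the names `CloneType` and `count_types` are hypothetical and not part of aCGH-Smooth.

```python
from enum import IntEnum


class CloneType(IntEnum):
    """Codes written to the "Type" column."""
    LOSS = 0
    NORMAL = 1
    GAIN = 2
    AMPLIFICATION = 3


def count_types(type_column):
    """Count how many clones fall into each type."""
    counts = {clone_type: 0 for clone_type in CloneType}
    for code in type_column:
        # Spreadsheet readers often return numbers as floats (1.0), hence int().
        counts[CloneType(int(code))] += 1
    return counts


# Made-up column values.
for clone_type, count in count_types([1, 1, 0, 2.0, 3, 1]).items():
    print(clone_type.name.lower(), count)
```

A code outside 0 to 3 raises a `ValueError`, which makes unexpected cell values visible instead of silently miscounting them.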
Smoothing files can be done in several ways. You can use “smooth” and “smooth all” in the edit menu or use the icons in the toolbar. “Smooth” smooths the active or selected document and “smooth all” smooths all opened documents.
Changing parameters for the smoothing algorithm can be done in several ways. You can use “parameters” in the edit menu or use the icon in the toolbar.
The following parameters are available:
Sometimes, if opening a file goes wrong, Excel does not work the next time you use it. The cause is that aCGH-Smooth opened a hidden instance of Excel and did not close it correctly after the error. A temporary workaround (at least for Windows NT, 2000 and XP) is to open the Windows Task Manager (right-click on the taskbar), go to “processes”, locate Excel, right-click it and select “end process”.
[1] Jong K, Marchiori E, Vaart A van der, Ylstra B, Weiss M, Meijer G. (2003) Chromosomal Breakpoint Detection in Human Cancer. Proceedings EvoBio 2003.
[2] Jain AN, Tokuyasu TA, Snijders AM, Segraves R, Albertson DG, Pinkel D. (2002) Fully automatic quantification of microarray image data. Genome Res. 12:325-32.