Double Star Data Reduction

An example of the output data plots. The first two show a slice of the original data before processing. The last four plots show the data after being processed.

I have posted the first version of my Python code on GitHub, which can process large groups of FITS data cubes for reduction. The main features are breaking FITS files into smaller pieces, writing the header data into a CSV file as an easy “log” of your observation run, and aligning, stacking, and processing double star images down to a single, pristine image. This was only my first attempt at the processing, and I was able to get it working fairly easily using scientific Python modules such as NumPy, SciPy, Matplotlib, and PyFITS.
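As an illustration of the header-logging feature, here is a minimal sketch of how the CSV “log” could be written with just the standard library. The `log_headers` helper and the header keywords shown are my own illustrative choices, not the posted code; with PyFITS, each header would come from `pyfits.getheader(filename)`.

```python
import csv

def log_headers(headers, csv_path, keys=('OBJECT', 'DATE-OBS', 'EXPTIME')):
    """Write one CSV row per FITS file as a quick observation log.

    `headers` is a list of (filename, header) pairs, where each header
    behaves like a dict. With PyFITS you would obtain each header via
    pyfits.getheader(filename). The keyword names here are examples.
    """
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        # Header row: filename plus whichever keywords we want logged.
        writer.writerow(('FILENAME',) + tuple(keys))
        for filename, header in headers:
            # Missing keywords are written as empty cells.
            writer.writerow([filename] + [header.get(k, '') for k in keys])
```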

I chose to work with FITS data cubes since that seems to be the most widely accepted format for astronomical data, and it is also an open standard that anyone can manipulate. The basic workflow for alignment goes something like this:

  1. Select the first slice of data from the FITS cube as the “reference” slice to which all other slices will be aligned.
  2. Normalize it based on a range set by the user.
  3. Select the data from that slice which is over a certain threshold set by the user. This allows a variable amount of background to be filtered out, depending on the sensitivity and noise in the data set.
  4. Using this “good” data, run “scipy.ndimage.measurements.center_of_mass” to find the center of mass of the reference slice. These coordinates become the reference center that every other slice will be shifted to.
  5. Now that the center coordinates have been determined, start processing the rest of the slices. First, run “scipy.ndimage.filters.gaussian_laplace” over the slice, which brings out the bright parts and suppresses the dimmer ones.
  6. Find the center of mass of this individual slice.
  7. Normalize the data based on the range set by the user.
  8. Shift the data for this slice by the difference between the reference slice center of mass and this individual slice’s center of mass.
  9. Add the processed data for this slice into a NumPy ndarray of processed data.
  10. Repeat steps 5-9 for the rest of the slices in the data cube.
  11. Once all slices are processed, use NumPy to take the mean of the processed data stack along the slice axis. This is the final stacked image.
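The steps above can be sketched roughly as follows. This is a minimal reconstruction using SciPy’s `ndimage` routines, not the code posted on GitHub; the function and parameter names (`align_and_stack`, `lo`, `hi`, `threshold`, `sigma`) are illustrative.

```python
import numpy as np
from scipy import ndimage

def normalize(data, lo, hi):
    """Scale data into [0, 1] based on a user-supplied range."""
    return np.clip((data - lo) / (hi - lo), 0.0, 1.0)

def align_and_stack(cube, lo, hi, threshold, sigma=2.0):
    """Align every slice of a data cube to the first slice and average.

    cube: 3-D ndarray (slices, y, x); lo/hi: normalization range;
    threshold: background cutoff; sigma: Gaussian-Laplace filter width.
    """
    # Steps 1-4: normalize the reference slice, threshold out the
    # background, and find its center of mass.
    ref = normalize(cube[0], lo, hi)
    ref_center = ndimage.center_of_mass(np.where(ref > threshold, ref, 0.0))

    stacked = []
    for raw in cube:
        # Step 5: the Laplacian-of-Gaussian filter emphasizes bright peaks.
        filtered = ndimage.gaussian_laplace(raw.astype(float), sigma=sigma)
        # Step 6: center of mass of this slice (on the filtered data).
        center = ndimage.center_of_mass(np.abs(filtered))
        # Step 7: normalize this slice to the user range.
        data = normalize(raw, lo, hi)
        # Step 8: shift so this slice's center lands on the reference center.
        offset = (ref_center[0] - center[0], ref_center[1] - center[1])
        data = ndimage.shift(data, offset)
        # Step 9: accumulate into the processed stack.
        stacked.append(data)

    # Step 11: the mean of the aligned stack is the final image.
    return np.mean(np.array(stacked), axis=0)
```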

An example of a double star slice before processing.

Once the final stacked image comes out the end, there are many things that can be done. For now, I’ve implemented a simple hack to guess the positions of the primary and secondary stars: find the coordinates of the “brightest spot” (the maximum value in the stacked image), which in theory is the primary star; then make a temporary second copy of the stacked image, zero out the values in a 10 x 10 pixel box around that spot, and repeat the search to find the next brightest spot, which should be the secondary star. While this method is admittedly imprecise, it has had no problem finding both stars in most of the hundreds of data sets I have tested with. Of course, this assumes that the proper threshold and sigma values for the Laplacian filter have been used.
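That brightest-spot hack can be sketched in a few lines of NumPy. The `find_star_pair` name and the exact masking arithmetic are my own illustrative choices, not the posted code.

```python
import numpy as np

def find_star_pair(image, box=10):
    """Locate the primary and secondary stars in a stacked image.

    Takes the brightest pixel as the primary, zeroes a box x box
    region around it in a temporary copy, then takes the new
    brightest pixel as the secondary.
    """
    # Primary star: coordinates of the global maximum.
    primary = np.unravel_index(np.argmax(image), image.shape)

    # Zero out a box around the primary in a copy, clipped at the edges.
    masked = image.copy()
    half = box // 2
    y0 = max(primary[0] - half, 0)
    x0 = max(primary[1] - half, 0)
    masked[y0:primary[0] + half, x0:primary[1] + half] = 0.0

    # Secondary star: the brightest remaining pixel.
    secondary = np.unravel_index(np.argmax(masked), masked.shape)
    return primary, secondary
```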

This is the processed image, with a clear double star and ready for easy measurement.

There are plenty of improvements to be made; this was just a proof of concept, and it had great results. I want to work more on integrating the calibration values so that final results can be obtained automatically. There might be a better way to find the positions of the primary and secondary stars using K-means, but I’m not sure about that one yet; it requires more research. There are also other ways to reduce the data that may prove better than the Laplace filter, but that was the path of least resistance and provided excellent results for a first trial.
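As a starting point for that research, here is one way the K-means idea could be explored: treat every pixel above a threshold as a point and cluster the coordinates into two groups with a tiny, deterministic Lloyd’s-algorithm loop. Everything here (the name, the initialization, the parameters) is exploratory and not part of the posted code.

```python
import numpy as np

def two_star_kmeans(image, threshold, iters=20):
    """Cluster bright-pixel coordinates into two groups (Lloyd's algorithm).

    The two resulting cluster centers should land near the two stars,
    assuming both are above the threshold and well separated.
    """
    ys, xs = np.nonzero(image > threshold)
    pts = np.column_stack([ys, xs]).astype(float)

    # Deterministic initialization: one center at the first bright pixel,
    # the other at the bright pixel farthest from it.
    c0 = pts[0]
    c1 = pts[np.argmax(np.sum((pts - c0) ** 2, axis=1))]
    centers = np.array([c0, c1])

    for _ in range(iters):
        # Assign each point to its nearest center, then recompute the means.
        dists = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        centers = np.array([pts[labels == k].mean(axis=0) for k in range(2)])
    return centers
```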