Period - a SuperMongo interface to the MIDAS/TSA context

This page describes the use of a small suite of SM and MIDAS programs that provide an easy interface to Schwarzenberg-Czerny's time series analysis context within MIDAS. To make use of it, you need to have SuperMongo and MIDAS installed. There was a major change in SM 2.4.8(?), which introduced integer type vectors. My experience is that many internal macros (and my own) have been broken by this change, and I strongly recommend sticking to a version prior to it. I use SM 2.4.1 with no problems. Pablo reported that he had some difficulty with the call to MIDAS from within SM; contact him if this happens to you. If you have any suggestions or comments, drop me an e-mail.

Release notes: can be found at the top of period.sm

Download
You will need to download the following files, and make sure that your data is in the right format (see prepare and prep_rv below).
period.sm
This is the file with all the SM macros, and should go into the directory where you keep SM files. In my case, I have them in ~/sm/. Use this to prepare, analyse, and plot your data.
period.prg
This is the MIDAS program doing the actual period analysis. It is invoked from within SM, so you probably never need to look into this file. It should go where your MIDAS procedures live, in my case this is ~/midwork.
period.fmt
A MIDAS format file used by period.prg. This should go where your MIDAS procedures live, in my case this is ~/midwork.
detrend.prg
This is a MIDAS program using the SINEFIT/TSA command to detrend the data using a sine fit. The detrended data is written to the disk for further analysis. Goes into ~/midwork
sinefit.prg
sinefit.fit
This is a MIDAS program which does a sine fit to time series data. Requires an initial guess of the frequency as parameter. The data file has to be called 'tsa' and has (at least) two columns: time, observed value. Used primarily by the macro 'predict_rv'. Goes into ~/midwork. 'sinefit.fit' is the fit definition, and should go in the directory where you are carrying out your RV analysis. [For those wanting to know the gory details: MIDAS stores the fit definition as well as the fit results in the descriptors of an otherwise empty file - hence the fit definition can be modified in a MIDAS program using the write/desc command, and the results can be read out from FITPARAM and FITERROR.]
ut2jd, jd2ut
Command line utilities that convert UT to HJD, and HJD to UT. Based on code from John Thorstensen's skycalc. Both go into your ~/bin directory.
lc_rebin
Command line utility to rebin light curves (or anything else) - compiled C-code, drop it into your ~/bin directory.


Preliminaries - program setup and data preparation
setup_period
Initialises a number of parameters used by period:
Subtract nightly mean - what it says: subtract the mean of each night (=file) before combining all the data. Usually the best choice, as it eliminates night-to-night variations that would result in low-frequency power.
Subtract total mean - in some cases it may be advantageous to first combine all available data and then subtract the mean, e.g. if a system has a very long period and only parts of the orbit are covered each night.
Subtract HJD[0] - when combining the data, subtract the HJD of the first data point, i.e. the combined data file will start at time=0. This should be done because the TSA commands behave badly if they encounter large HJD values. Set to "n" if you want to prepare a data file for ephemeris folding.
Detrend - a simple way to remove long-period (low-frequency) trends from the data. It calculates a box-car smoothed light curve, and subtracts this from the original data. An additional input parameter is the number of points in the box car. Useful for e.g. removing the orbital trend in an intermediate polar to prepare the data for a spin-folded light curve. If your data has a sinusoidal modulation that you would like to get rid of, it is better to use the command "detrend" below.
Smooth - smooth the data with a box car to remove short-period (high-frequency) signals or flickering. An additional parameter is the width of the box car.
Fake frequency - prepare (see below) always produces a faked data set along with combining the real data. Give here the frequency of that fake data set in 1/d.
Plot error bars - plot error bars, yes or no.
Photometry - is the data you analyse photometry or radial velocities (changes the labels and limits in the plots).
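To make the "Detrend" and "Smooth" options more concrete, here is a minimal Python sketch of box-car smoothing and box-car detrending. It is not the code in period.sm: the function names and the way the window is truncated at the ends of the light curve are assumptions made for illustration.

```python
def boxcar_smooth(values, width):
    """Running mean over `width` points (assumption: window is truncated at the edges)."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def boxcar_detrend(values, width):
    """Subtract the box-car smoothed curve, leaving only the faster variations."""
    smooth = boxcar_smooth(values, width)
    return [v - s for v, s in zip(values, smooth)]
```

A wide box car followed by subtraction removes slow trends, while applying only the smoothing with a narrow box car suppresses flickering.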
clean_period
Removes most of the intermediate files created during the analysis process
prepare
Combines several data sets (=files) into a single file, potentially subtracting the nightly (or total) mean and the time of the first data point (see setup_period). The combined data are written to a file called tsa. It also prepares a faked data set tsa_fake by evaluating a sine function at the times of the observed data. The amplitude of the sine wave is adjusted to reflect the amplitude of the observed variation; this does not work very well for multi-period signals. The phase of the fake data is arbitrary.

prepare [lower limit] [upper limit]

Data format: prepare needs the time series data in my own (simple) ASCII format: Time, heliocentric correction, value, error, see an example, called "something.dat". The reason why I keep the heliocentric correction as a separate column is (a) that quite often observers provide me with data in JD or MJD, and I have to compute the correction myself, and I would like to keep the original data preserved, and (b) it is easy to screw up the H.C. calculation, e.g. by using wrong coordinates for the star. Once this has happened, it can be hard to get back to the original (JD) times. If you use a different format, you need to adjust prepare. prepare also expects to find a file called "obs.lst", which lists the available data files (nights) to be analysed (without the extension ".dat").

If there are a few far outliers in the photometry, they can be clipped with the two arguments [lower limit] and [upper limit].
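The combination step can be sketched in a few lines of Python. This is only an illustration of the idea, not a port of the SM macro: the function name is hypothetical, each night is passed in as a list of (time, heliocentric correction, value, error) rows, and it is assumed that the heliocentric correction is simply added to the time column.

```python
def combine_nights(nights, subtract_nightly_mean=True, subtract_t0=True):
    """Merge nightly data sets into one time-sorted list of (time, value, error)."""
    combined = []
    for rows in nights:
        # Mean of this night only, so night-to-night offsets are removed.
        mean = sum(r[2] for r in rows) / len(rows) if subtract_nightly_mean else 0.0
        for time, hcorr, value, error in rows:
            # Assumption: HJD = time + heliocentric correction.
            combined.append((time + hcorr, value - mean, error))
    combined.sort()
    if subtract_t0 and combined:
        t0 = combined[0][0]  # time of the first data point
        combined = [(t - t0, v, e) for t, v, e in combined]
    return combined
```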
prep_rv
Prepares radial velocity measurements for the analysis with period and pfold by sorting the data in time, subtracting the mean radial velocity, and subtracting the time of the first data point.

prep_rv [datafile] [rv_limit] [rv_err_limit]

By default, prep_rv reads from the file rv.dat; if a different file should be used, it can be given as first parameter [datafile]. Prep_rv will normally use all RV measurements, but you can filter out values exceeding a given limit by setting [rv_limit], and in addition kick out values exceeding a certain error in the RV measurement by setting [rv_err_limit]. If you choose to filter the RV measurements, both limits have to be given, e.g.

prep_rv rv.dat 300 0

or

prep_rv rv.dat 200 50

Data format: It assumes that the measurements are in a file "rv.dat", which has the columns time, dummy, RV, RV error (dummy can be anything) - this is the standard output that Molly produces.
add_sine
Sometimes, you would like to add a signal with a known frequency and amplitude to your data, to test alias patterns, or the detection threshold. Use

add_sine [frequency] [amplitude] [input file] [output file]

where [frequency] and [amplitude] are compulsory arguments. [input file] is defaulted to "tsa" and [output file] to "tsa_sine".

Examples:

add_sine 300 0.05

will read from the file "tsa", add a sine wave (of arbitrary phase) with a frequency of 300 1/d and an amplitude of 0.05, and write the result to "tsa_sine". The amplitude obviously has to be given in the same unit as the data, e.g. if "tsa" in the example above contains differential magnitudes, then the example adds a sine wave with an amplitude of 50 mmag.
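What add_sine does to the data can be written down in a few lines of Python. The sketch below is an illustration only; the function signature is hypothetical, and the optional phase argument (in cycles) is an addition so that the result can be made reproducible.

```python
import math
import random


def add_sine(times, values, frequency, amplitude, phase=None):
    """Add amplitude * sin(2 pi (frequency * t + phase)) to each value.

    `frequency` is in cycles per unit of `times` (1/d if times are in days).
    If no phase (in cycles) is given, a random one is drawn.
    """
    if phase is None:
        phase = random.random()
    return [
        v + amplitude * math.sin(2.0 * math.pi * (frequency * t + phase))
        for t, v in zip(times, values)
    ]
```

Running the period analysis on the modified data then shows where the injected signal and its aliases appear, and whether a signal of that amplitude would have been detected.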

Basic period analysis commands
period
Runs the period analysis by calling MIDAS. The syntax is

period [start] [end] [nfreq] [method] [datafile] [order] [cover]

where [start] and [end] define the range of frequencies to be sampled, [nfreq] the number of test frequencies, [method] is one of "power", "scargle", "aov", or "ort" (see the links to the MIDAS help pages for details), [datafile] is the input, [order] is the number of bins used by AOV, or the number of harmonics used by ORT, and [cover] is the number of overlapping bin covers to be employed by AOV (again, see the links above for details).

Only the first three parameters are mandatory. The default for [datafile] is "tsa", the default for [method] is "scargle". If you set [end] to zero, MIDAS will choose the values of [start], [end], and [nfreq] automatically according to the Nyquist sampling theory. In the case of AOV or ORT periodograms, the parameters [order] and [cover] are set to the MIDAS defaults.

Examples:

period 0 20 10000 - computes a Scargle periodogram of the data file tsa in the range 0-20 1/d divided into 10000 frequencies. The result is written to scargle.dat

period 0 20 10000 aov tsa_fake 5 2 - computes an AOV periodogram of the fake data tsa_fake (also produced by prepare) over the frequency range 0-20 1/d, divided into 10000 frequencies, using 5 bins and a cover factor of 2 for the AOV method.

Useful hints: "power" and "scargle" work best on quasi-sinusoidal signals, try out the data on SDSS2116+1134. A scargle periodogram (period 0 50 20000) has strong signals at 18.44 1/d (78.09min) and 17.46 1/d (82.49min), as well as at 34.97 1/d (41.18min) and 35.95 1/d (40.05min). As the light curve is double-humped, which is quite often seen in short-period dwarf novae, the high-frequency signal should be commensurate with the low-frequency one, and we conclude that the orbital period is probably 82.49min. SDSS0854+3905 is a polar whose light curve is totally dominated by cyclotron beaming, obviously highly non-sinusoidal. "Scargle" and "power" lose out here, but "aov" and "ort" (e.g. period 0 20 10000 ort) find the correct period of 113.26min.
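For readers who want to see what a Scargle periodogram computes, here is a minimal pure-Python sketch of the classical Lomb-Scargle formula. It is not the MIDAS implementation: the function name and the normalisation by the sample variance are assumptions, and the sketch requires strictly positive test frequencies (it does not handle a start frequency of 0).

```python
import math


def scargle(times, values, frequencies):
    """Classical Lomb-Scargle power at each (positive) test frequency."""
    n = len(values)
    mean = sum(values) / n
    resid = [v - mean for v in values]
    variance = sum(r * r for r in resid) / (n - 1)
    power = []
    for f in frequencies:
        w = 2.0 * math.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal.
        tau = math.atan2(
            sum(math.sin(2.0 * w * t) for t in times),
            sum(math.cos(2.0 * w * t) for t in times),
        ) / (2.0 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(r * x for r, x in zip(resid, c))
        ys = sum(r * x for r, x in zip(resid, s))
        cc = sum(x * x for x in c)
        ss = sum(x * x for x in s)
        power.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return power
```

For a pure sine wave sampled at irregular times, the highest power falls at the test frequency closest to the true one; signals far from sinusoidal spread their power over harmonics, which is why "aov" and "ort" do better in those cases.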
periodall
Runs all four types of period analysis, "power", "scargle", "aov", and "ort", on the data.

periodall [start] [end] [nfreq] [datafile]

Parameters are as in period. Only [start], [end], [nfreq] are mandatory, [datafile] is defaulted to "tsa".
power
Plots a periodogram

power [method]

Default for [method] is "scargle".
powerall
Plots periodograms for all four methods "power", "scargle", "aov", and "ort", plus their average. Usually used after running periodall.
pfold
Pick a peak in the periodogram and fold the data over the selected frequency.

pfold [method] [datafile]

The default for [method] is "scargle", the default for [datafile] is "tsa". When you run pfold, it plots the periodogram and allows you to select with the cursor a frequency range that you want to zoom into. Mark the left and right end of the desired frequency range. Pfold then re-plots only that range, and activates the cursor again. Select with the cursor the frequency range just enclosing the peak that you want to use for folding the data. Pfold will then re-plot the selected range in red, find the frequency corresponding to the maximum power, and mark it with a little asterisk. It prints out the frequency and power at the maximum of the peak, plus the amplitude (=power **2, only valid for "power"). It then asks "Continue...". Enter "y", and pfold will fold the data over the selected frequency.
fold
Folds the data over a given frequency.

fold [frequency] [datafile]

[frequency] is mandatory, [datafile] is defaulted to "tsa".
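Folding itself is a one-liner: each time is converted into a phase between 0 and 1 by multiplying with the frequency and keeping the fractional part. A minimal Python sketch (the function name and the sorted output are choices made here for illustration, not taken from period.sm):

```python
def fold(times, values, frequency):
    """Return (phase, value) pairs sorted by phase, with phase in [0, 1)."""
    return sorted(((t * frequency) % 1.0, v) for t, v in zip(times, values))
```

With times starting at 0 (see "Subtract HJD[0]" in setup_period), phase 0 corresponds to the first data point rather than to any physical zero point such as mid-eclipse.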
detrend
Fits and subtracts a sine wave from the data, using the MIDAS/TSA task SINEFIT/TSA. The resulting file can be analysed again with period, pfold, etc.

detrend [frequency] [input datafile] [output datafile] [niter] [nharmonic]

[frequency] is the initial guess for the frequency of the sine, and should be very close to the actual value. It can be estimated by using period and pfold. [input datafile] is defaulted to "tsa", [output datafile] is also defaulted to "tsa", i.e. detrend by default overwrites the file "tsa". [niter] is the number of iterations of the sine fit, defaulted to 30. If you want to fit a sine with a fixed frequency, use [niter=1]. [nharmonic] is the degree of the Fourier series to be fit, i.e. the number of harmonics to be included. [nharmonic=0] means fit a constant, [nharmonic=1] is a single sine (the fundamental), [nharmonic=2] is the fundamental + first harmonic etc.
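The fixed-frequency case with a single sine can be illustrated with an ordinary linear least-squares fit of a constant plus a sine and a cosine term. The Python sketch below is not what SINEFIT/TSA does internally (which also iterates on the frequency); the function names are hypothetical, and subtracting the constant together with the sine is an assumption.

```python
import math


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def sine_detrend(times, values, frequency):
    """Fit const + a*sin + b*cos at a fixed frequency; return (coefficients, residuals)."""
    basis = [
        [1.0, math.sin(2 * math.pi * frequency * t), math.cos(2 * math.pi * frequency * t)]
        for t in times
    ]
    # Normal equations of the linear least-squares problem.
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    rhs = [sum(row[i] * v for row, v in zip(basis, values)) for i in range(3)]
    coeffs = solve(normal, rhs)
    model = [sum(c * b for c, b in zip(coeffs, row)) for row in basis]
    return coeffs, [v - m for v, m in zip(values, model)]
```

The amplitude of the fitted sine is sqrt(a**2 + b**2); the residuals are what would be passed on to the next round of period analysis.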

Eclipse analysis
eclipse
Determines the time of eclipse center by mirroring the eclipse, and interactively shifting the mirrored/original until the best overlap is found

eclipse [datafile]

where [datafile] is one of the light curves in the usual format (see "prepare" above), given without the extension, e.g.,

eclipse hs0455+8315_20001110_aip_r.dat

Click with the cursor on the eclipse to be analysed, and you will be taken to a zoomed version of the eclipse with the mirrored light curve in red. Return shifts the mirrored light curve, "+" and "-" change the direction of the shift, "q" exits. Write the eclipse times determined in that way to a file called "eclipses_object.dat" with a guess of the error in the measurements (typically 2-3 of the shifts).
eclipse2
Determines the time of eclipse center by fitting a third-order polynomial over a range of the eclipse profile that is interactively provided.

eclipse2 [datafile]

where [datafile] is one of the light curves in the usual format (see "prepare" above), given without the extension, e.g.,

eclipse2 hs0455+8315_20001110_aip_r.dat

Click with the cursor on the eclipse to be analysed, and you will be taken to a zoomed version of the eclipse. Specify with two cursor clicks the left and right limit on the eclipse profile that should be fitted by the polynomial. The fit will be overplotted, and the data points that were used for the fit will be shown in red. Write the eclipse times determined in that way to a file called "eclipses_object.dat" with a guess of the error in the measurements.
fit_ecl_eph
Carries out a linear eclipse ephemeris fit.

fit_ecl_eph [object] [pstart] [pend]

The data file has to be called "eclipses_object.dat", and contains two columns: times of mid-eclipse and errors for those values. "[pstart]" and "[pend]" bracket the range over which the initial period search is done. The macro works in two stages. In the first stage, orbital phases are computed for the observed times of mid-eclipse for a set of 10000 (default) test periods in the range between "[pstart]" and "[pend]", and the most likely period (and cycle count) is determined by the minimum difference. Then a linear fit is done to the observed times of mid-eclipse versus the cycle count number. This fit provides the coefficients for the eclipse ephemeris of the kind

HJD(Phi=0)=ZERO + E*PERIOD

as well as errors on the zero point and on the period. The macro then also calculates O-C values using this ephemeris and writes them to a file "o-c_object.dat".
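The two stages can be sketched in Python as follows. This is an illustration under stated assumptions, not the macro itself: the function names are hypothetical, the "minimum difference" is taken to be the summed distance of the trial cycle numbers from the nearest integer, and the linear fit is unweighted (the measurement errors are ignored).

```python
def best_period(times, pstart, pend, ntest=10000):
    """Stage 1: trial period whose cycle numbers lie closest to integers."""
    t0 = times[0]
    best_score, best_p = None, None
    for i in range(ntest):
        p = pstart + (pend - pstart) * i / (ntest - 1)
        cycles = [(t - t0) / p for t in times]
        score = sum(abs(c - round(c)) for c in cycles)
        if best_score is None or score < best_score:
            best_score, best_p = score, p
    return best_p


def linear_ephemeris(times, period_guess):
    """Stage 2: least-squares fit of T = ZERO + E*PERIOD; also returns E and O-C."""
    t0 = times[0]
    cycles = [round((t - t0) / period_guess) for t in times]
    n = len(times)
    mean_e = sum(cycles) / n
    mean_t = sum(times) / n
    period = sum((e - mean_e) * (t - mean_t) for e, t in zip(cycles, times)) / sum(
        (e - mean_e) ** 2 for e in cycles
    )
    zero = mean_t - period * mean_e
    o_minus_c = [t - (zero + e * period) for e, t in zip(cycles, times)]
    return zero, period, cycles, o_minus_c
```

If [pstart] and [pend] are too far apart, sub-multiples of the true period fit the cycle counts equally well, so the search range should bracket the suspected period fairly tightly.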
oc
Plots O-C values calculated by fit_ecl_eph

oc [object]

cycle
Simple utility to calculate the UT times of eclipses based on a linear eclipse ephemeris. For planning observations.

Simulations
simu_rv
Very powerful simulator for RV curves.

simu_rv

simu_rv allows the simulation of RV measurements taken over one or several nights with arbitrary sampling. This is extremely useful to test a priori how that forthcoming observing run would go if you observe object X, which you believe to be a short-period system, for 2h in the first night, and a stingy 1h in the second night. The macro adds timing jitter and RV errors. You need to specify the following parameters for the simulation within the macro.

omega: the orbital frequency
amp: the RV amplitude, not very important as the TSA tools don't care about the absolute value
gam: the gamma velocity of the RV curve
texp: the exposure time in minutes
tjitt: the maximum timing jitter by which the mid-exposure times may drift. A random fraction [-1,1] of that is then applied
rvstat: the maximum statistical error on the RV values, a random fraction [-1,2] is then applied to the RV values
rvsys: a systematic RV error that is used to make sure that the RV error never gets too low

simu_rv writes to a file tsa_fake, which you can analyse with period, e.g. "period 0 20 10000 scargle tsa_fake", followed by "pfold scargle tsa_fake"
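A stripped-down version of such a simulation might look like the Python sketch below. It only mirrors the parameter names listed above and is not the SM macro: the night boundaries are passed in as (start, end) pairs in days, tjitt is assumed to be in minutes, the statistical scatter is drawn from a symmetric random fraction [-1,1] of rvstat, and the quoted error is assumed to be the scatter and rvsys added in quadrature.

```python
import math
import random


def simulate_rv(omega, amp, gam, texp, tjitt, rvstat, rvsys, nights, seed=None):
    """Return a list of (time, rv, rv_error) for a sinusoidal RV curve.

    omega is in 1/d, texp and tjitt in minutes, nights is a list of
    (start, end) times in days (all of these units are assumptions).
    """
    rng = random.Random(seed)
    step = texp / 1440.0  # exposure time in days
    rows = []
    for start, end in nights:
        t = start
        while t < end:
            t_obs = t + rng.uniform(-1.0, 1.0) * tjitt / 1440.0
            scatter = rng.uniform(-1.0, 1.0) * rvstat
            rv = gam + amp * math.sin(2.0 * math.pi * omega * t_obs) + scatter
            # rvsys acts as a floor so the quoted error never drops to zero.
            rows.append((t_obs, rv, math.hypot(scatter, rvsys)))
            t += step
    return rows
```

For example, nights=[(0.0, 2/24), (1.0, 1.0 + 1/24)] corresponds to 2h in the first night and 1h in the second.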
bootstrap
Runs a number of "bootstrap" simulations which can then be used with bootselect to evaluate the probability of a given alias. In brief, it creates a simulated data set with the same number of points as the original data set by selecting values at random from the original data. In that process, it is very likely to select the same original points more than once, and therefore to omit some of the other original data. This simulated data set is then subject to exactly the same analysis as the original data. The advantage of this method is that it makes no assumptions on e.g. the errors of the data - but you need to have at least 10 data points in your observations, otherwise it will not work well. To use this command, it is advised that you prepare your data, and run your favourite period analysis method before, so that both the data and the periodogram you wish to analyse are available.

bootstrap [start] [end] [nfreq] [nrun] [method] [datafile] [order] [cover]

where all parameters except [nrun] have exactly the same meaning as in period. [nrun] gives the number of bootstrap simulations, for a quick test you may run it with 10, for a real analysis, you should use at least 100. By analogy with period, only the first three parameters are mandatory. It is semi-obvious that you should run bootstrap with the same period analysis method as you used for the observed data.
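The resampling step at the heart of each bootstrap run is drawing with replacement. A minimal Python sketch (the function name is hypothetical, and sorting the resampled rows by time is a choice made here for convenience):

```python
import random


def bootstrap_sample(rows, rng=random):
    """Draw len(rows) rows at random, with replacement, and sort them by time.

    `rows` is a list of (time, value, error) tuples; some rows will appear
    more than once in the result and others not at all.
    """
    return sorted(rng.choice(rows) for _ in rows)
```

Each of the [nrun] simulated data sets produced this way would then be run through the same periodogram calculation as the observed data.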
bootselect
Allows you to interactively select an alias in the periodogram computed from the observed data, and evaluates the bootstrap simulations to give a likelihood that the selected alias is the correct period.

bootselect [method]

where [method] refers to the method used to compute the periodogram of the observed data and to run the bootstrap simulations. If no method is given, "scargle" will be used.
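One simple way to turn the bootstrap runs into a number is to count how often the strongest peak of a simulated periodogram falls inside the frequency range of the selected alias. The Python sketch below shows that counting; whether bootselect evaluates the simulations in exactly this way is an assumption, and the function name is hypothetical.

```python
def alias_probability(peak_frequencies, f_low, f_high):
    """Fraction of bootstrap runs whose highest peak lies within [f_low, f_high].

    `peak_frequencies` holds one value per bootstrap run: the frequency of
    the maximum power in that run's periodogram.
    """
    hits = sum(1 for f in peak_frequencies if f_low <= f <= f_high)
    return hits / len(peak_frequencies)
```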
predict_rv
Analyses a set of RV measurements, and projects the RVs for a set of interactively chosen aliases into the next nights, allowing you to pick the optimum time for the next observation to nail down the correct period.

predict_rv [datafile] [auto] [start] [end] [nfreq]

where [datafile] is the file containing the radial velocity measurements. The data format is the same as for prep_rv, which is in fact used here, i.e. time, dummy, RV, RV error. Dummy can be anything, time has to be here in HJD. If the second parameter is set to "auto", predict_rv will look up the first and the last HJD of the radial velocity data, and use that for plotting. Night/day shading will be turned off in this setup. For any other value of the second parameter predict_rv will use the setup of the observing run, as defined in the next paragraph. The parameters [start] [end] [nfreq] have the same meaning as in period.

Setup: you need to edit 'period.sm' and set 'ut_start' and 'ut_end' to the UT at the beginning/end of the night, include twilight time if you can use it, and remember that e.g. 20h30m = 20.5 UT. Also, set 'nnight' to the number of nights in your observing run.

The macro computes a Scargle periodogram, plots it, and asks the user to select any desired number of aliases to inspect (though more than ~5 will clutter the subsequent plots). Place the cursor above the alias you want to inspect, click left, continue for the next aliases, click right to finish. The macro will highlight the picked aliases by red dots and wants you to confirm your choice by typing 'y'. If you are unhappy, hit return, and choose your aliases again. Next, it runs sine fits with the selected frequencies as initial guess (using the MIDAS program sinefit.prg), and then calculates RV curves for the different aliases over the entire length of the run. You can now choose with the cursor (left click, then right click) to zoom into one night. Finally, you can click in this zoomed plot (left click, then right click), and you get on the SM prompt the corresponding HJD and UT. You probably want to select a time where the different RV curves are most discrepant. Click to the left of the y-axis to exit this mode, and predict_rv will finish by plotting RV folds over the N aliases you selected for inspection.

Example: download the RV data file for SDSS2149-0717, and run

prep_rv rv2149.dat

Comment in/out some of the lines in the RV file to see how the constraints on the aliases change. You will notice that the times of observations were not optimised in the first few nights; only in the last two nights did I develop a precursor of predict_rv to improve the timing of the observations. Credit to JKT for suggesting the algorithm selecting the closest alias.

Utilities
pconv
Converts a period plus its error from a given (s, min, h, d) unit into all common units. Tries to be clever and give the number of significant digits, but seems to give one digit too many...

pconv [period] [error] [unit]

e.g.

pconv 112.23 0.03 min
6733.80 +/- 1.80 s
112.230 +/- 0.030 min
1.87050 +/- 0.00050 h
0.077938 +/- 0.000021 d
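The conversion behind this output is a plain rescaling of both the period and its error. A minimal Python sketch, leaving out the significant-digit formatting (the function name matches the macro only for readability; the dictionary return value is a choice made here):

```python
SECONDS_PER_UNIT = {"s": 1.0, "min": 60.0, "h": 3600.0, "d": 86400.0}


def pconv(period, error, unit):
    """Convert a period and its error from `unit` into s, min, h and d."""
    factor = SECONDS_PER_UNIT[unit]
    return {
        u: (period * factor / sec, error * factor / sec)
        for u, sec in SECONDS_PER_UNIT.items()
    }
```

Calling pconv(112.23, 0.03, "min") gives values that agree with the example output above once rounded.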
fconv
Converts a frequency plus its error given in cycles/day to period plus error in all common units (s, min, h, d). The main use is along with the command 'detrend' which uses a sine fit (built on the MIDAS/TSA task SINEFIT/TSA) to get a best-fit period plus error - but the output of SINEFIT/TSA is in frequencies.

fconv [frequency] [error]

e.g.

fconv 5.2561279E+00  1.2328067E-03
16437.96 +/- 3.86 s
273.966 +/- 0.064 min
4.5661 +/- 0.0011 h
0.190254 +/- 0.000045 d
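The numbers in this example are consistent with the usual first-order error propagation for P = 1/f, i.e. sigma_P = sigma_f / f**2. The Python sketch below uses that relation; it is an illustration with a hypothetical function signature, not a transcription of the SM macro.

```python
def fconv(frequency, error):
    """Convert a frequency in 1/d (with error) to a period in s, min, h and d."""
    period_d = 1.0 / frequency
    error_d = error / frequency ** 2  # first-order propagation of the error
    per_day = {"s": 86400.0, "min": 1440.0, "h": 24.0, "d": 1.0}
    return {u: (period_d * k, error_d * k) for u, k in per_day.items()}
```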