Description
===========

:Class: `jwst.combine_1d.Combine1dStep`
:Alias: combine_1d

The ``combine_1d`` step computes a weighted average of 1-D spectra and writes
the combined 1-D spectrum as output.

The combination of spectra proceeds as follows.  For each pixel of each
input spectrum, the corresponding pixel in the output is identified
(based on wavelength), and the input value multiplied by the weight is
added to the output buffer.  Pixels that are flagged (via the DQ column)
with "DO_NOT_USE" will not contribute to the output.  After all input
spectra have been included, the output is normalized by dividing by
the sum of the weights.

The weight will typically be the integration time or the exposure time,
but uniform (unit) weighting can be specified instead.

The only part of this step that is not completely straightforward is the
determination of wavelengths for the output spectrum.  The output
wavelengths will be increasing, regardless of the order of the input
wavelengths.  In the ideal case, all input spectra would have wavelength
arrays that were very nearly the same.  In this case, each output
wavelength would be computed as the average of the wavelengths at the same
pixel in all the input files.  The ``combine_1d`` step is intended to handle a
more general case where the input wavelength arrays may be offset with
respect to each other, or they might not align well due to different
distortions.  All the input wavelength arrays will be concatenated and then
sorted.  The code then looks for "clumps" in wavelength, based on the
standard deviation of a slice of the concatenated and sorted array of input
wavelengths; a small standard deviation implies a clump.  In regions of
the spectrum where the input wavelengths overlap with somewhat random
offsets and don't form any clumps, the output wavelengths are computed
as averages of the concatenated, sorted input wavelengths taken N at a
time, where N is the number of overlapping input spectra at that point.

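
As a rough illustration of the accumulate-and-normalize procedure described
above, the following NumPy sketch combines a few spectra onto an output
wavelength grid that is taken as given.  It is not the step's implementation:
the function name ``combine_spectra``, the dictionary keys, the
nearest-neighbor matching of input pixels to output pixels, and the bit value
used for ``DO_NOT_USE`` are all assumptions made for this example.

.. code-block:: python

    import numpy as np

    DO_NOT_USE = 1  # assumed bit value of the DO_NOT_USE flag


    def combine_spectra(out_wl, spectra):
        """Weighted average of 1-D spectra on a given output wavelength grid.

        ``spectra`` is a list of dicts with the (hypothetical) keys
        "wavelength", "flux", "dq" and "weight".
        """
        out_wl = np.asarray(out_wl, dtype=float)
        flux_sum = np.zeros_like(out_wl)
        weight_sum = np.zeros_like(out_wl)
        n_input = np.zeros(out_wl.shape, dtype=int)

        for spec in spectra:
            wl = np.asarray(spec["wavelength"], dtype=float)
            flux = np.asarray(spec["flux"], dtype=float)
            dq = np.asarray(spec["dq"], dtype=int)

            # Flagged pixels do not contribute to the output.
            good = (dq & DO_NOT_USE) == 0

            # Simplification: match each input pixel to the nearest output pixel.
            idx = np.abs(wl[good, None] - out_wl[None, :]).argmin(axis=1)

            # Accumulate weight * value, the weights, and the contributor count.
            np.add.at(flux_sum, idx, flux[good] * spec["weight"])
            np.add.at(weight_sum, idx, spec["weight"])
            np.add.at(n_input, idx, 1)

        # Normalize by the sum of the weights; pixels with no input stay NaN.
        combined = np.full_like(out_wl, np.nan)
        has_data = weight_sum > 0
        combined[has_data] = flux_sum[has_data] / weight_sum[has_data]
        return combined, weight_sum, n_input

Passing the same value of ``"weight"`` for every spectrum corresponds to
uniform weighting; passing the exposure or integration time gives longer
exposures proportionally more influence on the result.
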
Input
=====

An association file specifies which file or files to read for the input
data.  Each input data file contains one or more 1-D spectra in table
format, e.g. as written by the ``extract_1d`` step.  Each input data file will
ordinarily be in MultiSpecModel format (which can contain more than
one spectrum).

The association file should have an object called "products", which is
a one-element list containing a dictionary.  This dictionary contains
two entries (at least), one with key "name" and one with key "members".
The value for key "name" is a string, the name that will be used as a
basis for creating the output file name.  "members" is a list of
dictionaries, each of which contains one input file name, identified by
key "expname".

Output
======

For most modes, the output will be in CombinedSpecModel format, with a table
extension having the name COMBINE1D.  This extension will have eight columns,
giving the wavelength, flux, error estimate for the flux, surface brightness,
error estimate for the surface brightness, the combined data quality flags,
the sum of the weights that were used when combining the input spectra,
and the number of input spectra that contributed to each output pixel.

For WFSS modes, which may have hundreds or thousands of spectra from different
sources, the output will be in WFSSMultiCombinedSpecModel format.  This model
differs from the other MultiCombinedSpecModel classes in that it is designed to
hold all the spectra in a WFSS observation in a single "flat" table format.
Therefore, there is only one item per spectral order in the ``spec`` list, and
each object in the ``spec`` list has a ``spec_table`` attribute that contains
the spectral data and metadata for all sources in the observation.
The spectral table for this model contains the same columns as the
``CombinedSpecModel``, but each row in the table contains the combined spectrum
for a single source.
The spectral columns are 2D: each row is a 1D vector containing all data points
for the spectrum.  In addition, the spectral tables for this model have extra
1D columns to contain the metadata for the spectrum in each row.  These
metadata fields include: SOURCE_ID, N_ALONGDISP, SOURCE_TYPE, SOURCE_RA,
SOURCE_DEC.

Note that the vector columns have the same length for all the sources in the
table, meaning that the number of elements in the table rows is set by the
spectrum with the most data points.  The other spectra are NaN-padded to match
the longest spectrum, and the number of valid data points for each spectrum is
recorded in the N_ALONGDISP column.

For example, to access the wavelength and flux for a specific source ID
(say, 1200) in a WFSSMultiCombinedSpecModel:

.. doctest-skip::

  >>> from stdatamodels.jwst import datamodels
  >>> model = datamodels.open('multi_wfss_c1d.fits')
  >>> # One item per spectral order; take the first one
  >>> spec_this_order = model.spec[0]
  >>> print(spec_this_order.spectral_order)  # prints e.g. 1
  >>> tab = spec_this_order.spec_table
  >>> # Select the table row for the requested source
  >>> row_want = tab[tab["SOURCE_ID"] == 1200][0]
  >>> # Keep only the valid (non-padded) elements of the vector columns
  >>> nelem = row_want["N_ALONGDISP"]
  >>> wave, flux = row_want["WAVELENGTH"][:nelem], row_want["FLUX"][:nelem]
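
The NaN-padded layout described above can be illustrated with plain NumPy,
without any JWST data.  This is only a sketch of the storage convention, not
code taken from the pipeline; the helper name ``pad_spectra`` and the example
values are hypothetical.

.. code-block:: python

    import numpy as np


    def pad_spectra(spectra):
        """Stack variable-length 1-D arrays into one NaN-padded 2-D array."""
        # Number of valid points per spectrum (the role played by N_ALONGDISP).
        n_valid = np.array([len(s) for s in spectra])
        # Row length is set by the longest spectrum; unused slots stay NaN.
        padded = np.full((len(spectra), n_valid.max()), np.nan)
        for i, s in enumerate(spectra):
            padded[i, : len(s)] = s
        return padded, n_valid


    flux_rows, n_alongdisp = pad_spectra([[1.0, 2.0, 3.0], [4.0, 5.0]])

    # Recover the valid part of the second row from its count of valid points.
    second = flux_rows[1, : n_alongdisp[1]]

Slicing with the stored count, rather than filtering on ``np.isnan``, keeps any
NaN values that are genuinely part of a spectrum from being mistaken for
padding.
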