diff --git a/Documentation/WMPL_Upgrades_2026April.md b/Documentation/WMPL_Upgrades_2026April.md
new file mode 100644
index 00000000..848b1d0b
--- /dev/null
+++ b/Documentation/WMPL_Upgrades_2026April.md
@@ -0,0 +1,218 @@
+# WMPL Upgrades
+by Mark McIntyre, April 2026
+
+## Key points
+
+- Added new operation mode to create candidates.
+- Added distributed processing for candidates and phase1 solutions.
+- Added checks for duplicate transactions.
+- Replaced JSON database with SQLite databases.
+- Slight change to command-line parameters.
+
+### Operation Modes
+
+The updated solver now has three core operational modes, numbered 4, 1, and 2. In the code these are MCMODE_CANDS, MCMODE_PHASE1 and MCMODE_PHASE2. The previous mode 1 has been split into two stages, numbered 4 and 1, as explained below.
+
+See the end of this document for examples of how each mode can be used.
+
+Here's what each mode does.
+
+- In mcmode 4, the solver finds and saves candidate groups of observations.
+  During this phase, unpaired observations are loaded and candidate groups found. Observations are excluded if they're already marked as paired in the observations database, and potential candidate groups are also checked against the candidate database to avoid reanalysing combinations that were already found. Remaining new candidates are then added to the candidate database and saved to disk.
+
+- In mcmode 1, the solver loads candidates created by the previous step and attempts to find a simple solution.
+  If successful, the trajectory is saved to disk and a copy placed in the 'phase1' folder for further analysis, while the trajectory and observations databases are updated accordingly. If unsuccessful, the trajectory is added to the list of failed trajectories in the trajectories database.
+
+- In mcmode 2, the solver loads phase1 solutions and performs Monte-Carlo analysis. This mode is unchanged from the previous version.
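Because the three mode numbers (4, 1 and 2) are distinct powers of two, a mode value can be read as a set of bit flags. The following is a minimal sketch of that idea, not the solver's actual code: the constant names mirror those mentioned above, while `SUPPORTED_MODES` and `stages_for` are hypothetical names introduced for illustration. The fallback to 7 follows the rule stated in this document that 0 and any unrecognised value behave like 7.

```python
# Values as described in this document; names mirror the constants it mentions.
MCMODE_PHASE1 = 1
MCMODE_PHASE2 = 2
MCMODE_CANDS = 4
MCMODE_ALL = MCMODE_CANDS | MCMODE_PHASE1 | MCMODE_PHASE2  # 7

# Hypothetical: the single modes plus the documented combinations 3, 5 and 7.
SUPPORTED_MODES = {1, 2, 4, 3, 5, 7}


def stages_for(mcmode):
    """Return the stages a given mcmode value would run, in execution order."""
    # 0 and any unsupported value are treated as "run everything"
    if mcmode not in SUPPORTED_MODES:
        mcmode = MCMODE_ALL

    stages = []
    if mcmode & MCMODE_CANDS:
        stages.append("candidates")
    if mcmode & MCMODE_PHASE1:
        stages.append("phase1")
    if mcmode & MCMODE_PHASE2:
        stages.append("phase2")
    return stages


if __name__ == "__main__":
    for mode in (4, 1, 2, 3, 5, 7, 0):
        print(mode, stages_for(mode))
```

Keeping the stage order fixed (candidates, then phase 1, then phase 2) reflects the fact that each stage consumes the output of the one before it.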
+
+Some bitwise combinations of modes are permitted, as shown in the table below:
+
+| Value | Effect | Example Use |
+| ------------------- | ------------------------------------------------------------ | ------------------------------------------- |
+| 3 (MCMODE_BOTH) | Runs modes 1+2, i.e. loads and fully solves candidates. | UKMON currently uses this mode. |
+| 5 (MCMODE_SIMPLE) | Runs modes 4+1, i.e. creates phase 1 solutions from scratch. | GMN currently uses this mode. |
+| 7 (MCMODE_ALL) | Equivalent to 0 or passing no mcmode. | Typically used during manual data analysis. |
+| Any other value | Treated as a value of 7. | |
+
+Note that in modes 0, 3, 5 and 7, intermediate files (i.e. candidates and phase1 files) are not saved to disk.
+
+### Single-Server Continuous Processing
+In single-server continuous processing mode, run three instances of the solver, one in each of modes 4, 1, and 2.
+
+In testing, we found that a 16-core server could create around 6000 candidates per hour, while also processing 1000 candidates per hour to get phase1 solutions, and processing around 200 phase1 solutions per hour to get full phase2 Monte-Carlo solutions.
+
+We therefore suggest running the mode 4 solver with a frequency of 15 minutes (i.e. with flags `-a` and `--autofreq 15`). Using a short interval ensures rapid identification and update of candidates as new data arrives on the server.
+
+Meanwhile, modes 1 and 2 can be run with a period of around 30 minutes (`--autofreq 30`), and should be restricted to around 200 trajectories at a time, using `--maxtrajs 200`. This ensures that data are processed in reasonable batches and written/updated on disk promptly. It also ensures data are available for distributed processing as explained in the next section.
+
+### Distributed Processing
+
+The solver supports distribution of both candidates and phase1 solutions.
+
+To enable distributed processing, we require one master node and one or more child nodes.
+
+On the master, we set up the system as for single-server continuous processing, then create a configuration file '**wmpl_remote.cfg**' in the same folder as the databases. The content of the configuration file is explained below and a sample file is included in the repository.
+
+On each child, we also create a configuration file (see 'Child Node Configuration' below). Child nodes can run in modes 1 or 2, collecting relevant data from the master node and uploading the results upon completion.
+
+SFTP is used to move data between master and child, and each child must therefore have an SFTP account on the server hosting the master.
+
+Data are written into a 'files' folder in the SFTP account's home directory, and therefore the account running the master instances of the solver must be able to read from, write to and create folders in a "files" directory in the children's home directories. On my test server I achieved this with POSIX ACLs and Unix group membership.
+
+Additionally, the solver itself sets permissions on files and subfolders, and these should not be altered.
+
+The required folder structure for one node is shown below.
+
+![image](node_structure.png)
+
+**Master Node Configuration**
+
+The configuration file for the master node specifies the child nodes that are available, the capacity of each node, and the mode that it's operating in (mode 1 or 2; no other mode is supported).
+
+The capacity value can be any integer, with zero meaning the node is disabled and a negative value meaning the node has no capacity limit. Some experimentation may be required to determine the optimal capacity, but in testing I found that my 4-core i7 desktop could process 200 candidates in around 30 minutes, and 200 Monte-Carlo solutions in around three hours. Limits of 500 for mode 1 and 200 for mode 2 therefore seem reasonable.
+
+Example master-mode configuration file:
+
+\[mode\]
+mode = master
+\[children\]
+node1 = /home/node1, 500, 1
+node2 = /home/node2, 200, 2
+node3 = /home/node3, 0, 1
+
+This indicates that:
+
+- Node 1 is running in mode 1 and has a capacity of 500.
+- Node 2 is running in mode 2 and has a capacity of 200.
+- Node 3 is currently disabled (capacity zero) and will not be assigned data.
+
+If we bring node 3 online, we can change the capacity from zero to some suitable value, and the master will begin assigning candidates to it (see 'Dynamically Adding Nodes' below).
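The `[children]` entries above each pack three values into one line: the child's home folder, its capacity and its mode. As a rough illustration of how such a file could be read, here is a minimal sketch using Python's standard `configparser`. It is not the solver's own parser: `parse_children` and `can_accept` are hypothetical helpers, and treating "at capacity" as "assigned count has reached the capacity value" is an assumption.

```python
import configparser

# Sample content copied from the master configuration example above
SAMPLE_CFG = """
[mode]
mode = master
[children]
node1 = /home/node1, 500, 1
node2 = /home/node2, 200, 2
node3 = /home/node3, 0, 1
"""


def parse_children(cfg_text):
    """Hypothetical helper: map node name -> dict of path, capacity and mode."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)

    children = {}
    for name, value in parser["children"].items():
        # each value has the form "<home folder>, <capacity>, <mode>"
        path, capacity, mode = [part.strip() for part in value.split(",")]
        children[name] = {"path": path, "capacity": int(capacity), "mode": int(mode)}
    return children


def can_accept(child, assigned):
    """Hypothetical check: zero disables the node, a negative value means no limit."""
    capacity = child["capacity"]
    if capacity == 0:
        return False
    if capacity < 0:
        return True
    # assumption: a node is "at capacity" once it holds `capacity` items
    return assigned < capacity


if __name__ == "__main__":
    for name, child in parse_children(SAMPLE_CFG).items():
        print(name, child, "accepting" if can_accept(child, 0) else "disabled")
```

Re-parsing the file on every pass, rather than caching the result, is what would let capacity changes take effect without a restart.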
+
+If no nodes are available, or if all nodes are at capacity, any remaining data will be assigned to the master node.
+
+The master will also stop assigning data to a node if a special file named "stop" is present in the files folder of the child's SFTP home directory. The child nodes create this file when shutting down, but it can also be created manually if necessary.
+
+Furthermore, if data has not been picked up by a child within six hours, it will be reassigned to the master node. This ensures that data is not left unprocessed if, for example, a node crashes unexpectedly or is running very slowly.
+
+**Dynamically Adding Nodes**
+
+The master instance of the solver re-reads the remote configuration file on each loop, and so nodes can be added, removed, disabled or enabled on demand, without needing to restart the master.
+
+So, for example, one could create a configuration listing several child nodes with capacity set to zero, which would mean they were initially disabled. The solver would thus assign all candidates to the master node. However, if volumes rose, an instance of the solver could be started up on a child node and the master configuration file updated. On the candidate-finding process's next loop, the file would be reloaded and data would be automatically assigned to the child.
+
+You can also _manually_ move files between child node folders on the server. For instance, if you want to move some load from node1 to node2, you can move some of the candidate files from node1's _candidates_ folder to node2's _candidates_ folder. The UNIX command to move 100 candidates from node 1 to node 2 would be:
+
+_ls -1 ~node1/files/candidates | head -100 | while read i ; do mv ~node1/files/candidates/"\$i" ~node2/files/candidates/; done_
+
+**Processing Uploaded Data**
+
+On each loop, the master node will scan each node's home directory for uploaded results. These will be integrated into the trajectories data and the databases updated.
After integration, a deduplication routine is run. This handles edge cases where a new version of a trajectory is created by a child node.
+
+**Child Node Configuration**
+
+The child must be running in mode 1 or 2 (no other mode is supported at present) and should be run with the `-a --autofreq 15` flags, to ensure that it continues to collect and process data promptly.
+
+The child configuration file specifies the server, user and key to use for connections to the master node. Port is optional but can be specified if a non-standard SFTP port is in use.
+
+\[mode\]
+mode = child
+\[children\]
+host = testserver.somewhere.com
+user = node1
+key = ~/.ssh/somekey
+port = 22
+
+At startup, the child node will connect to the master and remove the "stop" file, if present. This indicates to the master that it is "open for business". The child will then download all assigned data and begin processing it. Downloaded files are moved to a subfolder _processed_ on the SFTP server. Upon completion it will upload the results to the SFTP server, then look for new data to process.
+
+Note that the child will download *all* assigned data, and the `maxtrajs` parameter should not be used when launching WMPL on the child. If this parameter is set, there's a risk that a backlog of downloaded-but-unprocessed data will build up. For example, if the child's capacity is set to 200 on the master node, but the child is limited to 100 via `maxtrajs`, then it will download 200 on each pass, but only process and upload 100.
+
+
+**Stopping a Child Node**
+
+Any node can be terminated by pressing Ctrl-C or by sending SIGINT to its process. The node will stop processing immediately and create a "stop" file on the SFTP server.
+
+Note that termination will leave data incompletely processed and no upload will take place, so it is advisable to wait until the child's logfile indicates it is idle.
+
+If this is not possible, one can identify the most recent, potentially incomplete, data set that was assigned to the node by looking in the child's _processed_ folders on the server, and copying the data back to the master node's _candidates_ or _phase1_ folders as appropriate.
+
+**Recovering from a Child Node Crash or Shutdown**
+
+If a child node crashes or is otherwise terminated during processing, the data can be recovered and redistributed to the master or other nodes, or indeed to the failed node after it has restarted.
This can be done by looking in the _processed_ folders on the child or, if the child node is unavailable, in the child's _processed_ folders on the master node, identifying the most recent data, and moving it as necessary.
+
+## Duplicate Transaction Checks
+
+A check has been introduced in both candidate finding and phase1 solving that examines the database for potential duplicate or mergeable trajectories.
+
+Duplicates are defined as trajectories that contain the same observations. When detected, the solution with the fewest ignored observations is retained and the duplicates are deleted from the database and disk.
+
+Mergeable trajectories are defined as those with at least one common observation. In principle these should never arise, but in practice, with a distributed processing model, they can. For example, a candidate might be found and handed off for solving, but while it is still being solved a new observation might be uploaded by a camera, and so on its next pass the candidate finder creates a second candidate with an additional observation and a different reference timestamp. When detected, the better of the two mergeable trajectories is retained.
+
+Aside: A better solution may be to remove both solutions and allow the candidate finder to identify a single combined candidate on its next pass. This is still being investigated.
+
+## Databases
+
+The JSON database has been replaced by three SQLite databases, one for Observations, one for Trajectories and one for Candidates.
+
+This approach was taken because most trajectory and observation database writing takes place during phase 1 solving, but some takes place during candidate finding, notably when reprocessing previous trajectories with new observations. By splitting the databases, we minimise potential concurrent write situations. SQLite does not support multiple simultaneous writes, and though it will back off and retry after a few milliseconds, it is preferable to avoid unnecessary delays.
Additionally, it simplifies the process of merging in data from child nodes and avoids potential deadlocks when doing so.
+
+**If The Solver Crashes**
+
+Although most operations are immediately committed to the databases, it is possible for the solver to crash and leave an incomplete transaction. This is indicated by a leftover write-ahead log in the database directory, e.g. "observations.db-wal".
+
+If this file is present, then upon next startup SQLite will recover any committed changes still held in the write-ahead log and discard any incomplete transaction. This minimises the risk of data loss, but at worst may lead to observations being reprocessed. This is preferable to trajectories being missed.
+
+**The Legacy JSON database**
+
+The legacy JSON database is not deleted. However, after the initial data migration described below, it is no longer used and can be moved to long-term storage if desired.
+
+**Initial Population of SQLite**
+
+When the solver is started up, it checks for the existence of the SQLite databases. If they are not present, it creates them and prepopulates them with the last few days of data from the old JSON database, if available. For example, if run with the auto flag and the default lookback period of 5 days, the last five days of data will be copied to SQLite. This ensures that sufficient observation and failed trajectory data is present for normal operation of the solver.
+
+The JSON database is then closed and is not used again in any future pass of the solver. It is not truncated, archived or deleted and remains as an historical record of the state of the database as at the cutover date.
+
+**Historic Reruns**
+
+If the solver is rerun for an historic period from before the cutover, there will be no paired observations or failed trajectories data in the databases. The assumption is that if we are rerunning for an historic period, we are either looking to integrate new observations into the dataset or to recalculate trajectories using improved mathematical models.
In either case it seems likely we'd want to start by reanalysing the raw data.
+
+That said, should we wish to copy historical data into the SQLite databases, this can be done with the command-line interface to CorrelateDB as shown below:
+
+_python -m wmpl.Trajectory.CorrelateDB --dir_path rms_data --action copy --timerange "(20251215-000000,20251222-000000)"_
+
+This will copy observations and failed trajectories into SQLite from the JSON database in _rms_data_ for the date range 2025-12-15 to 2025-12-22, creating the SQLite databases if necessary.
+
+This is quite a slow operation - on my 4-core i7 desktop it takes several minutes to copy a week's worth of data.
+
+## Command Line Options
+
+One option has been removed and two new options added.
+
+Removed:
+
+- **\--remotehost**: this has been superseded by the remote configuration file.
+
+Added:
+
+- **\--addlogsuffix**: default false - this adds a suffix to the logfile to indicate which phase is being run.
+  For example, with this flag passed, the logfile for a run in mode 4 (candidate finding) would be something like _correlate_rms_20260214_121314_cands.log_ whereas a phase-1 log file would be _correlate_rms_20260214_121314_simple.log_.
+
+- **\--archivemonths**: default 0 - this specifies the number of months' data to keep in the databases. Data older than this number of months will be archived. A value of zero means purge rather than archive. At least 21 days' data will always be kept, and if a time-range is specified when calling CorrelateRMS, then only data older than this will be archived or purged.
+
+## Examples of using the Modes
+
+In the examples below, raw camera data is in `$DATADIR`, and the output and any intermediate files are being written to `$TARGDIR`. The `--addlogsuffix` parameter has been used to make sure the logfiles are uniquely named.
+
+* Run an instance in candidate-finding mode, looking back at the last three days' data and automatically rescanning the data every 15 minutes (or when the last pass completed).
+``` bash
+python -m wmpl.Trajectory.CorrelateRMS $DATADIR -a 3 --autofreq 15 --mcmode 4 --cpucores 4 --logdir $TARGDIR/logs --outdir $TARGDIR --dbdir $TARGDIR --addlogsuffix
+```
+
+* Run an instance in phase-1 solver mode, consuming at most 500 candidates at a time and automatically rescanning the raw data every 30 minutes (or when the last pass completed). Note that it's not necessary to specify a value for `-a` as the solver will consume any available candidates.
+``` bash
+python -m wmpl.Trajectory.CorrelateRMS $DATADIR -a --autofreq 30 --mcmode 1 --cpucores 4 --logdir $TARGDIR/logs --outdir $TARGDIR --dbdir $TARGDIR --maxtrajs 500 --addlogsuffix
+```
+
+* Run an instance in phase-2 solver mode, consuming at most 200 phase-1 solutions at a time and automatically rescanning the phase 1 data every 30 minutes (or when the last pass completed). Note that it's not necessary to specify a value for `-a` as the solver will consume any available data.
+``` bash
+python -m wmpl.Trajectory.CorrelateRMS $DATADIR -a --autofreq 30 --mcmode 2 --cpucores 4 --logdir $TARGDIR/logs --outdir $TARGDIR --dbdir $TARGDIR --maxtrajs 200 --addlogsuffix
+```
\ No newline at end of file
diff --git a/Documentation/node_structure.png b/Documentation/node_structure.png
new file mode 100644
index 00000000..212cc11e
Binary files /dev/null and b/Documentation/node_structure.png differ
diff --git a/wmpl/Rebound/REBOUND.py b/wmpl/Rebound/REBOUND.py
index 92ef3330..75620897 100644
--- a/wmpl/Rebound/REBOUND.py
+++ b/wmpl/Rebound/REBOUND.py
@@ -14,7 +14,7 @@
     REBOUND_FOUND = True
 except ImportError:
-    print("REBOUND package not found. 
Install REBOUND and reboundx packages to use the REBOUND functions.") + # don't print a message here as its already printed whenever REBOUND_FOUND is False REBOUND_FOUND = False from wmpl.Utils.TrajConversions import ( diff --git a/wmpl/Trajectory/CorrelateDB.py b/wmpl/Trajectory/CorrelateDB.py new file mode 100644 index 00000000..d7a2262f --- /dev/null +++ b/wmpl/Trajectory/CorrelateDB.py @@ -0,0 +1,1092 @@ +# The MIT License + +# Copyright (c) 2024 Mark McIntyre + +# Permission is hereby granted, free of charge, to any person obtaining a copy +# of this software and associated documentation files (the "Software"), to deal +# in the Software without restriction, including without limitation the rights +# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +# copies of the Software, and to permit persons to whom the Software is +# furnished to do so, subject to the following conditions: + +# The above copyright notice and this permission notice shall be included in +# all copies or substantial portions of the Software. + +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +# THE SOFTWARE. 
+ +""" Python scripts to manage the WMPL SQLite databases +""" +import os +import sqlite3 +import logging +import logging.handlers +import argparse +import datetime +import json +import numpy as np + +from wmpl.Utils.TrajConversions import datetime2JD, jd2Date + + +log = logging.getLogger("traj_correlator") + +############################################################ +# classes to handle the Observation and Trajectory databases +############################################################ + + +class ObservationsDatabase(): + """ + A class to handle the sqlite observations database transparently. + """ + + def __init__(self, db_path, db_name='observations.db', purge_records=False, verbose=False): + """ + Create an observations database instance + + Parameters: + db_path : path to the location of the database + db_name : name to use, typically observations.db + purge_records : boolean, if true then delete any existing records + + """ + db_full_name = os.path.join(db_path, f'{db_name}') + if verbose: + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + self.dbhandle = con + con.execute('pragma journal_mode=wal') + if purge_records: + con.execute('drop table if exists paired_obs') + res = con.execute("SELECT name FROM sqlite_master WHERE name='paired_obs'") + if res.fetchone() is None: + if verbose: + log.info('create table paired_obs') + con.execute("CREATE TABLE paired_obs(obs_id VARCHAR(36) UNIQUE, obs_dt REAL, status INTEGER)") + self._commitObsDatabase() + + def _commitObsDatabase(self): + """ + Commit the obs db. 
This function exists so we can do lazy writes + """ + self.dbhandle.commit() + try: + self.dbhandle.execute('pragma wal_checkpoint(TRUNCATE)') + except Exception: + self.dbhandle.execute('pragma wal_checkpoint(PASSIVE)') + return + + def closeObsDatabase(self): + """ + Close the database, making sure we commit any pending updates + """ + + if self.dbhandle: + self._commitObsDatabase() + self.dbhandle.close() + self.dbhandle = None + return + + def checkObsPaired(self, obs_id, verbose=False): + """ + Check if an observation is already marked paired + return True if there is an observation with the correct obs id and with status = 1 + + Parameters: + obs_id : observation ID to check + + Returns: + True if paired, False otherwise + """ + + paired = True + cur = self.dbhandle.execute(f"SELECT obs_id FROM paired_obs WHERE obs_id='{obs_id}' and status=1") + if cur.fetchone() is None: + paired = False + if verbose: + log.info(f'{obs_id} is {"Paired" if paired else "Unpaired"}') + return paired + + def addPairedObservations(self, obs_ids, jdt_refs, verbose=False): + """ + Add or update a list of observations paired, setting status = 1 + + Parameters: + obs_ids : list of observation IDs + jdt_refs : list of julian reference dates of the observations + """ + + vals_str = ','.join(map(str,[(id, dt, 1) for id,dt in zip(obs_ids,jdt_refs)])) + + if verbose: + log.info(f'adding {obs_ids} to paired_obs table') + try: + self.dbhandle.execute(f"insert or replace into paired_obs values {vals_str}") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to add {obs_ids} to paired_obs table') + return False + + return + + def addPairedObs(self, obs_id, jdt_ref, verbose=False): + """ + Add or update a single entry in the database to mark an observation paired, setting status = 1 + + Parameters: + obs_id : observation ID + jdt_ref : julian reference date of the observation + """ + + if verbose: + log.info(f'adding {obs_id} to paired_obs table') + try: + 
self.dbhandle.execute(f"insert or replace into paired_obs values ('{obs_id}', {jdt_ref}, 1)") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to add {obs_id} to paired_obs table') + return False + + def unpairObs(self, obs_ids, verbose=False): + """ + Mark an observation unpaired. + If an entry exists in the database, update the status to 0. + ** Currently unused. ** + + Parameters: + met_obs_list : a list of observation IDs + """ + obs_ids_str = ','.join(f"'{id}'" for id in obs_ids) + + if verbose: + log.info(f'unpairing {obs_ids_str}') + try: + self.dbhandle.execute(f"update paired_obs set status = 0 where obs_id in ({obs_ids_str})") + self.dbhandle.commit() + return True + except Exception: + log.warning(f'failed to unpair {obs_ids_str}') + return False + + def getLinkedObservations(self, jdt_ref): + """ + Return a list of observation IDs linked with a trajectory based on the jdt_ref of the traj + + Parameters + jdt_ref : the julian reference date of the trajectory + + """ + cur = self.dbhandle.execute(f"SELECT obs_id FROM paired_obs WHERE obs_dt={jdt_ref} and status=1") + return [x[0] for x in cur.fetchall()] + + def archiveObsDatabase(self, db_path, arch_prefix, archdate_jd): + """ + archive records older than archdate_jd to a database {arch_prefix}_observations.db + + Parameters: + db_path : path to the location of the archive database + arch_prefix : prefix to apply - typically of the form yyyymm. Set this to None to purge without archiving. + archdate_jd : julian date before which to archive data. Set this to None to purge anything older than 21 days. 
+ """ + + if archdate_jd is None: + archdate = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + archdate_jd = datetime2JD(archdate) + + purge_ok = True + log.info(f'{"Archiving" if arch_prefix else "Purging"} observations database') + if arch_prefix: + # create the database if it doesnt exist + archdb_name = f'{arch_prefix}_observations.db' + archdb = ObservationsDatabase(db_path, archdb_name) + archdb.closeObsDatabase() + + # attach the arch db, copy the records then delete them + archdb_fullname = os.path.join(db_path, f'{archdb_name}') + self.dbhandle.execute(f"attach database '{archdb_fullname}' as archdb") + try: + self.dbhandle.execute(f'insert or replace into archdb.paired_obs select * from paired_obs where obs_dt < {archdate_jd}') + except Exception: + log.warning('unable to archive observations database') + purge_ok = False + + if purge_ok: + self.purgeObsDatabase(archdate_jd=archdate_jd) + return + + def purgeObsDatabase(self, archdate_jd=None): + """ + purge records from before a specified julian date. + + parameters: + archdate_jd : julian date before which to purge. Default None will purge records more than 21 days old + + """ + if archdate_jd is None: + archdate = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + archdate_jd = datetime2JD(archdate) + + cur = self.dbhandle.execute(f'select count(*) from paired_obs where obs_dt < {archdate_jd}') + res = cur.fetchone() + count = res[0] if res else 0 + log.info(f' purging {count} records from paired_obs') + self.dbhandle.execute(f'delete from paired_obs where obs_dt < {archdate_jd}') + self.dbhandle.commit() + return + + def copyObsJsonRecords(self, paired_obs, dt_range): + """ + Copy data from the legacy Json database to the new database between the dates specified in dt_range. + Note that copying large date ranges will be extremely slow. + + Parameters: + paired_obs : a json list of paired observations from the old database. 
+ dt_range : a date range to operate on. + + """ + # only copy recent observations since + dt_end = dt_range[1] + dt_beg = dt_range[0] + + log.info('-----------------------------') + log.info('moving recent observations to sqlite - this may take some time....') + log.info(f'observation date range {dt_beg.isoformat()} to {dt_end.isoformat()}') + + i = 0 + keylist = paired_obs.keys() + for stat_id in keylist: + for obs_id in paired_obs[stat_id]: + try: + obs_date = datetime.datetime.strptime(obs_id.split('_')[1], '%Y%m%d-%H%M%S.%f') + except Exception: + obs_date = datetime.datetime(2000,1,1,0,0,0) + obs_date = obs_date.replace(tzinfo=datetime.timezone.utc) + + if obs_date >= dt_beg and obs_date < dt_end: + self.addPairedObs(obs_id, datetime2JD(obs_date)) + i += 1 + if not i % 100000 and i != 0: + log.info(f'moved {i} observations') + self.dbhandle.commit() + log.info(f'done - moved {i} observations') + log.info('-----------------------------') + return + + def mergeObsDatabase(self, source_db_path): + """ + Merge in records from another database 'source_db_path', for example from a remote node + + Parameters: + source_db_path : full name and path to the source database to merge from + """ + + if not os.path.isfile(source_db_path): + log.warning(f'source database missing: {source_db_path}') + return + # attach the other db, copy the records then detach it + self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb") + res = self.dbhandle.execute("SELECT name FROM sourcedb.sqlite_master WHERE name='paired_obs'") + if res.fetchone() is None: + # table is missing so nothing to do + status = True + else: + try: + self.dbhandle.execute('insert or replace into paired_obs select * from sourcedb.paired_obs') + status = True + except Exception as e: + log.info(f'unable to merge child observations from {source_db_path}') + log.info(e) + status = False + + self.dbhandle.commit() + self.dbhandle.execute("detach database 'sourcedb'") + return status + + 
+############################################################ + + +class TrajectoryDatabase(): + """ + A class to handle the sqlite trajectory database transparently. + """ + + def __init__(self, db_path, db_name='trajectories.db', purge_records=False, verbose=False): + """ + initialise the trajectory database + + Parameters: + db_path : path to the location to store the database + db_name : database name + purge_records : boolean, if true, delete any existing records + """ + + db_full_name = os.path.join(db_path, f'{db_name}') + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + if purge_records: + con.execute('drop table if exists trajectories') + con.execute('drop table if exists failed_trajectories') + con.commit() + res = con.execute("SELECT name FROM sqlite_master WHERE name='trajectories'") + if res.fetchone() is None: + if verbose: + log.info('create table trajectories') + con.execute("""CREATE TABLE trajectories( + jdt_ref REAL UNIQUE, + traj_id VARCHAR UNIQUE, + traj_file_path VARCHAR, + participating_stations VARCHAR, + ignored_stations VARCHAR, + radiant_eci_mini VARCHAR, + state_vect_mini VARCHAR, + phase_1_only INTEGER, + v_init REAL, + gravity_factor REAL, + v0z REAL, + v_avg REAL, + rbeg_jd REAL, + rend_jd REAL, + rbeg_lat REAL, + rbeg_lon REAL, + rbeg_ele REAL, + rend_lat REAL, + rend_lon REAL, + rend_ele REAL, + obs_ids VARCHAR, + ign_obs_ids VARCHAR, + status INTEGER) """) + + res = con.execute("SELECT name FROM sqlite_master WHERE name='failed_trajectories'") + if res.fetchone() is None: + # note: traj_id not set as unique as some fails will have traj-id None + if verbose: + log.info('create table failed_trajectories') + con.execute("""CREATE TABLE failed_trajectories( + jdt_ref REAL UNIQUE, + traj_id VARCHAR, + traj_file_path VARCHAR, + participating_stations VARCHAR, + ignored_stations VARCHAR, + radiant_eci_mini VARCHAR, + state_vect_mini VARCHAR, + phase_1_only INTEGER, + v_init REAL, + gravity_factor REAL, 
+ obs_ids VARCHAR, + ign_obs_ids VARCHAR, + status INTEGER) """) + + con.commit() + self.dbhandle = con + return + + def _commitTrajDatabase(self, verbose=False): + """ + commit the traj db. + This function exists so we can do lazy writes in some cases + """ + + if verbose: + log.info('commit trajdb') + self.dbhandle.commit() + return + + def closeTrajDatabase(self, verbose=False): + """ + close the database, making sure we commit any pending updates + """ + + if verbose: + log.info('close trajdb') + if self.dbhandle: + self._commitTrajDatabase(verbose=verbose) + self.dbhandle.close() + self.dbhandle = None + return + + def checkTrajIfFailed(self, traj_reduced, verbose=False): + """ + Check if a Trajectory was marked failed + + Parameters: + traj_reduced : a TrajReduced object + + Returns + True if there is a failed trajectory with the same jdt_ref and matching list of stations + """ + + if not hasattr(traj_reduced, 'jdt_ref') or not hasattr(traj_reduced, 'participating_stations') or not hasattr(traj_reduced, 'ignored_stations'): + return False + + found = False + station_list = list(set(traj_reduced.participating_stations + traj_reduced.ignored_stations)) + res = self.dbhandle.execute(f"SELECT traj_id,participating_stations, ignored_stations FROM failed_trajectories WHERE jdt_ref={traj_reduced.jdt_ref} and status=1") + row = res.fetchone() + if row is None: + found = False + else: + traj_stations = list(set(json.loads(row[1]) + json.loads(row[2]))) + found = True if (traj_stations == station_list) else False + return found + + def addTrajectory(self, traj_reduced, failed=False, force_add=True, verbose=False): + """ + add or update an entry in the database, setting status = 1 + + Parameters: + traj_reduced : a TrajReduced object + failed : boolean, if true, add the traj to the fails list + + Returns: + true if the trajectory was added, false if it exists already + + """ + + tblname = 'failed_trajectories' if failed else 'trajectories' + + # if force_add is false, 
don't replace any existing entry + if not force_add and hasattr(traj_reduced, 'traj_id') and traj_reduced.traj_id is not None: + res = self.dbhandle.execute(f'select traj_id from {tblname} where status = 1 and traj_id = "{traj_reduced.traj_id}"') + row = res.fetchone() + if row is not None and row[0] !='None': + return False + + if verbose: + log.info(f' adding jdt {traj_reduced.jdt_ref} to {tblname}') + + # remove the output_dir part from the path so that the data are location-independent + traj_file_path = traj_reduced.traj_file_path[traj_reduced.traj_file_path.find('trajectories'):] + + # and remove windows-style path separators + traj_file_path = traj_file_path.replace('\\','/') + + obs_ids = 'None' if not hasattr(traj_reduced, 'obs_ids') or traj_reduced.obs_ids is None else traj_reduced.obs_ids + ign_obs_ids = 'None' if not hasattr(traj_reduced, 'ign_obs_ids') or traj_reduced.ign_obs_ids is None else traj_reduced.ign_obs_ids + + if failed: + # fixup possible bad values + traj_id = 'None' if not hasattr(traj_reduced, 'traj_id') or traj_reduced.traj_id is None else traj_reduced.traj_id + v_init = 0 if traj_reduced.v_init is None else traj_reduced.v_init + radiant_eci_mini = [0,0,0] if traj_reduced.radiant_eci_mini is None else traj_reduced.radiant_eci_mini + state_vect_mini = [0,0,0] if traj_reduced.state_vect_mini is None else traj_reduced.state_vect_mini + + sql_str = (f'insert or replace into failed_trajectories values (' + f"{traj_reduced.jdt_ref}, '{traj_id}', '{traj_file_path}'," + f"'{json.dumps(traj_reduced.participating_stations)}'," + f"'{json.dumps(traj_reduced.ignored_stations)}'," + f"'{json.dumps(radiant_eci_mini)}'," + f"'{json.dumps(state_vect_mini)}'," + f"0,{v_init},{traj_reduced.gravity_factor}," + f"'{json.dumps(obs_ids)}'," + f"'{json.dumps(ign_obs_ids)}',1)") + else: + sql_str = (f'insert or replace into trajectories values (' + f"{traj_reduced.jdt_ref}, '{traj_reduced.traj_id}', '{traj_file_path}'," + 
f"'{json.dumps(traj_reduced.participating_stations)}'," + f"'{json.dumps(traj_reduced.ignored_stations)}'," + f"'{json.dumps(traj_reduced.radiant_eci_mini)}'," + f"'{json.dumps(traj_reduced.state_vect_mini)}'," + f"{traj_reduced.phase_1_only},{traj_reduced.v_init},{traj_reduced.gravity_factor}," + f"{traj_reduced.v0z},{traj_reduced.v_avg}," + f"{traj_reduced.rbeg_jd},{traj_reduced.rend_jd}," + f"{traj_reduced.rbeg_lat},{traj_reduced.rbeg_lon},{traj_reduced.rbeg_ele}," + f"{traj_reduced.rend_lat},{traj_reduced.rend_lon},{traj_reduced.rend_ele}," + f"'{json.dumps(obs_ids)}'," + f"'{json.dumps(ign_obs_ids)}',1)") + + sql_str = sql_str.replace('nan','"NaN"') + try: + self.dbhandle.execute(sql_str) + except Exception as e: + print(e) + print(sql_str) + self.dbhandle.commit() + return True + + def removeTrajectory(self, traj_reduced, failed=False, verbose=False): + """ + Mark a trajectory unsolved + If an entry exists, update the status to 0. + + Parameters: + traj_reduced : a TrajReduced object + failed : boolean, if true then remove from the fails list + """ + if verbose: + log.info(f'removing {traj_reduced.traj_id}') + table_name = 'failed_trajectories' if failed else 'trajectories' + + self.dbhandle.execute(f"update {table_name} set status=0 where jdt_ref='{traj_reduced.jdt_ref}'") + self.dbhandle.commit() + + return True + + def removeTrajectoryById(self, traj_id, failed=False, verbose=False): + """ + Mark a trajectory unsolved + If an entry exists, update the status to 0. 
+ + Parameters: + traj_id : a trajectory ID + failed : boolean, if true then remove from the fails list + """ + if verbose: + log.info(f'removing {traj_id}') + table_name = 'failed_trajectories' if failed else 'trajectories' + + self.dbhandle.execute(f"update {table_name} set status=0 where traj_id='{traj_id}'") + self.dbhandle.commit() + + return True + + + def getTrajectories(self, output_dir, jdt_range, failed=False, verbose=False): + """ + Get a list of trajectories between two julian dates + + Parameters: + output_dir : output_dir specified when invoking CorrelateRMS - will be prepended to the trajectory path + jdt_range : tuple of julian dates to retrieve data between. if the 2nd date is None, retrieve all data to today + failed : boolean - if true, retrieve failed traj rather than successful ones + + Returns: + trajs: json list of traj_reduced objects + """ + + jdt_start, jdt_end = jdt_range + + table_name = 'failed_trajectories' if failed else 'trajectories' + if verbose: + log.info(f'getting trajectories between {jd2Date(jdt_start, dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")} and {jd2Date(jdt_end, dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")}') + + # in both branches 'rows' is a cursor, which is fetched in the loop below + if not jdt_end: + rows = self.dbhandle.execute(f"SELECT * FROM {table_name} WHERE jdt_ref={jdt_start}") + else: + rows = self.dbhandle.execute(f"SELECT * FROM {table_name} WHERE jdt_ref>={jdt_start} and jdt_ref<={jdt_end}") + trajs = [] + for rw in rows.fetchall(): + rw = [np.nan if x == 'NaN' else x for x in rw] + json_dict = {'jdt_ref':rw[0], 'traj_id':rw[1], 'traj_file_path':os.path.join(output_dir, rw[2]), + 'participating_stations': json.loads(rw[3]), + 'ignored_stations': json.loads(rw[4]), + 'radiant_eci_mini': json.loads(rw[5]), + 'state_vect_mini': json.loads(rw[6]), + 'phase_1_only': rw[7], 'v_init': rw[8],'gravity_factor': rw[9], + 'v0z': rw[10], 'v_avg': rw[11], + 'rbeg_jd': rw[12], 'rend_jd': rw[13], + 'rbeg_lat': rw[14], 'rbeg_lon': rw[15], 'rbeg_ele': rw[16], + 'rend_lat': rw[17],
'rend_lon': rw[18], 'rend_ele': rw[19], + 'obs_ids': json.loads(rw[20]), 'ign_obs_ids': json.loads(rw[21]), + } + + trajs.append(json_dict) + return trajs + + def getTrajBasics(self, output_dir, jdt_range, failed=False, verbose=False): + """ + Get a list of minimal trajectory details between two dates + + Parameters: + output_dir : output_dir specified when invoking CorrelateRMS - will be prepended to the trajectory path + jdt_range : tuple of julian dates to retrieve data between. If the first date is None, all records are returned; if only the second is None, only records with jdt_ref equal to the first date are returned + failed : boolean, if true retrieve names of fails, otherwise retrieve successful + + Returns: + trajs: a list of dicts with keys jdt_ref, traj_id, traj_file_path, obs_ids and ign_obs_ids + + """ + + jdt_start, jdt_end = jdt_range + table_name = 'failed_trajectories' if failed else 'trajectories' + if not jdt_start: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} where status=1 order by jdt_ref") + rows = cur.fetchall() + elif not jdt_end: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} WHERE jdt_ref={jdt_start} and status=1 order by jdt_ref") + rows = cur.fetchall() + else: + cur = self.dbhandle.execute(f"SELECT jdt_ref, traj_id, traj_file_path, obs_ids, ign_obs_ids FROM {table_name} WHERE jdt_ref>={jdt_start} and jdt_ref<={jdt_end} and status=1 order by jdt_ref") + rows = cur.fetchall() + trajs = [] + for rw in rows: + trajs.append({'jdt_ref':rw[0], 'traj_id':rw[1], 'traj_file_path':os.path.join(output_dir, rw[2]), + 'obs_ids':json.loads(rw[3]), 'ign_obs_ids':json.loads(rw[4])}) + return trajs + + def archiveTrajDatabase(self, db_path, arch_prefix, archdate_jd): + """ + archive records older than archdate_jd to a database {arch_prefix}_trajectories.db + + Parameters: + db_path : path to the location of the archive database + arch_prefix : prefix to apply - typically of the form yyyymm. Set to None to purge data without archiving.
+ archdate_jd : julian date before which to archive data. Default is 21 days before now + + """ + # if no archdate is set, default to 21 days ago + if archdate_jd is None: + archdate = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + archdate_jd = datetime2JD(archdate) + + log.info(f'{"Archiving" if arch_prefix else "Purging"} trajectories database') + + purge_ok = True + if arch_prefix: + # create the archive database if it doesn't exist + archdb_name = f'{arch_prefix}_trajectories.db' + archdb = TrajectoryDatabase(db_path, archdb_name) + archdb.closeTrajDatabase() + + # attach the arch db, copy the records then delete them + archdb_fullname = os.path.join(db_path, f'{archdb_name}') + cur = self.dbhandle.execute(f"attach database '{archdb_fullname}' as archdb") + for table_name in ['trajectories', 'failed_trajectories']: + try: + # bulk-copy if possible + cur.execute(f'insert or replace into archdb.{table_name} select * from {table_name} where jdt_ref < {archdate_jd}') + except Exception: + log.warning(f'unable to archive {table_name} in trajectories database') + purge_ok = False + + self.dbhandle.commit() + + if purge_ok: + self.purgeTrajDatabase(archdate_jd=archdate_jd) + return + + def purgeTrajDatabase(self, archdate_jd=None): + """ + purge records from before a specified julian date. + + parameters: + archdate_jd: julian date before which to purge.
Default None will purge records more than 21 days old + + """ + if archdate_jd is None: + archdate = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + archdate_jd = datetime2JD(archdate) + + for table_name in ['trajectories', 'failed_trajectories']: + cur = self.dbhandle.execute(f'select count(*) from {table_name} where jdt_ref < {archdate_jd}') + res = cur.fetchone() + count = res[0] if res else 0 + log.info(f' purging {count} records from {table_name}') + self.dbhandle.execute(f'delete from {table_name} where jdt_ref < {archdate_jd}') + self.dbhandle.commit() + return + + + + def copyTrajJsonRecords(self, trajectories, dt_range, failed=True, max_days=14): + """ + Copy trajectories from the old Json database + We generally only copy recent records, since if we ever run for a historic date + it's likely we will want to reanalyse all available data + + Parameters: + + trajectories : dict of trajectories extracted from the old Json DB, keyed by jdt_ref + dt_range : date range to use; at most max_days days ending at dt_range[1] are copied + failed : boolean, default true to move failed traj + max_days : maximum number of days to copy, default 14 + + """ + jd_end = datetime2JD(dt_range[1]) + jd_beg = max(datetime2JD(dt_range[0]), jd_end - max_days) + + log.info(f'moving recent {"" if failed is False else "failed"} trajectories to sqlite - this may take some time....') + log.info(f'trajectory date range {jd2Date(jd_beg, dt_obj=True).isoformat()} to {dt_range[1].isoformat()}') + + keylist = [k for k in trajectories.keys() if float(k) >= jd_beg and float(k) <= jd_end] + i = 0 # just in case there aren't any trajectories to move + for i,jdt_ref in enumerate(keylist): + self.addTrajectory(trajectories[jdt_ref], failed=failed) + i += 1 + if not i % 10000: + self._commitTrajDatabase() + log.info(f'moved {i} {"" if failed is False else "failed"} trajectories') + self._commitTrajDatabase() + log.info(f'done - moved {i} {"" if failed is False else "failed"} trajectories') + + return + + def mergeTrajDatabase(self, source_db_path): + """ +
merge in records from another database, for example from a remote node + + Parameters: + source_db_path : the full name of the source database from which to merge in records + + """ + + if not os.path.isfile(source_db_path): + log.warning(f'source database missing: {source_db_path}') + return + # attach the other db, copy the records then detach it + cur = self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb") + + status = True + for table_name in ['trajectories', 'failed_trajectories']: + try: + # bulk-copy if possible + cur.execute(f'insert or replace into {table_name} select * from sourcedb.{table_name}') + except Exception: + log.warning(f'unable to merge data from {source_db_path}') + status = False + self.dbhandle.commit() + cur.execute("detach database 'sourcedb'") + return status + + +############################################################ + + +class CandidateDatabase(): + """ + A class to handle the sqlite candidates database transparently. + """ + + def __init__(self, db_path:str, db_name='candidates.db', keep=21, verbose=False): + """ + Create a database instance + + Parameters: + db_path : path to the location of the database + db_name : name to use, typically candidates.db + keep : Amount of data to keep. 
Default 21 days + + """ + db_full_name = os.path.join(db_path, f'{db_name}') + if verbose: + log.info(f'opening database {db_full_name}') + con = sqlite3.connect(db_full_name) + con.execute('pragma journal_mode=wal') + res = con.execute("SELECT name FROM sqlite_master WHERE name='candidates'") + if res.fetchone() is None: + if verbose: + log.info('create table candidates') + con.execute("CREATE TABLE candidates(cand_id VARCHAR UNIQUE, ref_dt REAL, obs_ids VARCHAR, status INTEGER)") + con.commit() + self.dbhandle = con + if keep > 0: + keep_dt = datetime.datetime.now().replace(tzinfo=datetime.timezone.utc) - datetime.timedelta(days=keep) + keep_jd = datetime2JD(keep_dt) + self.purgeCandDatabase(archdate_jd=keep_jd) + + def _commitCandDatabase(self): + """ + Commit the db. This function exists so we can do lazy writes + """ + self.dbhandle.commit() + try: + self.dbhandle.execute('pragma wal_checkpoint(TRUNCATE)') + except Exception: + self.dbhandle.execute('pragma wal_checkpoint(PASSIVE)') + return + + def closeCandDatabase(self): + """ + Close database, making sure we commit any pending updates + """ + if self.dbhandle: + self._commitCandDatabase() + self.dbhandle.close() + self.dbhandle = None + return + + def checkAndAddCand(self, cand_id:str, ref_dt:float, obs_ids:list, verbose=False): + """ + Check and add a candidate if its not already there + + Parameters: + cand_id : candidate ID + ref_dt : reference date as a timestamp + obs_ids : list of observation IDs + + Returns: + True if added, False if its already present + """ + + to_be_added = True + cur = self.dbhandle.execute(f"SELECT * FROM candidates WHERE cand_id='{cand_id}' and status=1") + if cur.fetchone() is not None: + to_be_added = False + else: + to_be_added = True + obs_ids_str = json.dumps(list(set(obs_ids))) + self.dbhandle.execute(f"insert into candidates values ('{cand_id}',{ref_dt},'{obs_ids_str}',1)") + self.dbhandle.commit() + if verbose: + log.info(f'{cand_id} {"was added to the database" if 
to_be_added else "already present"}') + return to_be_added + + def getCandidateObs(self, cand_id:str, verbose=False): + """ + retrieve a list of observations linked to a candidate + + Parameters: + cand_id : candidate ID + + Returns: + the observations linked to the candidate + + This function is currently unused + """ + + obs_ids = [] + cur = self.dbhandle.execute(f"SELECT obs_ids FROM candidates WHERE cand_id='{cand_id}' and status=1") + rw = cur.fetchone() + if rw is not None: + obs_ids = json.loads(rw[0]) + if verbose: + log.info(f'{cand_id} contains {obs_ids}') + return obs_ids + + def purgeCandDatabase(self, archdate_jd=None): + """ + purge candidates older than a specified julian date + + Parameters: + archdate_jd : julian date before which to purge. Default None will purge records more than 21 days old + """ + if archdate_jd is None: + keep_dt = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + else: + keep_dt = jd2Date(archdate_jd,dt_obj=True) + + log.info(f'purging candidates older than {keep_dt.isoformat()}') + self.dbhandle.execute(f"delete from candidates where ref_dt < {keep_dt.timestamp()}") + self.dbhandle.commit() + return + + def archiveCandDatabase(self, db_path, arch_prefix, archdate_jd): + """ + archive records older than archdate_jd to a database {arch_prefix}_candidates.db + + Parameters: + db_path : path to the location of the archive database + arch_prefix : prefix to apply - typically of the form yyyymm + archdate_jd : julian date before which to archive data + + """ + + if archdate_jd is None: + keep_dt = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + else: + keep_dt = jd2Date(archdate_jd,dt_obj=True) + + purge_ok = True + if arch_prefix: + # create the archive database if it doesn't exist + # keep=0 so that opening the archive does not purge the old records it is meant to hold + archdb_name = f'{arch_prefix}_candidates.db' + archdb = CandidateDatabase(db_path, archdb_name, keep=0) + archdb.closeCandDatabase() + + # attach the arch db, copy the records then delete them + archdb_fullname = os.path.join(db_path,
f'{archdb_name}') + cur = self.dbhandle.execute(f"attach database '{archdb_fullname}' as archdb") + try: + cur.execute(f'insert or replace into archdb.candidates select * from candidates where ref_dt < {keep_dt.timestamp()}') + except Exception: + log.warning('unable to archive candidate database') + purge_ok = False + + if purge_ok: + self.purgeCandDatabase(archdate_jd=archdate_jd) + + self.dbhandle.commit() + return + + def mergeCandDatabase(self, source_db_path): + """ + merge in records from another candidate database, for example from a remote node + + Parameters: + source_db_path : the full name of the source database from which to merge in records + + """ + + if not os.path.isfile(source_db_path): + log.warning(f'source database missing: {source_db_path}') + return + # attach the other db, copy the records then detach it + cur = self.dbhandle.execute(f"attach database '{source_db_path}' as sourcedb") + + status = True + for table_name in ['candidates']: + try: + # bulk-copy if possible + cur.execute(f'insert or replace into {table_name} select * from sourcedb.{table_name}') + except Exception: + log.warning(f'unable to merge data from {source_db_path}') + status = False + self.dbhandle.commit() + cur.execute("detach database 'sourcedb'") + return status + + +################################################################################## +# dummy classes for use in the above. +# We can't import from CorrelateRMS as this would create a circular reference + + +class DummyTrajReduced(): + """ + a dummy class for handling TrajReduced objects.
+ We can't import CorrelateRMS as that would create a circular dependency + """ + def __init__(self, jdt_ref=None, traj_id=None, traj_file_path=None, json_dict=None): + if json_dict is None: + self.jdt_ref = jdt_ref + self.traj_id = traj_id + self.traj_file_path = traj_file_path + else: + self.__dict__ = json_dict + + +class dummyDatabaseJSON(): + """ + Dummy class to handle the old Json data format + We can't import CorrelateRMS as that would create a circular dependency + """ + def __init__(self, db_dir, dt_range=None): + self.db_file_path = os.path.join(db_dir, 'processed_trajectories.json') + self.paired_obs = {} + self.failed_trajectories = {} + if os.path.exists(self.db_file_path): + self.__dict__ = json.load(open(self.db_file_path)) + + if hasattr(self, 'failed_trajectories'): + # Convert trajectories from JSON to TrajectoryReduced objects + traj_dict = getattr(self, "failed_trajectories") + trajectories_obj_dict = {} + for traj_json in traj_dict: + traj_reduced_tmp = DummyTrajReduced(json_dict=traj_dict[traj_json]) + trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp + setattr(self, "failed_trajectories", trajectories_obj_dict) + + if hasattr(self, 'trajectories'): + # Convert trajectories from JSON to TrajectoryReduced objects + traj_dict = getattr(self, "trajectories") + trajectories_obj_dict = {} + for traj_json in traj_dict: + traj_reduced_tmp = DummyTrajReduced(json_dict=traj_dict[traj_json]) + trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp + setattr(self, "trajectories", trajectories_obj_dict) + + +################################################################################## + + +if __name__ == '__main__': + arg_parser = argparse.ArgumentParser(description="""Automatically compute trajectories from RMS data in the given directory.""", + formatter_class=argparse.RawTextHelpFormatter) + + arg_parser.add_argument('--dir_path', type=str, default=None, help='Path to the directory containing the databases.') + + 
arg_parser.add_argument('--database', type=str, default=None, help='Database to process, either observations or trajectories') + + arg_parser.add_argument('--action', type=str, default=None, help='Action to take on the database') + + arg_parser.add_argument('--stmt', type=str, default=None, help='statement to execute eg "select * from paired_obs"') + + arg_parser.add_argument("--logdir", type=str, default=None, + help="Path to the directory where the log files will be stored. If not given, a logs folder will be created in the database folder") + + arg_parser.add_argument('-r', '--timerange', metavar='TIME_RANGE', + help="""Apply action to this date range in the format: "(YYYYMMDD-HHMMSS,YYYYMMDD-HHMMSS)".""", type=str) + + cml_args = arg_parser.parse_args() + # Find the log directory + log_dir = cml_args.logdir + if log_dir is None: + log_dir = os.path.join(cml_args.dir_path, 'logs') + os.makedirs(log_dir, exist_ok=True) + log.setLevel(logging.DEBUG) + + # Init the log formatter + log_formatter = logging.Formatter( + fmt='%(asctime)s-%(levelname)-5s-%(module)-15s:%(lineno)-5d- %(message)s', + datefmt='%Y/%m/%d %H:%M:%S') + + # Init the file handler + timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") + log_file = os.path.join(log_dir, f"correlate_db_{timestamp}.log") + file_handler = logging.handlers.TimedRotatingFileHandler(log_file, when="midnight", backupCount=7) + file_handler.setFormatter(log_formatter) + log.addHandler(file_handler) + + # Init the console handler (i.e. 
print to console) + console_handler = logging.StreamHandler() + console_handler.setFormatter(log_formatter) + log.addHandler(console_handler) + + if cml_args.database: + dbname = cml_args.database.lower() + action = cml_args.action.lower() + + stmt = cml_args.stmt + + dt_range = None + if cml_args.timerange is not None: + time_beg, time_end = cml_args.timerange.strip("(").strip(")").split(",") + dt_beg = datetime.datetime.strptime(time_beg, "%Y%m%d-%H%M%S").replace(tzinfo=datetime.timezone.utc) + dt_end = datetime.datetime.strptime(time_end, "%Y%m%d-%H%M%S").replace(tzinfo=datetime.timezone.utc) + log.info("Custom time range:") + log.info(" BEG: {:s}".format(str(dt_beg))) + log.info(" END: {:s}".format(str(dt_end))) + dt_range = [dt_beg, dt_end] + + + if action == 'copy': + if dt_range is None: + log.info('Date range must be provided for copy operation') + else: + dt_range_jd = [datetime2JD(dt_range[0]),datetime2JD(dt_range[1])] + jsondb = dummyDatabaseJSON(db_dir=cml_args.dir_path) + obsdb = ObservationsDatabase(cml_args.dir_path) + obsdb.copyObsJsonRecords(jsondb.paired_obs, dt_range) + obsdb.closeObsDatabase() + trajdb = TrajectoryDatabase(cml_args.dir_path) + trajdb.copyTrajJsonRecords(jsondb.failed_trajectories, dt_range, failed=True) + trajdb.copyTrajJsonRecords(jsondb.trajectories, dt_range, failed=False) + trajdb.closeTrajDatabase() + else: + if dbname == 'observations': + obsdb = ObservationsDatabase(cml_args.dir_path) + if action == 'status': + cur = obsdb.dbhandle.execute('select * from paired_obs where status=1') + print(f'there are {len(cur.fetchall())} paired obs') + cur = obsdb.dbhandle.execute('select * from paired_obs where status=0') + print(f'and {len(cur.fetchall())} unpaired obs') + if action == 'execute': + print(stmt) + cur = obsdb.dbhandle.execute(stmt) + for rw in cur.fetchall(): + print(rw) + obsdb.closeObsDatabase() + + elif dbname == 'trajectories': + trajdb = TrajectoryDatabase(cml_args.dir_path) + if action == 'status': + cur = 
trajdb.dbhandle.execute('select * from trajectories where status=1') + print(f'there are {len(cur.fetchall())} successful trajectories') + cur = trajdb.dbhandle.execute('select * from failed_trajectories') + print(f'and {len(cur.fetchall())} failed trajectories') + if action == 'execute': + print(stmt) + cur = trajdb.dbhandle.execute(stmt) + for rw in cur.fetchall(): + print(rw) + trajdb.closeTrajDatabase() + else: + log.info('valid database not specified') diff --git a/wmpl/Trajectory/CorrelateEngine.py b/wmpl/Trajectory/CorrelateEngine.py index 52ff61f1..3c3a482c 100644 --- a/wmpl/Trajectory/CorrelateEngine.py +++ b/wmpl/Trajectory/CorrelateEngine.py @@ -8,7 +8,6 @@ import multiprocessing import logging import os - import numpy as np from wmpl.Trajectory.Trajectory import ObservedPoints, PlaneIntersection, Trajectory, moveStateVector @@ -19,11 +18,34 @@ from wmpl.Utils.TrajConversions import J2000_JD, geo2Cartesian, cartesian2Geo, raDec2AltAz, altAz2RADec, \ raDec2ECI, datetime2JD, jd2Date, equatorialCoordPrecession_vect +MCMODE_NONE = 0 +MCMODE_PHASE1 = 1 +MCMODE_PHASE2 = 2 +MCMODE_CANDS = 4 +MCMODE_SIMPLE = MCMODE_CANDS + MCMODE_PHASE1 +MCMODE_BOTH = MCMODE_PHASE1 + MCMODE_PHASE2 +MCMODE_ALL = MCMODE_CANDS + MCMODE_PHASE1 + MCMODE_PHASE2 + # Grab the logger from the main thread log = logging.getLogger("traj_correlator") +def getMcModeStr(mcmode, strtype=0): + modestrs = {4:'cands', 1:'simple', 2:'mcphase', 5:'candsimple', 3:'simplemc',7:'full',0:'full'} + fullmodestrs = {4:'CANDIDATE STAGE', 1:'SIMPLE STAGE', 2:'MONTE CARLO STAGE', 7:'FULL',0:'FULL'} + if strtype == 0: + if mcmode in fullmodestrs.keys(): + return fullmodestrs[mcmode] + else: + return 'MIXED' + else: + if mcmode in modestrs.keys(): + return modestrs[mcmode] + else: + return False + + def pickBestStations(obslist, max_stns): """ Find the stations with the best statistics @@ -239,6 +261,8 @@ def __init__(self, data_handle, traj_constraints, v_init_part, data_in_j2000=Tru # enable OS style ground 
maps if true self.enableOSM = enableOSM + self.candidatemode = None + def trajectoryRangeCheck(self, traj_reduced, platepar): """ Check that the trajectory is within the range limits. @@ -423,7 +447,9 @@ def checkFOVOverlap(self, rp, tp): def initObservationsObject(self, met, pp, ref_dt=None): - """ Init the observations object which will be fed into the trajectory solver. """ + """ + Init the observations object which will be fed into the trajectory solver. + """ # If the reference datetime is given, apply a time offset if ref_dt is not None: @@ -471,12 +497,18 @@ def initObservationsObject(self, met, pp, ref_dt=None): np.radians(pp.lon), pp.elev, meastype=1, station_id=pp.station_code, magnitudes=mag_data, ignore_list=ignore_list, fov_beg=met.fov_beg, fov_end=met.fov_end, comment=comment) + # we seem to have two variables for observation id - need to tidy up this! + obs.id = met.id if hasattr(met, 'id') else None + obs.obs_id = obs.id + return obs def projectPointToTrajectory(self, indx, obs, plane_intersection): - """ Compute lat, lon and height of given point on the meteor trajectory. """ + """ + Compute lat, lon and height of given point on the meteor trajectory. + """ meas_vector = obs.meas_eci[indx] jd = obs.JD_data[indx] @@ -493,7 +525,9 @@ def projectPointToTrajectory(self, indx, obs, plane_intersection): def quickTrajectorySolution(self, obs1, obs2): - """ Perform an intersecting planes solution and check if it satisfies specified sanity checks. """ + """ + Perform an intersecting planes solution and check if it satisfies specified sanity checks. 
+ """ # Do the plane intersection solution plane_intersection = PlaneIntersection(obs1, obs2) @@ -535,8 +569,8 @@ def quickTrajectorySolution(self, obs1, obs2): or (ht2_end < self.traj_constraints.min_end_ht): log.info("Meteor heights outside allowed range!") - log.info("H1_beg: {:.2f}, H1_end: {:.2f}".format(ht1_beg, ht1_end)) - log.info("H2_beg: {:.2f}, H2_end: {:.2f}".format(ht2_beg, ht2_end)) + log.info(" H1_beg: {:.2f}, H1_end: {:.2f}".format(ht1_beg, ht1_end)) + log.info(" H2_beg: {:.2f}, H2_end: {:.2f}".format(ht2_beg, ht2_end)) return None @@ -601,7 +635,7 @@ def initTrajectory(self, jdt_ref, mc_runs, verbose=False): return traj - def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=None): + def solveTrajectory(self, traj, mc_runs, mcmode=MCMODE_ALL, matched_obs=None, orig_traj=None, verbose=False): """ Given an initialized Trajectory object with observation, run the solver and automatically reject bad observations. @@ -630,23 +664,23 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # make a note of how many observations are already marked ignored. initial_ignore_count = len([obs for obs in traj.observations if obs.ignore_station]) log.info(f'initially ignoring {initial_ignore_count} stations...') + successful_traj_fit = False - # run the first phase of the solver if mcmode is 0 or 1 - if mcmode < 2: + # run the first phase of the solver if mcmode is MCMODE_PHASE1 + if mcmode & MCMODE_PHASE1: # Disable Monte Carlo runs until an initial stable set of observations is found traj.monte_carlo = False # Run the solver try: traj_status = traj.run() - # If solving has failed, stop solving the trajectory except ValueError as e: log.info("Error during trajectory estimation!") print(e) + # TODO do we need to add the trajectory to the failed traj database here? return False - # Reject bad observations until a stable set is found, but only if there are more than 2 # stations. 
Only one station will be rejected at one point in time successful_traj_fit = False @@ -678,7 +712,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Skip this part if there are less than 3 stations - if len(traj.observations) < 3: + active_obs = [obstmp for obstmp in traj.observations if not obstmp.ignore_station] + if len(active_obs) < 3: break @@ -707,7 +742,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N max_rejections_possible = int(np.ceil(0.5*len(traj_status.observations))) + initial_ignore_count log.info(f'max stations allowed to be rejected is {max_rejections_possible}') for i, obs in enumerate(traj_status.observations): - + if obs.ignore_station: + continue # Compute the median angular uncertainty of all other non-ignored stations ang_res_list = [obstmp.ang_res_std for j, obstmp in enumerate(traj_status.observations) if (i != j) and not obstmp.ignore_station] @@ -718,10 +754,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N ang_res_median = np.median(ang_res_list) - # ### DEBUG PRINT - # print(obs.station_id, 'ang res:', np.degrees(obs.ang_res_std)*3600, \ - # np.degrees(ang_res_median)*3600) - # Check if the current observations is larger than the minimum limit, and # outside the median limit or larger than the maximum limit if (obs.ang_res_std > np.radians(self.traj_constraints.min_arcsec_err/3600)) \ @@ -795,19 +827,26 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Init a new trajectory object (make sure to use the new reference Julian date) - traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=False) + traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=verbose) # Disable Monte Carlo runs until an initial stable set of observations is found traj.monte_carlo = False - # Reinitialize the observations, rejecting the ignored stations + # Reinitialize the observations. 
Note we *include* the ignored obs as they're internally marked ignored + # and so will be skipped, but to avoid confusion in the logs we only print the names of the non-ignored ones for obs in traj_status.observations: + traj.infillWithObs(obs) if not obs.ignore_station: - log.info(f'Adding {obs.station_id}') - traj.infillWithObs(obs) + log.info(f'Adding {obs.obs_id}') log.info("") - log.info(f'Rerunning the trajectory solution with {len(traj.observations)} stations...') + active_stns = len([obs for obs in traj.observations if not obs.ignore_station]) + if active_stns < 2: + log.info(f"Only {active_stns} stations left - trajectory estimation failed!") + skip_trajectory = True + break + + log.info(f'Rerunning the trajectory solution with {active_stns} stations...') # Re-run the trajectory solution try: traj_status = traj.run() @@ -816,7 +855,8 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N except ValueError as e: log.info("Error during trajectory estimation!") print(e) - return False + skip_trajectory = True + break # If the trajectory estimation failed, skip this trajectory @@ -835,21 +875,17 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Skip the trajectory if no good solution was found if skip_trajectory: - # Add the trajectory to the list of failed trajectories - self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref) - log.info("Trajectory skipped and added to fails!") + ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True) + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True] + self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref, verbose=verbose) return False - # # If the trajectory solutions was not done at any point, skip the trajectory completely - # 
if traj_best is None: - # return False - - # # Otherwise, use the best trajectory solution until the solving failed - # else: - # log.info("Using previously estimated best trajectory...") - # traj_status = traj_best - + # restore the obs ids + traj_status.obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is False] + traj_status.ign_obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is True] # If there are only two stations, make sure to reject solutions which have stations with # residuals higher than the maximum limit @@ -857,11 +893,13 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N if np.any([(obstmp.ang_res_std > np.radians(self.traj_constraints.max_arcsec_err/3600)) for obstmp in traj_status.observations]): + ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True) + traj_status.obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is False] + traj_status.ign_obs_ids = [obs.obs_id for obs in traj_status.observations if obs.ignore_station is True] log.info("2 station only solution, one station has an error above the maximum limit, skipping!") - # Add the trajectory to the list of failed trajectories - self.dh.addTrajectory(traj_status, failed_jdt_ref=jdt_ref) - + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + self.dh.addTrajectory(traj_status, failed_jdt_ref=jdt_ref, verbose=verbose) return False @@ -869,7 +907,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N traj = traj_status # if we're only doing the simple solution, then print the results - if mcmode == 1: + if mcmode == MCMODE_PHASE1: # Only proceed if the orbit could be computed if traj.orbit.ra_g is not None: # Update trajectory file name @@ -885,18 +923,16 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N else: shower_code = shower_obj.IAU_code 
log.info("Shower: {:s}".format(shower_code)) + + if mcmode & MCMODE_PHASE1: successful_traj_fit = True log.info('finished initial solution') ##### end of simple soln phase ##### now run the Monte-carlo phase, if the mcmode is 0 (do both) or 2 (mc-only) - if mcmode == 0 or mcmode == 2: - if mcmode == 2: - traj_status = traj + if mcmode & MCMODE_PHASE2: + traj_status = traj - # save the traj in case we need to clean it up - save_traj = traj - # Only proceed if the orbit could be computed if traj.orbit.ra_g is not None: @@ -905,7 +941,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N log.info("Stable set of observations found, computing uncertainties using Monte Carlo...") # Init a new trajectory object (make sure to use the new reference Julian date) - traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=False) + traj = self.initTrajectory(traj_status.jdt_ref, mc_runs, verbose=verbose) # Enable Monte Carlo traj.monte_carlo = True @@ -918,7 +954,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Don't do this in mc-only mode since phase1 has already selected the stations and we could # create duplicate orbits if we now exclude some stations from the solution # TODO should we do this here *at all* ? 
- if len(non_ignored_observations) > self.traj_constraints.max_stations and mcmode != 2: + if len(non_ignored_observations) > self.traj_constraints.max_stations and mcmode != MCMODE_PHASE2: # Sort the observations by residuals (smallest first) # TODO: implement better sorting algorithm @@ -940,6 +976,7 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Reinitialize the observations, rejecting ignored stations for obs in obs_selected: if not obs.ignore_station: + #log.info(f'adding obs_id {obs.obs_id}') traj.infillWithObs(obs) @@ -951,7 +988,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N except ValueError as e: log.info("Error during trajectory estimation!") print(e) - self.dh.cleanupPhase2TempPickle(save_traj) return False @@ -959,10 +995,12 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N if traj_status is None: # Add the trajectory to the list of failed trajectories - if mcmode != 2: - self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref) - log.info('Trajectory failed to solve') - self.dh.cleanupPhase2TempPickle(save_traj) + if mcmode != MCMODE_PHASE2: + traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True] + self.dh.addTrajectory(traj, failed_jdt_ref=jdt_ref, verbose=verbose) + ref_dt = jd2Date(min([met_obs.jdt_ref for met_obs in traj.observations]), dt_obj=True) + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") return False @@ -975,7 +1013,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N log.info("Average velocity outside range: {:.1f} < {:.1f} < {:.1f} km/s, skipping...".format(self.traj_constraints.v_avg_min, traj.orbit.v_avg/1000, self.traj_constraints.v_avg_max)) - self.dh.cleanupPhase2TempPickle(save_traj) return False @@ -983,14 +1020,12 @@ def 
solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N for obs in traj.observations: if (obs.rbeg_ele is None) and (not obs.ignore_station): log.info("Heights from observations failed to be estimated!") - self.dh.cleanupPhase2TempPickle(save_traj) return False # Check that the orbit could be computed if traj.orbit.ra_g is None: log.info("The orbit could not be computed!") - self.dh.cleanupPhase2TempPickle(save_traj) return False # Set the trajectory fit as successful @@ -1015,7 +1050,6 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N else: log.info("The orbit could not be computed!") - self.dh.cleanupPhase2TempPickle(save_traj) return False @@ -1023,77 +1057,204 @@ def solveTrajectory(self, traj, mc_runs, mcmode=0, matched_obs=None, orig_traj=N # Save the trajectory if successful. if successful_traj_fit: # restore the original traj_id so that the phase1 and phase 2 results use the same ID - if mcmode == 2: + if mcmode == MCMODE_PHASE2: traj.traj_id = saved_traj_id traj.phase_1_only = False - if mcmode == 1: + if mcmode == MCMODE_PHASE1: traj.phase_1_only = True if orig_traj: log.info(f"Removing the previous solution {os.path.dirname(orig_traj.traj_file_path)} ...") - self.dh.removeTrajectory(orig_traj) + manage_phase1 = True if abs(round((traj.jdt_ref-orig_traj.jdt_ref)*86400000,0)) > 0 else False + orig_traj.pre_mc_longname = os.path.split(self.dh.generateTrajOutputDirectoryPath(orig_traj, make_dirs=False))[-1] + self.dh.removeTrajectory(orig_traj, remove_phase1=manage_phase1) + traj.pre_mc_longname = os.path.split(self.dh.generateTrajOutputDirectoryPath(orig_traj, make_dirs=False))[-1] + # if we are in MCMODE Phase2, we do not want to save a new copy of the Phase1 file + # even if the trajectory has a slightly different ref_dt + if mcmode == MCMODE_PHASE2: + manage_phase1 = False + else: + manage_phase1 = False + log.info('Saving trajectory....') - self.dh.saveTrajectoryResults(traj, 
self.traj_constraints.save_plots) - if mcmode != 2: - # we do not need to update the database for phase2 - log.info('Updating database....') - self.dh.addTrajectory(traj) + self.dh.saveTrajectoryResults(traj, self.traj_constraints.save_plots, save_phase1=manage_phase1) - # Mark observations as paired in a trajectory if fit successful - if mcmode != 2 and matched_obs is not None: - for _, met_obs_temp, _ in matched_obs: - self.dh.markObservationAsPaired(met_obs_temp) + # we do not need to update the database for phase2 + if mcmode != MCMODE_PHASE2: + log.info('Updating database....') + traj.obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs in traj.observations if obs.ignore_station is True] + self.dh.addTrajectory(traj, verbose=verbose) + if matched_obs is not None: + self.dh.addPairedObs(matched_obs, traj.jdt_ref, verbose=verbose) else: log.info('unable to fit trajectory') return successful_traj_fit + def mergeBrokenCandidates(self, candidate_trajectories): + ### Merge all candidate trajectories which share the same observations ### + log.info("") + log.info("---------------------------") + log.info("3) MERGING BROKEN OBSERVATIONS") + log.info("---------------------------") + log.info(f"Initially {len(candidate_trajectories)} candidates") + merged_candidate_trajectories = [] + merged_indices = [] + total_obs_used = 0 + for i, traj_cand_ref in enumerate(candidate_trajectories): + + # Skip candidate trajectories that have already been merged + if i in merged_indices: + continue + + # Stop the search if the end has been reached + if (i + 1) == len(candidate_trajectories): + merged_candidate_trajectories.append(traj_cand_ref) + total_obs_used += len(traj_cand_ref) + break - def run(self, event_time_range=None, bin_time_range=None, mcmode=0): - """ Run meteor corellation using available data. - Keyword arguments: - event_time_range: [list] A list of two datetime objects. 
These are times between which - events should be used. None by default, which uses all available events. - mcmode: [int] flag to indicate whether or not to run monte-carlos - """ + # Get the mean time of the reference observation + ref_mean_dt = traj_cand_ref[0][1].mean_dt - # a bit of logging to let readers know what we're doing - if mcmode == 2: - mcmodestr = ' - MONTE CARLO STAGE' - elif mcmode == 1: - mcmodestr = ' - SIMPLE STAGE' - else: - mcmodestr = ' ' + obs_list_ref = [entry[1] for entry in traj_cand_ref] + merged_candidate = [] + + # Compute the mean radiant of the reference solution + plane_radiants_ref = [entry[2].radiant_eq for entry in traj_cand_ref] + ra_mean_ref = meanAngle([ra for ra, _ in plane_radiants_ref]) + dec_mean_ref = np.mean([dec for _, dec in plane_radiants_ref]) - if mcmode != 2: - # Get unpaired observations, filter out observations with too little points and sort them by time - unpaired_observations_all = self.dh.getUnpairedObservations() - unpaired_observations_all = [mettmp for mettmp in unpaired_observations_all - if len(mettmp.data) >= self.traj_constraints.min_meas_pts] - unpaired_observations_all = sorted(unpaired_observations_all, key=lambda x: x.reference_dt) - # Remove all observations done prior to 2000, to weed out those with bad time - unpaired_observations_all = [met_obs for met_obs in unpaired_observations_all - if met_obs.reference_dt > datetime.datetime(2000, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)] + # Check for pairs + found_first_pair = False + for j, traj_cand_test in enumerate(candidate_trajectories[(i + 1):]): + # Skip same observations + if traj_cand_ref[0] == traj_cand_test[0]: + continue - # Normalize all reference times and time data so that the reference time is at t = 0 s - for met_obs in unpaired_observations_all: - # Correct the reference time - t_zero = met_obs.data[0].time_rel - met_obs.reference_dt = met_obs.reference_dt + datetime.timedelta(seconds=t_zero) + # Get the mean time of the test 
observation + test_mean_dt = traj_cand_test[0][1].mean_dt + # Make sure the observations that are being compared are within the time window + time_diff = (test_mean_dt - ref_mean_dt).total_seconds() + if abs(time_diff) > self.traj_constraints.max_toffset: + continue + # Break the search if the time went beyond the search. This can be done as observations + # are ordered in time + if time_diff > self.traj_constraints.max_toffset: + break + + + + # Create a list of observations + obs_list_test = [entry[1] for entry in traj_cand_test] + + # Check if there are any common observations between candidate trajectories and merge them + # if that is the case + found_match = False + test_ids = [x.id for x in obs_list_test] + for obs1 in obs_list_ref: + if obs1.id in test_ids: + found_match = True + break + + + # Compute the mean radiant of the test solution + plane_radiants_test = [entry[2].radiant_eq for entry in traj_cand_test] + ra_mean_test = meanAngle([ra for ra, _ in plane_radiants_test]) + dec_mean_test = np.mean([dec for _, dec in plane_radiants_test]) + + # Skip the merging attempt if the estimated radiants are too far off + if np.degrees(angleBetweenSphericalCoords(dec_mean_ref, ra_mean_ref, dec_mean_test, ra_mean_test)) > self.traj_constraints.max_merge_radiant_angle: + continue + + + # Add the candidate trajectory to the common list if a match has been found + if found_match: + + ref_stations = [obs.station_code for obs in obs_list_ref] + + # Add observations that weren't present in the reference candidate + for entry in traj_cand_test: + + # Make sure the added observation is not already added + if entry[1] not in obs_list_ref: + + # Print the reference and the merged radiants + if not found_first_pair: + log.info("") + log.info("------") + log.info("Reference time: {:s}".format(str(ref_mean_dt))) +
log.info("Reference stations: {:s}".format(", ".join(sorted(ref_stations)))) + log.info("Reference radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_ref), np.degrees(dec_mean_ref))) + log.info("") + found_first_pair = True + + log.info("Merging: {:s} {:s}".format(str(entry[1].mean_dt), str(entry[1].station_code))) + traj_cand_ref.append(entry) + + log.info("Merged radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_test), np.degrees(dec_mean_test))) + + # Mark that the current index has been processed + merged_indices.append(i + j + 1) + + # Add the reference candidate observations to the list + merged_candidate += traj_cand_ref + total_obs_used += len(traj_cand_ref) + + # Add the merged observation to the final list + merged_candidate_trajectories.append(merged_candidate) + + log.info(f"After merging, there are {len(merged_candidate_trajectories)} candidates") + return merged_candidate_trajectories, total_obs_used + + + def run(self, event_time_range=None, bin_time_range=None, mcmode=MCMODE_ALL, verbose=False): + """ Run meteor correlation using available data. + + Keyword arguments: + event_time_range: [list] A list of two datetime objects. These are times between which + events should be used. None by default, which uses all available events.
+ mcmode: [int] flag to indicate whether or not to run monte-carlos + """ + + # a bit of logging to let readers know what we're doing + mcmodestr = getMcModeStr(mcmode, strtype=1) + + if mcmode != MCMODE_PHASE2: + if mcmode & MCMODE_CANDS: + # Get unpaired observations, filter out observations with too few points and sort them by time + unpaired_observations_all = self.dh.getUnpairedObservations() + unpaired_observations_all = [mettmp for mettmp in unpaired_observations_all + if len(mettmp.data) >= self.traj_constraints.min_meas_pts] + unpaired_observations_all = sorted(unpaired_observations_all, key=lambda x: x.reference_dt) + + # Remove all observations done prior to 2000, to weed out those with bad time + unpaired_observations_all = [met_obs for met_obs in unpaired_observations_all + if met_obs.reference_dt > datetime.datetime(2000, 1, 1, 0, 0, 0, tzinfo=datetime.timezone.utc)] + + # Normalize all reference times and time data so that the reference time is at t = 0 s + for met_obs in unpaired_observations_all: + + # Correct the reference time + t_zero = met_obs.data[0].time_rel + met_obs.reference_dt = met_obs.reference_dt + datetime.timedelta(seconds=t_zero) + + # Normalize all observation times so that the first time is t = 0 s + for i in range(len(met_obs.data)): + met_obs.data[i].time_rel -= t_zero + else: + event_time_range = self.dh.dt_range # If the time range was given, only use the events in that time range if event_time_range: @@ -1104,11 +1265,17 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # Data will be divided into time bins, so the pairing function doesn't have to go pair many # observations at once and keep all pairs in memory else: - dt_beg = unpaired_observations_all[0].reference_dt - dt_end = unpaired_observations_all[-1].reference_dt + if mcmode & MCMODE_CANDS: + dt_beg = unpaired_observations_all[0].reference_dt + dt_end = unpaired_observations_all[-1].reference_dt + bin_days = 0.25 + else: + dt_beg, dt_end = 
self.dh.dt_range + bin_days = 1 + dt_bin_list = generateDatetimeBins( dt_beg, dt_end, - bin_days=1, utc_hour_break=12, tzinfo=datetime.timezone.utc, reverse=False + bin_days=bin_days, utc_hour_break=12, tzinfo=datetime.timezone.utc, reverse=False ) else: @@ -1126,6 +1293,7 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): log.info("---------------------------------") log.info("") + log.info(f'mcmode is {mcmodestr}') # Go though all time bins and split the list of observations for bin_beg, bin_end in dt_bin_list: @@ -1133,426 +1301,354 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): traj_solved_count = 0 # if we're in MC mode 0 or 1 we have to find the candidate trajectories - if mcmode < 2: - log.info("") - log.info("-----------------------------------") - log.info(" PAIRING TRAJECTORIES IN TIME BIN:") - log.info(" BIN BEG: {:s} UTC".format(str(bin_beg))) - log.info(" BIN END: {:s} UTC".format(str(bin_end))) - log.info("-----------------------------------") - log.info("") - - - # Select observations in the given time bin - unpaired_observations = [met_obs for met_obs in unpaired_observations_all - if (met_obs.reference_dt >= bin_beg) and (met_obs.reference_dt <= bin_end)] - - log.info(f'Analysing {len(unpaired_observations)} observations...') - - ### CHECK FOR PAIRING WITH PREVIOUSLY ESTIMATED TRAJECTORIES ### - - log.info("") - log.info("--------------------------------------------------------------------------") - log.info(" 1) CHECKING IF PREVIOUSLY ESTIMATED TRAJECTORIES HAVE NEW OBSERVATIONS") - log.info("--------------------------------------------------------------------------") - log.info("") - - # Get a list of all already computed trajectories within the given time bin - # Reducted trajectory objects are returned - - if bin_time_range: - # restrict checks to the bin range supplied to run() plus a day to allow for data upload times - log.info(f'Getting computed trajectories for bin {str(bin_time_range[0])} to 
{str(bin_time_range[1])}') - computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_time_range[0]), datetime2JD(bin_time_range[1])+1) - else: - # use the current bin. - log.info(f'Getting computed trajectories for {str(bin_beg)} to {str(bin_end)}') - computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_beg), datetime2JD(bin_end)) - - # Find all unpaired observations that match already existing trajectories - for traj_reduced in computed_traj_list: - - # If the trajectory already has more than the maximum number of stations, skip it - if len(traj_reduced.participating_stations) >= self.traj_constraints.max_stations: - - log.info( - "Trajectory {:s} has already reached the maximum number of stations, " - "skipping...".format( - str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)))) - - # TODO DECIDE WHETHER WE ACTUALLY WANT TO DO THIS - # the problem is that we could end up with unpaired observations that form a new trajectory instead of - # being added to an existing one - continue - - # Get all unprocessed observations which are close in time to the reference trajectory - traj_time_pairs = self.dh.getTrajTimePairs(traj_reduced, unpaired_observations, - self.traj_constraints.max_toffset) - - # Skip trajectory if there are no new obervations - if not traj_time_pairs: - continue - - + if mcmode != MCMODE_PHASE2: + ## we are in candidatemode mode 0 or 1 and want to find candidates + if mcmode & MCMODE_CANDS: + log.info("") + log.info("-----------------------------------") + log.info("0) PAIRING TRAJECTORIES IN TIME BIN:") + log.info(" BIN BEG: {:s} UTC".format(str(bin_beg))) + log.info(" BIN END: {:s} UTC".format(str(bin_end))) + log.info("-----------------------------------") log.info("") - log.info("Checking trajectory at {:s} in countries: {:s}".format( - str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)), - ", ".join(list(set([stat_id[:2] for stat_id in 
traj_reduced.participating_stations]))))) - log.info("--------") - - - # Filter out bad matches and only keep the good ones - candidate_observations = [] - traj_full = None - skip_traj_check = False - for met_obs in traj_time_pairs: - - log.info("Candidate observation: {:s}".format(met_obs.station_code)) - - platepar = self.dh.getPlatepar(met_obs) - - # Check that the trajectory beginning and end are within the distance limit - if not self.trajectoryRangeCheck(traj_reduced, platepar): - continue - - - # Check that the trajectory is within the field of view - if not self.trajectoryInFOV(traj_reduced, platepar): - continue - - - # Load the full trajectory object - if traj_full is None: - traj_full = self.dh.loadFullTraj(traj_reduced) - - # If the full trajectory couldn't be loaded, skip checking this trajectory - if traj_full is None: - - skip_traj_check = True - break - - - ### Do a rough trajectory solution and perform a quick quality control ### - - # Init observation object using the new meteor observation - obs_new = self.initObservationsObject(met_obs, platepar, - ref_dt=jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)) - - - # Get an observation from the trajectory object with the maximum convergence angle to - # the reference observations - obs_traj_best = None - qc_max = 0.0 - for obs_tmp in traj_full.observations: - - # Compute the plane intersection between the new and one of trajectory observations - pi = PlaneIntersection(obs_new, obs_tmp) - - # Take the observation with the maximum convergence angle - if (obs_traj_best is None) or (pi.conv_angle > qc_max): - qc_max = pi.conv_angle - obs_traj_best = obs_tmp - - - # Do a quick trajectory solution and perform sanity checks - plane_intersection = self.quickTrajectorySolution(obs_traj_best, obs_new) - if plane_intersection is None: - continue - - ### ### - - candidate_observations.append([obs_new, met_obs]) - - - # Skip the candidate trajectory if it couldn't be loaded from disk - if 
skip_traj_check: - continue - # If there are any good new observations, add them to the trajectory and re-run the solution - if candidate_observations: + # Select observations in the given time bin + unpaired_observations = [met_obs for met_obs in unpaired_observations_all + if (met_obs.reference_dt >= bin_beg) and (met_obs.reference_dt <= bin_end)] - log.info("Recomputing trajectory with new observations from stations:") + total_unpaired = len(unpaired_observations) + log.info(f'Analysing {total_unpaired} observations in this bucket...') + num_obs_paired = 0 - # Add new observations to the trajectory object - for obs_new, _ in candidate_observations: - log.info(obs_new.station_id) - traj_full.infillWithObs(obs_new) + # List of all candidate trajectories + candidate_trajectories = [] + ### CHECK FOR PAIRING WITH PREVIOUSLY ESTIMATED TRAJECTORIES ### + if total_unpaired > 0: + log.info("") + log.info("--------------------------------------------------------------------------") + log.info(" 1) CHECKING IF PREVIOUSLY ESTIMATED TRAJECTORIES HAVE NEW OBSERVATIONS") + log.info("--------------------------------------------------------------------------") + log.info("") - # Re-run the trajectory fit - # pass in orig_traj here so that it can be deleted from disk if the new solution succeeds - successful_traj_fit = self.solveTrajectory(traj_full, traj_full.mc_runs, mcmode=mcmode, orig_traj=traj_reduced) + # Get a list of all already computed trajectories within the given time bin + # Reducted trajectory objects are returned - # If the new trajectory solution succeeded, remove the now-paired observations - if successful_traj_fit: - - log.info("Remove paired observations from the processing list...") - for _, met_obs_temp in candidate_observations: - self.dh.markObservationAsPaired(met_obs_temp) - unpaired_observations.remove(met_obs_temp) - + if bin_time_range: + # restrict checks to the bin range supplied to run() plus a day to allow for data upload times + 
log.info(f'Getting computed trajectories for bin {str(bin_time_range[0])} to {str(bin_time_range[1])}') + computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_time_range[0]), datetime2JD(bin_time_range[1])+1) else: - log.info("New trajectory solution failed, keeping the old trajectory...") - - ### ### + # use the current bin. + log.info(f'Getting computed trajectories for {str(bin_beg)} to {str(bin_end)}') + computed_traj_list = self.dh.getComputedTrajectories(datetime2JD(bin_beg), datetime2JD(bin_end)) + # Find all unpaired observations that match already existing trajectories + for traj_reduced in computed_traj_list: - log.info("") - log.info("-------------------------------------------------") - log.info(" 2) PAIRING OBSERVATIONS INTO NEW TRAJECTORIES") - log.info("-------------------------------------------------") - log.info("") + # If the trajectory already has more than the maximum number of stations, skip it + if len(traj_reduced.participating_stations) >= self.traj_constraints.max_stations: - # List of all candidate trajectories - candidate_trajectories = [] + log.info( + "Trajectory {:s} has already reached the maximum number of stations, " + "skipping...".format( + str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)))) - # Go through all unpaired and unprocessed meteor observations - for met_obs in unpaired_observations: - - # Skip observations that were processed in the meantime - if met_obs.processed: - continue + # TODO DECIDE WHETHER WE ACTUALLY WANT TO DO THIS + # the problem is that we could end up with unpaired observations that form a new trajectory instead of + # being added to an existing one + continue + + # Get all unprocessed observations which are close in time to the reference trajectory + traj_time_pairs = self.dh.getTrajTimePairs(traj_reduced, unpaired_observations, + self.traj_constraints.max_toffset) - # Get station platepar - reference_platepar = self.dh.getPlatepar(met_obs) - obs1 = 
self.initObservationsObject(met_obs, reference_platepar) + # Skip trajectory if there are no new obervations + if not traj_time_pairs: + continue - # Keep a list of observations which matched the reference observation - matched_observations = [] + log.info("") + log.info("Checking trajectory at {:s} in countries: {:s}".format( + str(jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)), + ", ".join(list(set([stat_id[:2] for stat_id in traj_reduced.participating_stations]))))) + log.info("--------") - # Find all meteors from other stations that are close in time to this meteor - plane_intersection_good = None - time_pairs = self.dh.findTimePairs(met_obs, unpaired_observations, - self.traj_constraints.max_toffset) - for met_pair_candidate in time_pairs: - log.info("") - log.info("Processing pair:") - log.info("{:s} and {:s}".format(met_obs.station_code, met_pair_candidate.station_code)) - log.info("{:s} and {:s}".format(str(met_obs.reference_dt), str(met_pair_candidate.reference_dt))) - log.info("-----------------------") + # Filter out bad matches and only keep the good ones + candidate_observations = [] + traj_full = None + skip_traj_check = False + for met_obs in traj_time_pairs: - ### Check if the stations are close enough and have roughly overlapping fields of view ### + log.info("Candidate observation: {:s}".format(met_obs.station_code)) - # Get candidate station platepar - candidate_platepar = self.dh.getPlatepar(met_pair_candidate) + platepar = self.dh.getPlatepar(met_obs) - # Check if the stations are within range - if not self.stationRangeCheck(reference_platepar, candidate_platepar): - continue + # Check that the trajectory beginning and end are within the distance limit + if not self.trajectoryRangeCheck(traj_reduced, platepar): + continue - # Check the FOV overlap - if not self.checkFOVOverlap(reference_platepar, candidate_platepar): - log.info("Station FOV does not overlap: {:s} and {:s}".format(met_obs.station_code, - 
met_pair_candidate.station_code)) - continue - ### ### + # Check that the trajectory is within the field of view + if not self.trajectoryInFOV(traj_reduced, platepar): + continue + # Load the full trajectory object + if traj_full is None: + traj_full = self.dh.loadFullTraj(traj_reduced) - ### Do a rough trajectory solution and perform a quick quality control ### + # If the full trajectory couldn't be loaded, skip checking this trajectory + if traj_full is None: + + skip_traj_check = True + break - # Init observations - obs2 = self.initObservationsObject(met_pair_candidate, candidate_platepar, - ref_dt=met_obs.reference_dt) - # Do a quick trajectory solution and perform sanity checks - plane_intersection = self.quickTrajectorySolution(obs1, obs2) - if plane_intersection is None: - continue + ### Do a rough trajectory solution and perform a quick quality control ### - else: - plane_intersection_good = plane_intersection + # Init observation object using the new meteor observation + obs_new = self.initObservationsObject(met_obs, platepar, + ref_dt=jd2Date(traj_reduced.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc)) + obs_new.station_code = met_obs.station_code + obs_new.mean_dt = met_obs.mean_dt - ### ### + # Get an observation from the trajectory object with the maximum convergence angle to + # the reference observations + obs_traj_best = None + qc_max = 0.0 + for obs_tmp in traj_full.observations: + + # Compute the plane intersection between the new and one of trajectory observations + pi = PlaneIntersection(obs_new, obs_tmp) - matched_observations.append([obs2, met_pair_candidate, plane_intersection]) + # Take the observation with the maximum convergence angle + if (obs_traj_best is None) or (pi.conv_angle > qc_max): + qc_max = pi.conv_angle + obs_traj_best = obs_tmp + # Do a quick trajectory solution and perform sanity checks + plane_intersection = self.quickTrajectorySolution(obs_traj_best, obs_new) + if plane_intersection is None: + continue - # If there 
are no matched observations, skip it - if len(matched_observations) == 0: + ### ### - if len(time_pairs) > 0: - log.info("") - log.info(" --- NO MATCH ---") + candidate_observations.append([obs_new, met_obs]) - continue - # Skip if there are not good plane intersections - if plane_intersection_good is None: - continue + # Skip the candidate trajectory if it couldn't be loaded from disk + if skip_traj_check: + continue - # Add the first observation to matched observations - matched_observations.append([obs1, met_obs, plane_intersection_good]) + # If there are any good new observations, add them to the trajectory and re-run the solution + if candidate_observations: - # Mark observations as processed - for _, met_obs_temp, _ in matched_observations: - met_obs_temp.processed = True - self.dh.markObservationAsProcessed(met_obs_temp) + log.info("Recomputing trajectory with new observations:") + # Add new observations to the trajectory object + for obs_new, _ in candidate_observations: + log.info(f' {obs_new.obs_id}') + traj_full.infillWithObs(obs_new) - # Store candidate trajectories - log.info("") - log.info(" --- ADDING CANDIDATE ---") - candidate_trajectories.append(matched_observations) + # Re-run the trajectory fit + # pass in orig_traj here so that it can be deleted from disk if the new solution succeeds + # pass the new candidates in so that they can be marked paired if the new soln succeeds + # Note: mcmode must be phase1 here to force a recompute + successful_traj_fit = self.solveTrajectory(traj_full, traj_full.mc_runs, mcmode=MCMODE_PHASE1, + matched_obs=candidate_observations, orig_traj=traj_reduced, verbose=verbose) + + # If the new trajectory solution succeeded, remove the now-paired observations from the in memory list + if successful_traj_fit: + log.info("Remove paired observations from the processing list...") + for _, met_obs_temp in candidate_observations: + unpaired_observations.remove(met_obs_temp) - ### Merge all candidate trajectories which share 
the same observations ### - log.info("") - log.info("---------------------------") - log.info("MERGING BROKEN OBSERVATIONS") - log.info("---------------------------") - merged_candidate_trajectories = [] - merged_indices = [] - for i, traj_cand_ref in enumerate(candidate_trajectories): - - # Skip candidate trajectories that have already been merged - if i in merged_indices: - continue + else: + log.info("New trajectory solution failed, keeping the old trajectory...") - - # Stop the search if the end has been reached - if (i + 1) == len(candidate_trajectories): - merged_candidate_trajectories.append(traj_cand_ref) - break + ### ### - # Get the mean time of the reference observation - ref_mean_dt = traj_cand_ref[0][1].mean_dt + log.info("") + log.info("-------------------------------------------------") + log.info(" 2) PAIRING OBSERVATIONS INTO NEW TRAJECTORIES") + log.info("-------------------------------------------------") + log.info("") - obs_list_ref = [entry[1] for entry in traj_cand_ref] - merged_candidate = [] - # Compute the mean radiant of the reference solution - plane_radiants_ref = [entry[2].radiant_eq for entry in traj_cand_ref] - ra_mean_ref = meanAngle([ra for ra, _ in plane_radiants_ref]) - dec_mean_ref = np.mean([dec for _, dec in plane_radiants_ref]) + # Go through all unpaired and unprocessed meteor observations + for met_obs in unpaired_observations: + # Skip observations that were processed in the meantime + if met_obs.processed: + continue - # Check for pairs - found_first_pair = False - for j, traj_cand_test in enumerate(candidate_trajectories[(i + 1):]): + if self.dh.checkIfObsPaired(met_obs.id, verbose=verbose): + continue - # Skip same observations - if traj_cand_ref[0] == traj_cand_test[0]: - continue + # Get station platepar + reference_platepar = self.dh.getPlatepar(met_obs) + obs1 = self.initObservationsObject(met_obs, reference_platepar) - # Get the mean time of the test observation - test_mean_dt = traj_cand_test[0][1].mean_dt + # 
Keep a list of observations which matched the reference observation + matched_observations = [] - # Make sure the observations that are being compared are within the time window - time_diff = (test_mean_dt - ref_mean_dt).total_seconds() - if abs(time_diff) > self.traj_constraints.max_toffset: - continue + # Find all meteors from other stations that are close in time to this meteor + plane_intersection_good = None + time_pairs = self.dh.findTimePairs(met_obs, unpaired_observations, + self.traj_constraints.max_toffset) + for met_pair_candidate in time_pairs: + log.info("") + log.info("Processing pair:") + log.info("{:s} and {:s}".format(met_obs.station_code, met_pair_candidate.station_code)) + log.info("{:s} and {:s}".format(str(met_obs.reference_dt), str(met_pair_candidate.reference_dt))) + log.info("-----------------------") - # Break the search if the time went beyond the search. This can be done as observations - # are ordered in time - if time_diff > self.traj_constraints.max_toffset: - break + ### Check if the stations are close enough and have roughly overlapping fields of view ### + # Get candidate station platepar + candidate_platepar = self.dh.getPlatepar(met_pair_candidate) + # Check if the stations are within range + if not self.stationRangeCheck(reference_platepar, candidate_platepar): + continue - # Create a list of observations - obs_list_test = [entry[1] for entry in traj_cand_test] + # Check the FOV overlap + if not self.checkFOVOverlap(reference_platepar, candidate_platepar): + log.info("Station FOV does not overlap: {:s} and {:s}".format(met_obs.station_code, + met_pair_candidate.station_code)) + continue - # Check if there any any common observations between candidate trajectories and merge them - # if that is the case - found_match = False - for obs1 in obs_list_ref: - if obs1 in obs_list_test: - found_match = True - break + ### ### - # Compute the mean radiant of the reference solution - plane_radiants_test = [entry[2].radiant_eq for entry in 
traj_cand_test] - ra_mean_test = meanAngle([ra for ra, _ in plane_radiants_test]) - dec_mean_test = np.mean([dec for _, dec in plane_radiants_test]) - # Skip the mergning attempt if the estimated radiants are too far off - if np.degrees(angleBetweenSphericalCoords(dec_mean_ref, ra_mean_ref, dec_mean_test, ra_mean_test)) > self.traj_constraints.max_merge_radiant_angle: + ### Do a rough trajectory solution and perform a quick quality control ### - continue + # Init observations + obs2 = self.initObservationsObject(met_pair_candidate, candidate_platepar, + ref_dt=met_obs.reference_dt) + # Do a quick trajectory solution and perform sanity checks + plane_intersection = self.quickTrajectorySolution(obs1, obs2) + if plane_intersection is None: + continue - # Add the candidate trajectory to the common list if a match has been found - if found_match: + else: + plane_intersection_good = plane_intersection - ref_stations = [obs.station_code for obs in obs_list_ref] + ### ### - # Add observations that weren't present in the reference candidate - for entry in traj_cand_test: + matched_observations.append([obs2, met_pair_candidate, plane_intersection]) - # Make sure the added observation is not from a station that's already added - if entry[1].station_code in ref_stations: - continue - if entry[1] not in obs_list_ref: - # Print the reference and the merged radiants - if not found_first_pair: - log.info("") - log.info("------") - log.info("Reference time: {:s}".format(str(ref_mean_dt))) - log.info("Reference stations: {:s}".format(", ".join(sorted(ref_stations)))) - log.info("Reference radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_ref), np.degrees(dec_mean_ref))) - log.info("") - found_first_pair = True + # If there are no matched observations, skip it + if len(matched_observations) == 0: - log.info("Merging: {:s} {:s}".format(str(entry[1].mean_dt), str(entry[1].station_code))) - traj_cand_ref.append(entry) + if len(time_pairs) > 0: + log.info("") + log.info(" --- 
NO MATCH ---") - log.info("Merged radiant: RA = {:.2f}, Dec = {:.2f}".format(np.degrees(ra_mean_test), np.degrees(dec_mean_test))) + continue - + # Skip if there are not good plane intersections + if plane_intersection_good is None: + continue + # Add the first observation to matched observations + matched_observations.append([obs1, met_obs, plane_intersection_good]) - # Mark that the current index has been processed - merged_indices.append(i + j + 1) + # Mark observations as processed + for _, met_obs_temp, _ in matched_observations: + met_obs_temp.processed = True - # Add the reference candidate observations to the list - merged_candidate += traj_cand_ref + # Store candidate trajectory group + # Note that this will include candidate groups that already failed on previous runs. + # We will exclude these later - we can't do it just yet as if new data has arrived, then + # in the next step, the group might be merged with another group creating a solvable set. + log.info("") + ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations]) + log.info(f" --- ADDING CANDIDATE at {ref_dt.isoformat()} ---") + candidate_trajectories.append(matched_observations) + # Check for mergeable candidate combinations + merged_candidate_trajectories, num_obs_paired = self.mergeBrokenCandidates(candidate_trajectories) + candidate_trajectories = merged_candidate_trajectories - # Add the merged observation to the final list - merged_candidate_trajectories.append(merged_candidate) + log.info("-----------------------") + log.info(f'There are {total_unpaired - num_obs_paired} remaining unpaired observations in this bucket.') + log.info("-----------------------") + # in candidate mode we want to save the candidates to disk + if mcmode == MCMODE_CANDS: + log.info("-----------------------") + if bin_time_range: + log.info(f'5) SAVING {len(candidate_trajectories)} CANDIDATES for {str(bin_time_range[0])} to {str(bin_time_range[1])}') + else: + log.info(f'5) SAVING 
{len(candidate_trajectories)} CANDIDATES for {str(bin_beg)} to {str(bin_end)}') + log.info("-----------------------") + # Save candidates. This will check and skip over already-processed + # combinations + self.dh.saveCandidates(candidate_trajectories, verbose=verbose) - candidate_trajectories = merged_candidate_trajectories + return len(candidate_trajectories) + + else: + log.info("-----------------------") + log.info('5) PROCESSING {} CANDIDATES'.format(len(candidate_trajectories))) + log.info("-----------------------") + # end of 'if mcmode & MCMODE_CANDS' ### ### - else: + else: + # candidatemode is LOAD so load any available candidates for processing + traj_solved_count = 0 + log.info("-----------------------") + log.info('6) LOADING CANDIDATES') + log.info("-----------------------") + candidate_trajectories = self.dh.loadCandidates(verbose=verbose) + + # end of 'self.candidatemode == CANDMODE_LOAD' + # end of 'if mcmode != MCMODE_PHASE2' + else: + # mcmode == MCMODE_PHASE2 so we need to load the phase1 solutions log.info("-----------------------") - log.info('LOADING PHASE1 SOLUTIONS') + log.info('6) LOADING PHASE1 SOLUTIONS') log.info("-----------------------") candidate_trajectories = self.dh.phase1Trajectories - # end of "if mcmode < 2" + # end of "if mcmode == MCMODE_PHASE2" + + # avoid reprocessing candidates that were already processed + num_traj = len(candidate_trajectories) log.info("") log.info("-----------------------") - log.info(f'SOLVING {len(candidate_trajectories)} TRAJECTORIES {mcmodestr}') + log.info(f'7) SOLVING {num_traj} TRAJECTORIES {mcmodestr}') log.info("-----------------------") log.info("") # Go through all candidate trajectories and compute the complete trajectory solution - for matched_observations in candidate_trajectories: + for i, matched_observations in enumerate(candidate_trajectories): log.info("") log.info("-----------------------") - + cand_id = self.dh.getCandidateId(matched_observations) if mcmode==MCMODE_PHASE1 else '' + 
log.info(f'processing {"candidate" if mcmode==MCMODE_PHASE1 else "trajectory"} {cand_id} {i+1}/{num_traj}') # if mcmode is not 2, prepare to calculate the intersecting planes solutions - if mcmode != 2: + if mcmode != MCMODE_PHASE2: # Find unique station counts station_counts = np.unique([entry[1].station_code for entry in matched_observations], return_counts=True) @@ -1609,10 +1705,9 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # Print info about observations which are being solved log.info("") - log.info("Observations:") - for entry in matched_observations: - obs, met_obs, _ = entry - log.info(f'{met_obs.station_code} - {met_obs.mean_dt} - {obs.ignore_station}') + log.info("Observations and ignore flag:") + for obs, _, _ in matched_observations: + log.info(f' {obs.obs_id} - {obs.ignore_station}') @@ -1622,6 +1717,23 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): log.info("Max convergence angle too small: {:.1f} < {:.1f} deg".format(qc_max, self.traj_constraints.min_qc)) + # create a traj object to add to the failed database so we don't try to recompute this one again + ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations]) + jdt_ref = datetime2JD(ref_dt) + + failed_traj = self.initTrajectory(jdt_ref, 0, verbose=verbose) + for obs_temp, met_obs, _ in matched_observations: + failed_traj.infillWithObs(obs_temp) + + failed_traj.obs_ids = [obs_temp.obs_id for obs_temp, _,_ in matched_observations] + + t0 = min([obs.time_data[0] for obs in failed_traj.observations if (not obs.ignore_station) + or (not np.all(obs.ignore_list))]) + if t0 != 0.0: + failed_traj.jdt_ref = failed_traj.jdt_ref + t0/86400.0 + + log.info(f"Trajectory at {ref_dt.isoformat()} skipped and added to fails!") + self.dh.addTrajectory(failed_traj, failed_traj.jdt_ref, verbose=verbose) continue @@ -1649,20 +1761,26 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # Init the solver (use the earliest date as 
the reference) - ref_dt = min([met_obs.reference_dt for _, met_obs, _ in matched_observations]) - jdt_ref = datetime2JD(ref_dt) - traj = self.initTrajectory(jdt_ref, mc_runs, verbose=False) + jdt_ref = min([obs_temp.jdt_ref for obs_temp, _, _ in matched_observations]) + + #log.info(f'ref_dt {jd2Date(jdt_ref, dt_obj=True)}') + traj = self.initTrajectory(jdt_ref, mc_runs, verbose=verbose) # Feed the observations into the trajectory solver for obs_temp, met_obs, _ in matched_observations: # Normalize the observations to the reference Julian date - jdt_ref_curr = datetime2JD(met_obs.reference_dt) + jdt_ref_curr = obs_temp.jdt_ref # datetime2JD(met_obs.reference_dt) obs_temp.time_data += (jdt_ref_curr - jdt_ref)*86400 - + # we have normalised the time data to jdt_ref, now we need to reset jdt_ref for each obs too + obs_temp.jdt_ref = jdt_ref + obs_temp.obs_id = obs_temp.id traj.infillWithObs(obs_temp) + traj.obs_ids = [obs.obs_id for obs, _,_ in matched_observations if obs.ignore_station is False] + traj.ign_obs_ids = [obs.obs_id for obs, _,_ in matched_observations if obs.ignore_station is True] + ### Recompute the reference JD and all times so that the first time starts at 0 ### # Determine the first relative time from reference JD @@ -1671,29 +1789,30 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # If the first time is not 0, normalize times so that the earliest time is 0 if t0 != 0.0: - + #log.info(f'adjusting by {t0}') # Offset all times by t0 for i in range(len(traj.observations)): traj.observations[i].time_data -= t0 - + # log.info(f'obs jdt_ref is {jd2Date(traj.observations[i].jdt_ref, dt_obj=True)}') # Recompute the reference JD to corresponds with t0 traj.jdt_ref = traj.jdt_ref + t0/86400.0 + #log.info(f'ref_dt {jd2Date(traj.jdt_ref, dt_obj=True)}') # If this trajectory already failed to be computed, don't try to recompute it again unless # new observations are added if self.dh.checkTrajIfFailed(traj): log.info("The same trajectory 
already failed to be computed in previous runs!") continue - # pass in matched_observations here so that solveTrajectory can mark them paired if they're used - result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations) + # pass in matched_observations here so that we can mark them paired if they're used + result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, verbose=verbose) traj_solved_count += int(result) - # end of if mcmode != 2 + # end of if mcmode != MCMODE_PHASE2 else: - # mcmode is 2 and so we have a list of trajectories that were solved in phase 1 + # mcmode is MCMODE_PHASE2 and so we have a list of trajectories that were solved in phase 1 # to prepare for monte-carlo solutions traj = matched_observations @@ -1717,18 +1836,18 @@ def run(self, event_time_range=None, bin_time_range=None, mcmode=0): # This will increase the number of MC runs while keeping the processing time the same mc_runs = int(np.ceil(mc_runs/self.traj_constraints.mc_cores)*self.traj_constraints.mc_cores) - # pass in matched_observations here so that solveTrajectory can mark them paired if they're used - result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, orig_traj=traj) + # pass in matched_observations here so that we can mark them unpaired if the solver fails + result = self.solveTrajectory(traj, mc_runs, mcmode=mcmode, matched_obs=matched_observations, orig_traj=traj, verbose=verbose) traj_solved_count += int(result) # end of "for matched_observations in candidate_trajectories" outcomes = [traj_solved_count] - # Finish the correlation run (update the database with new values) - self.dh.saveDatabase() log.info(f'SOLVED {sum(outcomes)} TRAJECTORIES') log.info("") log.info("-----------------") log.info("SOLVING RUN DONE!") log.info("-----------------") + + return sum(outcomes) diff --git a/wmpl/Trajectory/CorrelateRMS.py b/wmpl/Trajectory/CorrelateRMS.py index 
88c11292..20444d56 100644 --- a/wmpl/Trajectory/CorrelateRMS.py +++ b/wmpl/Trajectory/CorrelateRMS.py @@ -20,14 +20,20 @@ import pandas as pd from dateutil.relativedelta import relativedelta import numpy as np +import sys +import secrets from wmpl.Formats.CAMS import loadFTPDetectInfo -from wmpl.Trajectory.CorrelateEngine import TrajectoryCorrelator, TrajectoryConstraints +from wmpl.Trajectory.CorrelateEngine import TrajectoryCorrelator, TrajectoryConstraints, getMcModeStr from wmpl.Utils.Math import generateDatetimeBins from wmpl.Utils.OSTools import mkdirP from wmpl.Utils.Pickling import loadPickle, savePickle from wmpl.Utils.TrajConversions import datetime2JD, jd2Date -from wmpl.Utils.remoteDataHandling import collectRemoteTrajectories, moveRemoteTrajectories, uploadTrajToRemote +from wmpl.Utils.remoteDataHandling import RemoteDataHandler +from wmpl.Trajectory.CorrelateDB import ObservationsDatabase, TrajectoryDatabase, CandidateDatabase +# from wmpl.Trajectory.Trajectory import Trajectory + +from wmpl.Trajectory.CorrelateEngine import MCMODE_CANDS, MCMODE_PHASE1, MCMODE_PHASE2, MCMODE_ALL, MCMODE_BOTH ### CONSTANTS ### @@ -77,6 +83,10 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): except FileNotFoundError: log.info("Pickle file not found: " + traj_file_path) return None + + # Catch only ordinary errors so that KeyboardInterrupt and SystemExit still propagate + except Exception: + log.info("Pickle file could not be loaded: " + traj_file_path) + return None else: @@ -84,7 +94,6 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): traj = traj_obj self.traj_file_path = os.path.join(traj.output_dir, traj.file_name + "_trajectory.pickle") - # Reference Julian date (beginning of the meteor) self.jdt_ref = traj.jdt_ref @@ -138,21 +147,25 @@ def __init__(self, traj_file_path, json_dict=None, traj_obj=None): if hasattr(traj, 'traj_id'): self.traj_id = traj.traj_id + self.obs_ids = None + if hasattr(traj, 'obs_ids'): + self.obs_ids = traj.obs_ids + self.ign_obs_ids = None + if hasattr(traj, 'ign_obs_ids'): + self.ign_obs_ids = 
traj.ign_obs_ids + # Load values from a dictionary else: + # json_dict is a plain dict, so test for the key rather than for an attribute + if 'obs_ids' not in json_dict: + json_dict['obs_ids'] = None self.__dict__ = json_dict - class DatabaseJSON(object): def __init__(self, db_file_path, verbose=False): self.db_file_path = db_file_path - # List of processed directories (keys are station codes, values are relative paths to night - # directories) - self.processed_dirs = {} - # List of paired observations as a part of a trajectory (keys are station codes, values are unique # observation IDs) self.paired_obs = {} @@ -168,7 +181,6 @@ def __init__(self, db_file_path, verbose=False): # Load the database from a JSON file self.load(verbose=verbose) - def load(self, verbose=False): """ Load the database from a JSON file. """ @@ -202,14 +214,14 @@ def load(self, verbose=False): # Overwrite the database path with the saved one self.db_file_path = db_file_path_saved - if db_is_ok: + # if the trajectories attribute is not present, then the database has been converted to sqlite + if db_is_ok and hasattr(self, 'trajectories'): # Convert trajectories from JSON to TrajectoryReduced objects for traj_dict_str in ["trajectories", "failed_trajectories"]: traj_dict = getattr(self, traj_dict_str) trajectories_obj_dict = {} for traj_json in traj_dict: traj_reduced_tmp = TrajectoryReduced(None, json_dict=traj_dict[traj_json]) - trajectories_obj_dict[traj_reduced_tmp.jdt_ref] = traj_reduced_tmp # Set the trajectory dictionary @@ -219,159 +231,6 @@ def load(self, verbose=False): self.verbose = verbose - def save(self): - """ Save the database of processed meteors to disk. 
""" - - # Back up the existing data base - db_bak_file_path = self.db_file_path + ".bak" - if os.path.exists(self.db_file_path): - shutil.copy2(self.db_file_path, db_bak_file_path) - - # Save the data base - try: - with open(self.db_file_path, 'w') as f: - self2 = copy.deepcopy(self) - - # Convert reduced trajectory objects to JSON objects - self2.trajectories = {key: self.trajectories[key].__dict__ for key in self.trajectories} - self2.failed_trajectories = {key: self.failed_trajectories[key].__dict__ - for key in self.failed_trajectories} - if hasattr(self2, 'phase1Trajectories'): - delattr(self2, 'phase1Trajectories') - - f.write(json.dumps(self2, default=lambda o: o.__dict__, indent=4, sort_keys=True)) - - # Remove the backup file - if os.path.exists(db_bak_file_path): - os.remove(db_bak_file_path) - - except Exception as e: - log.warning('unable to save the database, likely corrupt data') - shutil.copy2(db_bak_file_path, self.db_file_path) - log.warning(e) - - def addProcessedDir(self, station_name, rel_proc_path): - """ Add the processed directory to the list. """ - - if station_name in self.processed_dirs: - if rel_proc_path not in self.processed_dirs[station_name]: - self.processed_dirs[station_name].append(rel_proc_path) - - - def addPairedObservation(self, met_obs): - """ Mark the given meteor observation as paired in a trajectory. """ - - if met_obs.station_code not in self.paired_obs: - self.paired_obs[met_obs.station_code] = [] - - if met_obs.id not in self.paired_obs[met_obs.station_code]: - self.paired_obs[met_obs.station_code].append(met_obs.id) - - - def checkObsIfPaired(self, met_obs): - """ Check if the given observation has been paired to a trajectory or not. 
""" - - if met_obs.station_code in self.paired_obs: - return (met_obs.id in self.paired_obs[met_obs.station_code]) - - else: - return False - - - def checkTrajIfFailed(self, traj): - """ Check if the given trajectory has been computed with the same observations and has failed to be - computed before. - - """ - - # Check if the reference time is in the list of failed trajectories - if traj.jdt_ref in self.failed_trajectories: - - # Get the failed trajectory object - failed_traj = self.failed_trajectories[traj.jdt_ref] - - # Check if the same observations participate in the failed trajectory as in the trajectory that - # is being tested - all_match = True - for obs in traj.observations: - - if not ((obs.station_id in failed_traj.participating_stations) or (obs.station_id in failed_traj.ignored_stations)): - - all_match = False - break - - # If the same stations were used, the trajectory estimation failed before - if all_match: - return True - - - return False - - - def addTrajectory(self, traj_file_path, traj_obj=None, failed=False): - """ Add a computed trajectory to the list. - - Arguments: - traj_file_path: [str] Full path the trajectory object. - - Keyword arguments: - traj_obj: [bool] Instead of loading a traj object from disk, use the given object. - failed: [bool] Add as a failed trajectory. False by default. 
- """ - - # Load the trajectory from disk - if traj_obj is None: - - # Init the reduced trajectory object - traj_reduced = TrajectoryReduced(traj_file_path) - if self.verbose: - log.info(f' loaded {traj_file_path}, traj_id {traj_reduced.traj_id}') - # Skip if failed - if traj_reduced is None: - return None - - if not hasattr(traj_reduced, "jdt_ref"): - return None - - else: - # Use the provided trajectory object - traj_reduced = traj_obj - if self.verbose: - log.info(f' loaded {traj_obj.traj_file_path}, traj_id {traj_reduced.traj_id}') - - - # Choose to which dictionary the trajectory will be added - if failed: - traj_dict = self.failed_trajectories - - else: - traj_dict = self.trajectories - - - # Add the trajectory to the list (key is the reference JD) - if traj_reduced.jdt_ref not in traj_dict: - traj_dict[traj_reduced.jdt_ref] = traj_reduced - else: - traj_dict[traj_reduced.jdt_ref].traj_id = traj_reduced.traj_id - - - - def removeTrajectory(self, traj_reduced, keepFolder=False): - """ Remove the trajectory from the data base and disk. """ - - # Remove the trajectory data base entry - if traj_reduced.jdt_ref in self.trajectories: - del self.trajectories[traj_reduced.jdt_ref] - - # Remove the trajectory folder on the disk - if not keepFolder and os.path.isfile(traj_reduced.traj_file_path): - traj_dir = os.path.dirname(traj_reduced.traj_file_path) - shutil.rmtree(traj_dir, ignore_errors=True) - if os.path.isfile(traj_reduced.traj_file_path): - log.info(f'unable to remove {traj_dir}') - - - class MeteorPointRMS(object): def __init__(self, frame, time_rel, x, y, ra, dec, azim, alt, mag): """ Container for individual meteor picks. 
""" @@ -399,7 +258,6 @@ def __init__(self, frame, time_rel, x, y, ra, dec, azim, alt, mag): self.mag = mag - class MeteorObsRMS(object): def __init__(self, station_code, reference_dt, platepar, data, rel_proc_path, ff_name=None): """ Container for meteor observations with the interface compatible with the trajectory correlator @@ -505,6 +363,7 @@ def __init__(self, station_code, reference_dt, platepar, data, rel_proc_path, ff checksum = int(np.sum([entry.x for entry in self.data]) % 10000) self.id = "{:s}_{:s}_{:04d}".format(self.station_code, self.mean_dt.strftime("%Y%m%d-%H%M%S.%f"), checksum) + self.obs_id = self.id @@ -517,7 +376,8 @@ def __init__(self, **entries): class RMSDataHandle(object): - def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode=0, max_trajs=1000, remotehost=None, verbose=False): + def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode=MCMODE_ALL, max_trajs=1000, + verbose=False, archivemonths=0, auto=False, max_toffset=10): """ Handles data interfacing between the trajectory correlator and RMS data files on disk. Arguments: @@ -530,12 +390,22 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode database file will be loaded from the dir_path. output_dir: [str] Path to the directory where the output files will be saved. None by default, in which case the output files will be saved in the dir_path. + mcmode: [int] the operation mode, candidates, phase1 simple solns, mc phase or a combination max_trajs: [int] maximum number of phase1 trajectories to load at a time when adding uncertainties. Improves throughput. """ self.mc_mode = mcmode + self.auto_mode = auto + + # max diff between observations - used when loading observations to make sure we don't miss any + # towards the end of the time bucket + self.max_toffset = max_toffset + self.dir_path = dir_path + # create the data directory. 
Of course, if the folder doesn't exist there is nothing to process + # but by creating it we avoid an Exception later. And we can always copy data in. + mkdirP(dir_path) self.dt_range = dt_range @@ -559,15 +429,29 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode # Create the output directory if it doesn't exist mkdirP(self.output_dir) - # Phase 1 trajectory pickle directory needed to reload previous results. + if dt_range is None or dt_range[0] == datetime.datetime(2000,1,1,0,0,0).replace(tzinfo=datetime.timezone.utc): + daysback = 21 + else: + daysback = (datetime.datetime.now(datetime.timezone.utc) - dt_range[0]).days + 1 + + # Candidate directory, if running in create or load cands modes + self.candidate_dir = os.path.join(self.output_dir, 'candidates') + mkdirP(os.path.join(self.candidate_dir, 'processed')) + + # Phase 1 trajectory pickle directory needed to reload previous results when running phase2. self.phase1_dir = os.path.join(self.output_dir, 'phase1') + mkdirP(os.path.join(self.phase1_dir, 'processed')) + + # Clear down candidates older than daysback days to save space + num_removed_cands = self.purgeProcessedData(os.path.join(self.candidate_dir, 'processed'), days_back=daysback, verbose=verbose) + log.info(f'removed {num_removed_cands} processed candidates') - # create the directory for phase1 simple trajectories, if needed - if self.mc_mode > 0: - mkdirP(os.path.join(self.phase1_dir, 'processed')) - self.purgePhase1ProcessedData(os.path.join(self.phase1_dir, 'processed')) + # Clear down phase1 older than 2x daysback days to save space + num_removed_ph1 = self.purgeProcessedData(os.path.join(self.phase1_dir, 'processed'), days_back=daysback*2, verbose=verbose) + log.info(f'removed {num_removed_ph1} processed phase1') - self.remotehost = remotehost + # In a previous incarnation, if the solver crashed it could leave some `.pickle_processing` files. 
+ self.cleanupPartialProcessing() self.verbose = verbose @@ -575,40 +459,69 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode # Load database of processed folders database_path = os.path.join(self.db_dir, JSON_DB_NAME) + + # create an empty processing list + self.processing_list = [] + + # maximum number of candidates or trajectories to load in one go. Should improve performance + self.max_trajs = max_trajs + log.info("") - # move any remotely calculated pickles to their target locations - if os.path.isdir(os.path.join(self.output_dir, 'remoteuploads')): - moveRemoteTrajectories(self.output_dir) - - if mcmode != 2: - log.info("Loading database: {:s}".format(database_path)) - self.db = DatabaseJSON(database_path, verbose=self.verbose) - log.info('Archiving older entries....') - try: - self.archiveOldRecords(older_than=3) - except: - pass - log.info(" ... done!") - # Load the list of stations - station_list = self.loadStations() + if mcmode != MCMODE_PHASE2: - # Find unprocessed meteor files - log.info("") - log.info("Finding unprocessed data...") - self.processing_list = self.findUnprocessedFolders(station_list) - log.info(" ... 
done!") + # no need to load the legacy JSON file if we already have the sqlite databases + if not os.path.isfile(os.path.join(db_dir, 'observations.db')) and \ + not os.path.isfile(os.path.join(db_dir, 'trajectories.db')) and \ + os.path.isfile(database_path): + log.info("Loading old JSON database: {:s}".format(database_path)) + self.old_db = DatabaseJSON(database_path, verbose=self.verbose) + else: + self.old_db = None + + self.observations_db = ObservationsDatabase(db_dir) + if hasattr(self.old_db, 'paired_obs'): + # copy any legacy paired obs data into sqlite + self.observations_db.copyObsJsonRecords(self.old_db.paired_obs, dt_range) + + self.trajectory_db = TrajectoryDatabase(db_dir) + if hasattr(self.old_db, 'failed_trajectories'): + # copy any legacy failed traj data into sqlite, so we avoid recomputing them + self.trajectory_db.copyTrajJsonRecords(self.old_db.failed_trajectories, dt_range, failed=True) + + if self.old_db: + del self.old_db + + self.archiveOldRecords(older_than=archivemonths) + + + if mcmode & MCMODE_CANDS: + # Load the list of stations + station_list = self.loadStations() + + # Find unprocessed meteor files + log.info("") + log.info("Finding unprocessed data...") + self.processing_list = self.findUnprocessedFolders(station_list) + log.info(" ... 
done!") + + # in phase 1, initialise and collect data second as we load candidates dynamically + self.initialiseRemoteDataHandling() else: - # retrieve pickles from a remote host, if configured - if self.remotehost is not None: - collectRemoteTrajectories(remotehost, max_trajs, self.phase1_dir) + # in phase 2, initialise and collect data first as we need the phase1 traj on disk already + self.trajectory_db = None + self.observations_db = None + self.initialiseRemoteDataHandling() - # reload the phase1 trajectories - dt_beg, dt_end = self.loadPhase1Trajectories(max_trajs=max_trajs) + dt_beg, dt_end = self.loadPhase1Trajectories() self.processing_list = None self.dt_range=[dt_beg, dt_end] - self.db = None + + self.candidate_db = None + if mcmode == MCMODE_CANDS: + self.candidate_db = CandidateDatabase(db_dir, keep=daysback) + ### Define country groups to speed up the proceessing ### @@ -632,93 +545,118 @@ def __init__(self, dir_path, dt_range=None, db_dir=None, output_dir=None, mcmode ### ### + def checkRemoteDataMode(self): + remote_cfg = os.path.join(self.db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + self.RemoteDatahandler = RemoteDataHandler(remote_cfg) + return self.RemoteDatahandler.mode + else: + return 'none' + + + def initialiseRemoteDataHandling(self): + # Initialise remote data handling, if the config file is present + remote_cfg = os.path.join(self.db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + log.info('remote data management requested, initialising') + self.RemoteDatahandler = RemoteDataHandler(remote_cfg) + if self.RemoteDatahandler.mode == 'child': + self.RemoteDatahandler.clearStopFlag() + status = self.getRemoteData(verbose=True) + else: + status = self.moveUploadedData(verbose=False) + if not status: + log.info('no remote data yet') + else: + self.RemoteDatahandler = None - def purgePhase1ProcessedData(self, dir_path): - """ Purge old phase1 processed data if it is older than 90 days. 
""" - - refdt = time.time() - 90*86400 - result = [] - for path, _, files in os.walk(dir_path): - - for file in files: - - file_path = os.path.join(path, file) - - # Check if the file is older than the reference date - try: - file_dt = os.stat(file_path).st_mtime - except FileNotFoundError: - log.warning(f"File not found: {file_path}") - continue - - if ( - os.path.exists(file_path) and (file_dt < refdt) and os.path.isfile(file_path) - ): - - try: - os.remove(file_path) - result.append(file_path) - - except FileNotFoundError: - log.warning(f"File not found: {file_path}") + def purgeProcessedData(self, dir_path, days_back=14, verbose=False): + """ Purge processed candidate or phase1 data if it is older than a default of 14 days. """ - except Exception as e: - log.error(f"Error removing file {file_path}: {e}") - - return result + refdt = time.time() - days_back*86400 + num_removed = 0 + log.info(f'purging processed data from {dir_path} thats older than {days_back} days') + for file_name in glob.glob(os.path.join(dir_path,'*.pickle')): + try: + file_dt = os.stat(file_name).st_mtime + if file_dt < refdt: + if verbose: + log.info(f'removing {file_name}') + os.remove(file_name) + num_removed += 1 + except FileNotFoundError: + log.warning(f"File disappeared: {file_name}") + continue + except Exception as e: + log.error(f"Error removing file {file_name}: {e}") + + return num_removed + + def cleanupPartialProcessing(self): + log.info('checking for partially-processed phase1 files') + i=0 + for i, file_name in enumerate(glob.glob(os.path.join(self.phase1_dir, '*.pickle_processing'))): + new_name = file_name.replace('_processing','') + if os.path.isfile(new_name): + os.remove(file_name) + else: + os.rename(file_name, new_name) + log.info(f'updated {i} partially-processed files') + return - def archiveOldRecords(self, older_than=3): + def archiveOldRecords(self, older_than=0): """ Archive off old records to keep the database size down Keyword Arguments: - older_than: [int] 
number of months to keep, default 3 + older_than: [int] number of months to keep, default 0 + + if older_than is zero, then purge rather than archiving. To do this, we set arch_prefix to None + and archdate_jd to the earlier of dt_range[0] and 21 days ago. """ - class DummyMetObs(): - def __init__(self, station, obs_id): - self.station_code = station - self.id = obs_id - archdate = datetime.datetime.now(datetime.timezone.utc) - relativedelta(months=older_than) + + if older_than != 0: + archdate = datetime.datetime.now(datetime.timezone.utc) - relativedelta(months=older_than) + if self.dt_range: + archdate = min(archdate, self.dt_range[0]) + arch_prefix = archdate.strftime("%Y%m") + else: + archdate = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=21) + if self.dt_range: + archdate = min(archdate, self.dt_range[0]) + arch_prefix = None archdate_jd = datetime2JD(archdate) - arch_db_path = os.path.join(self.db_dir, f'{archdate.strftime("%Y%m")}_{JSON_DB_NAME}') - archdb = DatabaseJSON(arch_db_path, verbose=self.verbose) - log.info(f'Archiving db records to {arch_db_path}...') - - for traj in [t for t in self.db.trajectories if t < archdate_jd]: - if traj < archdate_jd: - archdb.addTrajectory(None, self.db.trajectories[traj], False) - self.db.removeTrajectory(self.db.trajectories[traj], keepFolder=True) - - for traj in [t for t in self.db.failed_trajectories if t < archdate_jd]: - if traj < archdate_jd: - archdb.addTrajectory(None, self.db.failed_trajectories[traj], True) - self.db.removeTrajectory(self.db.failed_trajectories[traj], keepFolder=True) - - for station in self.db.processed_dirs: - arch_processed = [dirname for dirname in self.db.processed_dirs[station] if - datetime.datetime.strptime(dirname[14:22], '%Y%m%d').replace(tzinfo=datetime.timezone.utc) < archdate] - for dirname in arch_processed: - archdb.addProcessedDir(station, dirname) - self.db.processed_dirs[station].remove(dirname) - - for station in self.db.paired_obs: - 
arch_processed = [obs_id for obs_id in self.db.paired_obs[station] if - datetime.datetime.strptime(obs_id[7:15], '%Y%m%d').replace(tzinfo=datetime.timezone.utc) < archdate] - for obs_id in arch_processed: - archdb.addPairedObservation(DummyMetObs(station, obs_id)) - self.db.paired_obs[station].remove(obs_id) - - archdb.save() - self.db.save() + log.info(f'{"Purging" if older_than == 0 else "Archiving"} database records prior to {archdate.isoformat()}....') + + # purge or archive the Observations and Trajectories + # note: no need to do candidates as we already only keep at most 21 days of these + self.observations_db.archiveObsDatabase(self.db_dir, arch_prefix, archdate_jd) + self.trajectory_db.archiveTrajDatabase(self.db_dir, arch_prefix, archdate_jd) + + log.info(" ... done!") + return + + def closeObservationsDatabase(self): + if self.observations_db: + self.observations_db.closeObsDatabase() + return + + def closeCandidatesDatabase(self): + if self.candidate_db: + self.candidate_db.closeCandDatabase() + + def closeTrajectoryDatabase(self): + if self.trajectory_db: + self.trajectory_db.closeTrajDatabase() return def loadStations(self): """ Load the station names in the processing folder. """ - station_list = [] + avail_station_list = [] for dir_name in sorted(os.listdir(self.dir_path)): @@ -726,31 +664,23 @@ def loadStations(self): if os.path.isdir(os.path.join(self.dir_path, dir_name)): if re.match("^[A-Z]{2}[A-Z0-9]{4}$", dir_name): log.info("Using station: " + dir_name) - station_list.append(dir_name) + avail_station_list.append(dir_name) else: log.info("Skipping directory: " + dir_name) - return station_list - - + return avail_station_list def findUnprocessedFolders(self, station_list): """ Go through directories and find folders with unprocessed data. 
""" processing_list = [] - # skipped_dirs = 0 - # Go through all station directories for station_name in station_list: station_path = os.path.join(self.dir_path, station_name) - # Add the station name to the database if it doesn't exist - if station_name not in self.db.processed_dirs: - self.db.processed_dirs[station_name] = [] - # Go through all directories in stations for night_name in os.listdir(station_path): @@ -770,23 +700,10 @@ def findUnprocessedFolders(self, station_list): night_path = os.path.join(station_path, night_name) night_path_rel = os.path.join(station_name, night_name) - # # If the night path is not in the processed list, add it to the processing list - # if night_path_rel not in self.db.processed_dirs[station_name]: - # processing_list.append([station_name, night_path_rel, night_path, night_dt]) - processing_list.append([station_name, night_path_rel, night_path, night_dt]) - # else: - # skipped_dirs += 1 - - - # if skipped_dirs: - # log.info("Skipped {:d} processed directories".format(skipped_dirs)) - return processing_list - - def initMeteorObs(self, station_code, ftpdetectinfo_path, platepars_recalibrated_dict): """ Init meteor observations from the FTPdetectinfo file and recalibrated platepars. """ @@ -806,8 +723,6 @@ def initMeteorObs(self, station_code, ftpdetectinfo_path, platepars_recalibrated return meteor_list - - def loadUnpairedObservations(self, processing_list, dt_range=None): """ Load unpaired meteor observations, i.e. observations that are not a part of any trajectory. 
""" @@ -815,17 +730,20 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): unpaired_met_obs_list = [] prev_station = None station_count = 1 + for station_code, rel_proc_path, proc_path, night_dt in processing_list: # Check that the night datetime is within the given range of times, if the range is given if (dt_range is not None) and (night_dt is not None): dt_beg, dt_end = dt_range - # Skip all folders which are outside the limits - if (night_dt < dt_beg) or (night_dt > dt_end): + # Skip all folders which are outside the limits + # allow a day before dt_beg to capture data overlapping from an earlier timezone + if (night_dt < dt_beg + datetime.timedelta(days=-1)) or (night_dt > dt_end): continue - + log.info("") + log.info(f"Processing station: {station_code} {rel_proc_path}") ftpdetectinfo_name = None platepar_recalibrated_name = None @@ -834,11 +752,15 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): if os.path.isfile(proc_path): continue - log.info("") - log.info("Processing station: " + station_code) + # Find FTPdetectinfo and platepar files and skip if they're not both present + file_list = os.listdir(proc_path) + joined_file_list = ' '.join(file_list) + + if 'FTPdetectinfo' not in joined_file_list or 'platepars_all_recalibrated.json' not in joined_file_list: + continue - # Find FTPdetectinfo and platepar files - for name in os.listdir(proc_path): + # okay, we at least have the required files, lets try loading them + for name in file_list: # Find FTPdetectinfo if name.startswith("FTPdetectinfo") and name.endswith('.txt') and \ @@ -858,25 +780,37 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): except: pass - + # Skip these observations if no data files were found inside if (ftpdetectinfo_name is None) or (platepar_recalibrated_name is None): - log.info(" Skipping {:s} due to missing data files...".format(rel_proc_path)) - - # Add the folder to the list of processed folders - 
self.db.addProcessedDir(station_code, rel_proc_path) + log.info(f" Skipping {rel_proc_path} due to missing data files...") + continue + if len(platepars_recalibrated_dict) == 0: + #log.info(f" Skipping {rel_proc_path} due to no observations...") continue + # More accurate check that the night datetime is within the given range of times, if the range is given + if (dt_range is not None) and (night_dt is not None): + dt_beg, dt_end = dt_range + + # Find the time of the latest detection in this dataset. + # We add on 10s to the time of the latest detection to allow for the duration of an RMS FF block + + latest_obs = datetime.datetime.strptime(list(platepars_recalibrated_dict.keys())[-1][10:25], '%Y%m%d_%H%M%S') + latest_obs = latest_obs.replace(tzinfo=datetime.timezone.utc) + datetime.timedelta(seconds=10) + + # Filter out any folders whose start-date is after the bucket end date, or whose + # latest observation is before the bucket start date. + + if (latest_obs < dt_beg) or (night_dt > dt_end): + # log.info(f'skipping {rel_proc_path} as no relevant observations') + continue + if station_code != prev_station: station_count += 1 prev_station = station_code - # Save database to mark those with missing data files (only every 250th station, to speed things up) - if (station_count % 250 == 0) and (station_code != prev_station): - self.saveDatabase() - - # Load platepars with open(os.path.join(proc_path, platepar_recalibrated_name)) as f: platepars_recalibrated_dict = json.load(f) @@ -889,6 +823,12 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): added_count = 0 for cams_met_obs in cams_met_obs_list: + obs_dt = jd2Date(cams_met_obs.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc) + + if dt_range and (obs_dt < dt_beg or obs_dt > dt_end): + #log.info(f'skipping {cams_met_obs.ff_name} as outside current bucket') + continue + # Get the platepar if cams_met_obs.ff_name in platepars_recalibrated_dict: pp_dict = 
platepars_recalibrated_dict[cams_met_obs.ff_name] @@ -923,7 +863,7 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): # Init the new meteor observation object met_obs = MeteorObsRMS( station_code, - jd2Date(cams_met_obs.jdt_ref, dt_obj=True, tzinfo=datetime.timezone.utc), + obs_dt, pp, meteor_data, rel_proc_path, @@ -934,11 +874,9 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): continue # Add only unpaired observations - if not self.db.checkObsIfPaired(met_obs): - + if not self.checkIfObsPaired(met_obs.id, verbose=verbose): # print(" ", station_code, met_obs.reference_dt, rel_proc_path) added_count += 1 - unpaired_met_obs_list.append(met_obs) log.info(" Added {:d} observations!".format(added_count)) @@ -946,10 +884,8 @@ def loadUnpairedObservations(self, processing_list, dt_range=None): log.info("") log.info(" Finished loading unpaired observations!") - self.saveDatabase() return unpaired_met_obs_list - def yearMonthDayDirInDtRange(self, dir_name): """ Given a directory name which is either YYYY, YYYYMM or YYYYMMDD, check if it is in the given @@ -971,21 +907,21 @@ def yearMonthDayDirInDtRange(self, dir_name): date_fmt = "%Y" # Check if the directory name starts with a year - if not re.match("^\d{4}", dir_name): # noqa: W605 + if not re.match(r"^\d{4}", dir_name): return False elif len(dir_name) == 6: date_fmt = "%Y%m" # Check if the directory name starts with a year and month - if not re.match("^\d{6}", dir_name): # noqa: W605 + if not re.match(r"^\d{6}", dir_name): return False elif len(dir_name) == 8: date_fmt = "%Y%m%d" # Check if the directory name starts with a year, month and day - if not re.match("^\d{8}", dir_name): # noqa: W605 + if not re.match(r"^\d{8}", dir_name): return False else: @@ -1039,8 +975,7 @@ def yearMonthDayDirInDtRange(self, dir_name): return True else: - return False - + return False def trajectoryFileInDtRange(self, file_name, dt_range=None): """ Check if the trajectory file is in the given 
datetime range. """ @@ -1069,108 +1004,175 @@ def trajectoryFileInDtRange(self, file_name, dt_range=None): else: return False + def updateTrajectoryDatabase(self, dt_range=None, verbose=False): + """ + Update the trajectory database to make sure it's in line with what's on disk, + at the same time checking for and removing any duplicate trajectories. - def removeDeletedTrajectories(self): - """ Purge the database of any trajectories that no longer exist on disk. - These can arise because the monte-carlo stage may update the data. + Arguments: + dt_range: [datetime, datetime] range of dates to load data for """ - if not os.path.isdir(self.output_dir): return - if self.db is None: + if self.trajectory_db is None: return - log.info(" Removing deleted trajectories from: " + self.output_dir) - if self.dt_range is not None: - log.info(" Datetime range: {:s} - {:s}".format( - self.dt_range[0].strftime("%Y-%m-%d %H:%M:%S"), - self.dt_range[1].strftime("%Y-%m-%d %H:%M:%S"))) - - jdt_start = datetime2JD(self.dt_range[0]) - jdt_end = datetime2JD(self.dt_range[1]) - - trajs_to_remove = [] - - keys = [k for k in self.db.trajectories.keys() if k >= jdt_start and k <= jdt_end] - for trajkey in keys: - traj_reduced = self.db.trajectories[trajkey] - # Update the trajectory path to make sure we're working with the correct filesystem - traj_path = self.generateTrajOutputDirectoryPath(traj_reduced) - traj_file_name = os.path.split(traj_reduced.traj_file_path)[1] - traj_path = os.path.join(traj_path, traj_file_name) - - if self.verbose: - log.info(f' testing {traj_path}') + if dt_range is None: + dt_beg, dt_end = self.dt_range + else: + dt_beg, dt_end = dt_range - if not os.path.isfile(traj_path): - traj_reduced.traj_file_path = traj_path - trajs_to_remove.append(traj_reduced) + traj_dir_path = os.path.join(self.output_dir, OUTPUT_TRAJ_DIR) - for traj in trajs_to_remove: - log.info(f' removing deleted {traj.traj_file_path}') + log.info("Updating trajectory database...") + if 
self.dt_range is not None: + log.info(f" Datetime range: {dt_beg.strftime('%Y-%m-%d %H:%M:%S')} - {dt_end.strftime('%Y-%m-%d %H:%M:%S')}") + log.info(f" Removing deleted trajectories no longer in: {traj_dir_path}") + + jdt_range = [datetime2JD(dt_beg), datetime2JD(dt_end)] + + traj_list = self.trajectory_db.getTrajBasics(self.output_dir, jdt_range) + i = 0 + for traj in traj_list: + if not os.path.isfile(os.path.join(self.output_dir, traj['traj_file_path'])): + log.info(f' removing nonexistent traj {jd2Date(traj["jdt_ref"],dt_obj=True).strftime("%Y%m%d_%H%M%S.%f")} {traj["traj_file_path"]} from database') + self.removeTrajectory(TrajectoryReduced(None, json_dict=traj)) + i += 1 + log.info(f' removed {i} deleted trajectories') + + # + # Now look for duplicate trajectories and ones with shared observations. In theory these should not exist + # but it's possible for them to arise because during distributed calculations, candidates may be found before + # the last solver run has completed. + # + # helper function used to find trajectories with at least one observation in common + def atleastOneObs(obs_ids,next_obs_ids): + if obs_ids is None or next_obs_ids is None: + return False + if len(obs_ids)==0 or len(next_obs_ids)==0: + return False + if isinstance(obs_ids[0], int) or isinstance(next_obs_ids[0], int): + return False + return any(i in next_obs_ids for i in obs_ids) - # remove from the database but not from the disk: they're already not on the disk and this avoids - # accidentally deleting a different traj with a timestamp which is within a millisecond - self.db.removeTrajectory(traj, keepFolder=True) + log.info(" Looking for duplicate trajectories...") + # create a dataframe and sort it by date. 
Duplicates will almost always have very similar dates + # refresh the traj_list first to avoid operating on already-removed trajectories - return + traj_list = self.trajectory_db.getTrajBasics(self.output_dir, jdt_range) + traj_df = pd.DataFrame(traj_list) + # remove legacy trajs without obs_ids + if 'obs_ids' in traj_df.columns: + traj_df = traj_df[traj_df.obs_ids != "None"] + if len(traj_df) > 0: + # sort by date + traj_df.sort_values(by='jdt_ref', inplace=True, ignore_index=True) - def loadComputedTrajectories(self, traj_dir_path, dt_range=None): - """ Load already estimated trajectories from disk within a date range. + # During candidate finding, the solver attempts to add new observations to existing trajectories. + # If the target is a legacy trajectory without obs_ids, the solver assigns simple numerical IDs rather than + # true observation ids, which are strings. We can't process these legacy values safely here so exclude them. - Arguments: - traj_dir_path: [str] Full path to a directory with trajectory pickles. 
- """ + traj_df['validrow']=traj_df.apply(lambda row: not isinstance(row.obs_ids[0], int), axis=1) + traj_df = traj_df[traj_df.validrow] - # defend against the case where there are no existing trajectories and traj_dir_path doesn't exist - if not os.path.isdir(traj_dir_path): - return - - if self.db is None: - return - - if dt_range is None: - dt_beg, dt_end = self.dt_range - else: - dt_beg, dt_end = dt_range - - log.info(" Loading trajectories from: " + traj_dir_path) - if self.dt_range is not None: - log.info(" Datetime range: {:s} - {:s}".format( - dt_beg.strftime("%Y-%m-%d %H:%M:%S"), - dt_end.strftime("%Y-%m-%d %H:%M:%S"))) + # Now add a column containing the next trajectory's observations + traj_df['obs_ids_next'] = traj_df.obs_ids.shift(-1) + traj_df['ign_obs_ids_next'] = traj_df.ign_obs_ids.shift(-1) + traj_df['traj_id_next'] = traj_df.traj_id.shift(-1) + traj_df['traj_path_next'] = traj_df.traj_file_path.shift(-1) + # get a list of any trajectories with exactly the same observations. + # Then remove whichever one isn't on disk. 
+ + log.info(' - looking for exact matches by observation id') + same_obs = traj_df.query('obs_ids == obs_ids_next') + for idx, rw in same_obs.iterrows(): + + if not os.path.isfile(rw.traj_file_path): + # trajectory already doesn't exist on disk so remove it from the database + self.trajectory_db.removeTrajectoryById(rw.traj_id) + + else: + # load the trajectory from disk and determine which one it is + traj1 = loadPickle(*os.path.split(rw.traj_file_path)) + + if traj1.traj_id == rw.traj_id: + remove_id = rw.traj_id_next + remove_path = rw.traj_path_next + else: + remove_id = rw.traj_id + remove_path = rw.traj_file_path + log.info(f' - removing duplicate trajectory {remove_id} from database') + self.trajectory_db.removeTrajectoryById(remove_id) + + # only delete the disk file if they're in different physical locations + if rw.traj_file_path != rw.traj_path_next: + log.info(f' - and removing files from {remove_path}') + shutil.rmtree(os.path.split(remove_path)[0]) + + # remove the row from the dataframe to avoid reprocessing it (drop returns a new dataframe) + traj_df = traj_df.drop(idx) + + if not traj_df.empty: + # If we still have some data, look for trajectories which share at least one observation. + # These are candidates for being merged. + # So we keep the trajectory with most observations, and unpair the non-shared ones in the other before + # deleting it. In theory, the next pass will identify the unpaired obs as possible additions to the remaining + # trajectory. At worst it will identify the unpaired obs as a potential new candidate. 
+ tmpdf = traj_df.apply(lambda row: atleastOneObs(row.obs_ids, row.obs_ids_next), axis=1) + if tmpdf.empty or len(tmpdf.shape) > 1: + log.info(' - no mergeable events to analyse') + else: + traj_df['overlapids'] = traj_df.apply(lambda row: atleastOneObs(row.obs_ids, row.obs_ids_next), axis=1) + common_obs = traj_df[traj_df.overlapids] + + for idx, rw in common_obs.iterrows(): + + log.info(f' - checking mergeable events {rw.traj_id} and {rw.traj_id_next}') + traj_ids = rw.obs_ids + rw.ign_obs_ids + next_ids = rw.obs_ids_next + rw.ign_obs_ids_next + if len(traj_ids) >= len(next_ids): + remove_id = rw.traj_id_next + remove_path = rw.traj_path_next + unpair_ids = [id for id in next_ids if id not in traj_ids] + else: + remove_id = rw.traj_id + remove_path = rw.traj_file_path + unpair_ids = [id for id in traj_ids if id not in next_ids] + + log.info(f' - removing {remove_id}') + self.trajectory_db.removeTrajectoryById(remove_id) + + if len(unpair_ids) > 0: + log.info(f' - unpairing {unpair_ids} from {remove_id}') + self.observations_db.unpairObs(unpair_ids) + + # only remove the physical on-disk files if the locations are different! + if (rw.traj_file_path != rw.traj_path_next) and os.path.isfile(remove_path): + log.info(f' - removing {os.path.split(remove_path)[0]} from disk') + shutil.rmtree(os.path.split(remove_path)[0]) + + # Finally, scan the disk for trajectories that need to be added to the database. + # These can arise during distributed processing or phase2 analysis if the jdt_ref changes significantly. + + log.info(" Adding found trajectories from: " + traj_dir_path) counter = 0 # Construct a list of all ddirectory paths to visit. 
The trajectory directories are sorted in # YYYY/YYYYMM/YYYYMMDD, so visit them in that order to check if they are in the datetime range dir_paths = [] - #iterate over the days in the range - jdt_beg = int(np.floor(datetime2JD(dt_beg))) - jdt_end = int(np.ceil(datetime2JD(dt_end))) - - yyyy = 0 - mm = 0 - dd = 0 start_time = datetime.datetime.now() - for jdt in range(jdt_beg, jdt_end + 1): - - curr_dt = jd2Date(jdt, dt_obj=True) - if curr_dt.year != yyyy: - yyyy = curr_dt.year - log.info("- year " + str(yyyy)) - if curr_dt.month != mm: - mm = curr_dt.month - yyyymm = f'{yyyy}{mm:02d}' - log.info(" - month " + str(yyyymm)) + #iterate over the days in the range + dt_diff = max((dt_end - dt_beg).days, 1) + 2 - if curr_dt.day != dd: - dd = curr_dt.day - yyyymmdd = f'{yyyy}{mm:02d}{dd:02d}' - log.info(" - day " + str(yyyymmdd)) + for d in range(dt_diff): + curr_dt = dt_beg + datetime.timedelta(days=d) + yyyy = curr_dt.year + yyyymm = f'{yyyy}{curr_dt.month:02d}' + yyyymmdd = f'{yyyy}{curr_dt.month:02d}{curr_dt.day:02d}' yyyymmdd_dir_path = os.path.join(traj_dir_path, f'{yyyy}', f'{yyyymm}', f'{yyyymmdd}') @@ -1183,109 +1185,40 @@ def loadComputedTrajectories(self, traj_dir_path, dt_range=None): full_traj_dir = os.path.join(yyyymmdd_dir_path, traj_dir) if os.path.isdir(full_traj_dir) and (full_traj_dir not in dir_paths): - for file_name in glob.glob1(full_traj_dir, '*_trajectory.pickle'): + for file_name in glob.glob('*_trajectory.pickle', root_dir=full_traj_dir): if self.trajectoryFileInDtRange(file_name, dt_range=dt_range): - self.db.addTrajectory(os.path.join(full_traj_dir, file_name)) + if self.trajectory_db.addTrajectory(TrajectoryReduced(os.path.join(full_traj_dir, file_name)), force_add=False, verbose=True): + counter += 1 # Print every 1000th trajectory - if counter % 1000 == 0: - log.info(f" Loaded {counter:6d} trajectories, currently on {file_name}") - counter += 1 + if counter % 1000 == 0 and counter > 0: + log.info(f" Added {counter:6d} trajectories") 
dir_paths.append(full_traj_dir) dur = (datetime.datetime.now() - start_time).total_seconds() - log.info(f" Loaded {counter:6d} trajectories in {dur:.0f} seconds") - - + log.info(f" Added {counter:6d} trajectories in {dur:.0f} seconds") def getComputedTrajectories(self, jd_beg, jd_end): """ Returns a list of computed trajectories between the Julian dates. """ - - return [self.db.trajectories[key] for key in self.db.trajectories - if (self.db.trajectories[key].jdt_ref >= jd_beg) - and (self.db.trajectories[key].jdt_ref <= jd_end)] - - - def removeDuplicateTrajectories(self, dt_range): - """ Remove trajectories with duplicate IDs - keeping the one with the most station observations - """ - - log.info('removing duplicate trajectories') - - tr_in_scope = self.getComputedTrajectories(datetime2JD(dt_range[0]), datetime2JD(dt_range[1])) - tr_to_check = [{'jdt_ref':traj.jdt_ref,'traj_id':traj.traj_id, 'traj': traj} for traj in tr_in_scope if hasattr(traj,'traj_id')] - - if len(tr_to_check) == 0: - log.info('no trajectories in range') - return - - tr_df = pd.DataFrame(tr_to_check) - tr_df['dupe']=tr_df.duplicated(subset=['traj_id']) - dupeids = tr_df[tr_df.dupe].sort_values(by=['traj_id']).traj_id - duperows = tr_df[tr_df.traj_id.isin(dupeids)] - - log.info(f'there are {len(duperows.traj_id.unique())} duplicate trajectories') - - # iterate over the duplicates, finding the best and removing the others - for traj_id in duperows.traj_id.unique(): - num_stats = 0 - best_traj_dt = None - best_traj_path = None - # find duplicate with largest number of observations - for testdt in duperows[duperows.traj_id==traj_id].jdt_ref.values: - - if len(dh.db.trajectories[testdt].participating_stations) > num_stats: - - best_traj_dt = testdt - num_stats = len(dh.db.trajectories[testdt].participating_stations) - # sometimes the database contains duplicates that differ by microseconds in jdt. These - # will have overwritten each other in the folder so make a note of the location. 
- best_traj_path = dh.db.trajectories[testdt].traj_file_path - - # now remove all except the best - for testdt in duperows[duperows.traj_id==traj_id].jdt_ref.values: - - traj = dh.db.trajectories[testdt] - if testdt != best_traj_dt: - - # get the current trajectory's location. If its the same as that of the best trajectory - # don't try to delete the solution from disk even if there's a small difference in jdt_ref - keepFolder = False - if traj.traj_file_path == best_traj_path: - keepFolder = True - # Update the trajectory path to make sure we're working with the correct filesystem - traj_path = self.generateTrajOutputDirectoryPath(traj) - traj_file_name = os.path.split(traj.traj_file_path)[1] - traj.traj_file_path = os.path.join(traj_path, traj_file_name) - log.info(f'removing duplicate {traj.traj_id} keep {traj_file_name} {keepFolder}') - - self.db.removeTrajectory(traj, keepFolder=keepFolder) - - else: - if self.verbose: - log.info(f'keeping {traj.traj_id} {traj.traj_file_path}') - - return - + jd_range = [jd_beg, jd_end] + json_dicts = self.trajectory_db.getTrajectories(self.output_dir, jd_range) + trajs = [TrajectoryReduced(None, json_dict=j) for j in json_dicts] + return trajs def getPlatepar(self, met_obs): """ Return the platepar of the meteor observation. """ return met_obs.platepar - - def getUnpairedObservations(self): """ Returns a list of unpaired meteor observations. """ return self.unpaired_observations - def countryFilter(self, station_code1, station_code2): """ Only pair observations if they are in proximity to a given country. 
""" @@ -1300,9 +1233,30 @@ def countryFilter(self, station_code1, station_code2): # If a given country is not in any of the groups, allow it to be paired return True + + def checkIfObsPaired(self, obs_id, verbose=False): + return self.observations_db.checkObsPaired(obs_id, verbose) + + def addPairedObs(self, matched_obs, jdt_ref, verbose=False): + """ + mark a list of observations as paired + + parameters: + matched_obs : a tuple containing the observations. + jdt_ref : the julian date of the Trajectory they are paired with. + + """ + if len(matched_obs[0])==3: + obs_ids = [met_obs.id for _, met_obs, _ in matched_obs] + else: + obs_ids = [met_obs.id for _, met_obs in matched_obs] + jdt_refs = [jdt_ref] * len(obs_ids) + self.observations_db.addPairedObservations(obs_ids, jdt_refs, verbose=verbose) - def findTimePairs(self, met_obs, unpaired_observations, max_toffset): + return + + def findTimePairs(self, met_obs, unpaired_observations, max_toffset, verbose=False): """ Finds pairs in time between the given meteor observations and all other observations from different stations. @@ -1322,6 +1276,9 @@ def findTimePairs(self, met_obs, unpaired_observations, max_toffset): # Go through all meteors from other stations for met_obs2 in unpaired_observations: + if self.checkIfObsPaired(met_obs2.id, verbose=verbose): + continue + # Take only observations from different stations if met_obs.station_code == met_obs2.station_code: continue @@ -1337,7 +1294,6 @@ def findTimePairs(self, met_obs, unpaired_observations, max_toffset): return found_pairs - def getTrajTimePairs(self, traj_reduced, unpaired_observations, max_toffset): """ Find unpaired observations which are close in time to the given trajectory. """ @@ -1366,9 +1322,9 @@ def getTrajTimePairs(self, traj_reduced, unpaired_observations, max_toffset): return found_traj_obs_pairs - def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): - """ Generate a path to the trajectory output directory. 
+ """ + Generate a path to the trajectory output directory. Keyword arguments: make_dirs: [bool] Make the tree of output directories. False by default. @@ -1377,11 +1333,11 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): # Generate a list of station codes if isinstance(traj, TrajectoryReduced): # If the reducted trajectory object is given - station_list = traj.participating_stations + traj_station_list = traj.participating_stations else: # If the full trajectory object is given - station_list = [obs.station_id for obs in traj.observations if obs.ignore_station is False] + traj_station_list = [obs.station_id for obs in traj.observations if obs.ignore_station is False] # Datetime of the reference trajectory time @@ -1399,7 +1355,7 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): # Name of the trajectory directory # sort the list of country codes otherwise we can end up with duplicate trajectories - ctry_list = list(set([stat_id[:2] for stat_id in station_list])) + ctry_list = list(set([stat_id[:2] for stat_id in traj_station_list])) ctry_list.sort() traj_dir = dt.strftime("%Y%m%d_%H%M%S.%f")[:-3] + "_" + "_".join(ctry_list) @@ -1411,9 +1367,15 @@ def generateTrajOutputDirectoryPath(self, traj, make_dirs=False): return out_path + def saveTrajectoryResults(self, traj, save_plots, save_phase1=False, verbose=False): + """ + Save trajectory results to the disk. - def saveTrajectoryResults(self, traj, save_plots): - """ Save trajectory results to the disk. 
""" + Parameters: + traj: [traj] the trajectory to save + save_plots: [bool] true if we also want to generate plots of the data + save_phase1:[bool] true if we also want to save a phase1 copy of the traj + """ # Generate the name for the output directory (add list of country codes at the end) @@ -1427,7 +1389,7 @@ def saveTrajectoryResults(self, traj, save_plots): # if additional observations are found then the refdt or country list may change quite a bit traj.longname = os.path.split(output_dir)[-1] - if self.mc_mode == 1: + if self.mc_mode & MCMODE_PHASE1: # The MC phase may change the refdt so save a copy of the the original name. traj.pre_mc_longname = traj.longname @@ -1438,17 +1400,13 @@ def saveTrajectoryResults(self, traj, save_plots): savePickle(traj, output_dir, traj.file_name + '_trajectory.pickle') log.info(f'saved {traj.traj_id} to {output_dir}') - if self.mc_mode == 1: - savePickle(traj, self.phase1_dir, traj.pre_mc_longname + '_trajectory.pickle') - elif self.mc_mode == 2: - # we save this in MC mode the MC phase may alter the trajectory details and if later on + if (self.mc_mode & MCMODE_PHASE1 and not self.mc_mode & MCMODE_PHASE2) or save_phase1: + self.saveCandOrTraj(traj, f'{traj.longname}_trajectory.pickle', verbose=verbose) + + elif self.mc_mode & MCMODE_PHASE2: + # the MC phase may alter the trajectory details and if later on # we're including additional observations we need to use the most recent version of the trajectory - savePickle(traj, os.path.join(self.phase1_dir, 'processed'), traj.pre_mc_longname + '_trajectory.pickle') - - if self.remotehost is not None: - log.info('saving to remote host') - uploadTrajToRemote(remotehost, traj.file_name + '_trajectory.pickle', output_dir) - log.info(' ...done') + savePickle(traj, os.path.join(self.phase1_dir, 'processed'), f'{traj.pre_mc_longname}_trajectory.pickle') # Save the plots if save_plots: @@ -1459,35 +1417,16 @@ def saveTrajectoryResults(self, traj, save_plots): pass traj.save_results = 
False - - - def markObservationAsProcessed(self, met_obs): - """ Mark the given meteor observation as processed. """ - - if self.db is None: - return - self.db.addProcessedDir(met_obs.station_code, met_obs.rel_proc_path) - - - - def markObservationAsPaired(self, met_obs): - """ Mark the given meteor observation as paired in a trajectory. """ - - if self.db is None: - return - self.db.addPairedObservation(met_obs) - - - - def addTrajectory(self, traj, failed_jdt_ref=None): - """ Add the resulting trajectory to the database. + def addTrajectory(self, traj, failed_jdt_ref=None, verbose=False): + """ + Add the resulting trajectory to the database. Arguments: traj: [Trajectory object] failed_jdt_ref: [float] Reference Julian date of the failed trajectory. None by default. """ - if self.db is None: + if self.trajectory_db is None: return # Set the correct output path traj.output_dir = self.generateTrajOutputDirectoryPath(traj) @@ -1500,15 +1439,15 @@ def addTrajectory(self, traj, failed_jdt_ref=None): if failed_jdt_ref is not None: traj_reduced.jdt_ref = failed_jdt_ref - self.db.addTrajectory(None, traj_obj=traj_reduced, failed=(failed_jdt_ref is not None)) + self.trajectory_db.addTrajectory(traj_reduced, failed=(failed_jdt_ref is not None), verbose=verbose) - - - def removeTrajectory(self, traj_reduced): - """ Remove the trajectory from the data base and disk. """ + def removeTrajectory(self, traj_reduced, remove_phase1=False): + """ + Remove the trajectory from the data base and disk. 
+ """ # in mcmode 2 the database isn't loaded but we still need to delete updated trajectories - if self.mc_mode == 2: + if self.mc_mode & MCMODE_PHASE2: if os.path.isfile(traj_reduced.traj_file_path): traj_dir = os.path.dirname(traj_reduced.traj_file_path) shutil.rmtree(traj_dir, ignore_errors=True) @@ -1518,53 +1457,53 @@ def removeTrajectory(self, traj_reduced): traj_dir = os.path.join(base_dir, traj_reduced.pre_mc_longname) if os.path.isdir(traj_dir): shutil.rmtree(traj_dir, ignore_errors=True) - else: - log.warning(f'unable to find {traj_dir}') - else: - log.warning(f'unable to find {traj_reduced.traj_file_path}') + return - # remove the processed pickle now we're done with it - self.cleanupPhase2TempPickle(traj_reduced, True) + if (self.mc_mode & MCMODE_PHASE1 or self.mc_mode & MCMODE_CANDS) and remove_phase1: + # remove any solution from the phase1 folder + phase1_traj = os.path.join(self.phase1_dir, traj_reduced.pre_mc_longname + '_trajectory.pickle') + if os.path.isfile(phase1_traj): + try: + os.remove(phase1_traj) + log.info(f'removed {phase1_traj}') + except Exception: + pass - return - self.db.removeTrajectory(traj_reduced) + # Remove the trajectory folder from the disk + if os.path.isfile(traj_reduced.traj_file_path): + traj_dir = os.path.dirname(traj_reduced.traj_file_path) + shutil.rmtree(traj_dir, ignore_errors=True) + if os.path.isfile(traj_reduced.traj_file_path): + log.warning(f'unable to remove {traj_dir}') + self.trajectory_db.removeTrajectory(traj_reduced) - def cleanupPhase2TempPickle(self, traj, success=False): - """ - At the start of phase 2 monte-carlo sim calculation, the phase1 pickles are renamed to indicate they're being processed. - Once each one is processed (fail or succeed) we need to clean up the file. If the MC step failed, we still want to keep - the pickle, because we might later on get new data and it might become solvable. Otherwise, we can just delete the file - since the MC solver will have saved an updated one already. 
+ def checkCandIfFailed(self, candidate): + """ + Check if the given candidate has been processed with the same observations and has failed to be + computed before. """ - if self.mc_mode != 2: - return - fldr_name = os.path.split(self.generateTrajOutputDirectoryPath(traj, make_dirs=False))[-1] - pick = os.path.join(self.phase1_dir, fldr_name + '_trajectory.pickle_processing') - if os.path.isfile(pick): - os.remove(pick) - else: - log.warning(f'unable to find _processing file {pick}') - if not success: - # save the pickle in case we get new data later and can solve it - savePickle(traj, os.path.join(self.phase1_dir, 'processed'), fldr_name + '_trajectory.pickle') - return - + jdt_ref = min([obs.jdt_ref for obs, _, _ in candidate]) + stations = [obs.station_id for obs, _, _ in candidate] + return self.trajectory_db.checkCandIfFailed(jdt_ref, stations) def checkTrajIfFailed(self, traj): - """ Check if the given trajectory has been computed with the same observations and has failed to be - computed before. - - """ + """ + Check if the given trajectory has been computed with the same observations and has failed to be + computed before. - if self.db is None: - return - return self.db.checkTrajIfFailed(traj) + Parameters: + traj: full trajectory object + """ + if self.trajectory_db is None: + return + traj_reduced = TrajectoryReduced(None, traj_obj=traj) + return self.trajectory_db.checkTrajIfFailed(traj_reduced) def loadFullTraj(self, traj_reduced): - """ Load the full trajectory object. + """ Load the full trajectory object corresponding to a traj_reduced object. Arguments: traj_reduced: [TrajectoryReduced object] @@ -1604,15 +1543,11 @@ def loadFullTraj(self, traj_reduced): return None - def loadPhase1Trajectories(self, max_trajs=1000): + def loadPhase1Trajectories(self): """ Load trajectories calculated by the intersecting-planes phase 1. 
These trajectories do not include uncertainties which are calculated in the Monte-Carlo phase 2 - keyword arguments: - maxtrajs: [int] maximum number of trajectories to load in each pass, to avoid taking too long per pass. - - returns: dt_beg, dt_end: [datetime] The earliest and latest date/time of the loaded trajectories. Used later to set the number of time buckets to process data in. @@ -1620,7 +1555,7 @@ def loadPhase1Trajectories(self, max_trajs=1000): """ pickles = glob.glob1(self.phase1_dir, "*_trajectory.pickle") pickles.sort() - pickles = pickles[:max_trajs] + pickles = pickles[:self.max_trajs] self.phase1Trajectories = [] if len(pickles) == 0: return None, None @@ -1645,12 +1580,12 @@ def loadPhase1Trajectories(self, max_trajs=1000): if not hasattr(traj, 'pre_mc_longname'): traj.pre_mc_longname = os.path.split(traj_dir)[-1] - # Check if the traj object as fixed time offsets + # Check if the traj object has fixed time offsets if not hasattr(traj, 'fixed_time_offsets'): traj.fixed_time_offsets = {} - # now we've loaded the phase 1 solution, move it to prevent accidental reprocessing - procfile = os.path.join(self.phase1_dir, pick + '_processing') + # now we've loaded the phase 1 solution, move it to prevent reprocessing + procfile = os.path.join(self.phase1_dir, 'processed', pick) if os.path.isfile(procfile): os.remove(procfile) os.rename(os.path.join(self.phase1_dir, pick), procfile) @@ -1662,29 +1597,346 @@ def loadPhase1Trajectories(self, max_trajs=1000): log.info(f'File {pick} skipped for now') return dt_beg, dt_end + def loadCandidates(self, verbose=False): + """ + Load candidates from the 'candidates' folder and then move the file to the 'candidates/processed' folder + Used only in phase1 solving mode + """ + candidate_trajectories = [] + save_path = self.candidate_dir + procpath = os.path.join(save_path, 'processed') + os.makedirs(procpath, exist_ok=True) + + for fil in os.listdir(save_path)[:self.max_trajs]: + if '.pickle' not in fil: + continue + 
try: + loadedpickle = loadPickle(save_path, fil) + candidate_trajectories.append(loadedpickle) + + # now move the loaded file so we don't try to reprocess it + full_name = os.path.join(save_path, fil) + procfile = os.path.join(procpath, fil) + shutil.copy(full_name, procfile) + os.remove(full_name) + + except Exception: + log.info(f'Candidate {fil} went away, probably picked up by another process') + log.info("-----------------------") + log.info('LOADED {} CANDIDATES'.format(len(candidate_trajectories))) + log.info("-----------------------") + + return candidate_trajectories + + def moveUploadedData(self, verbose=False): + """ + Used in 'master' mode: this moves uploaded data to the target locations on the server + and merges in any uploaded sqlite databases - def saveDatabase(self): - """ Save the data base. """ + """ + log.info('merging in any remotely processed data') + for node in self.RemoteDatahandler.nodes: + if node.nodename == 'localhost' or self.observations_db is None or self.trajectory_db is None: + continue - def _breakHandler(signum, frame): - """ Do nothing if CTRL + C is pressed. 
""" - log.info("The data base is being saved, the program cannot be exited right now!") - pass + # if the remote node upload path doesn't exist skip it + if not os.path.isdir(os.path.join(node.dirpath,'files')): + continue - if self.db is None: - return - # Prevent quitting while a data base is being saved - original_signal = signal.getsignal(signal.SIGINT) - signal.signal(signal.SIGINT, _breakHandler) + # merge the databases + for obsdb_path in glob.glob(os.path.join(node.dirpath,'files','observations*.db')): + if self.observations_db.mergeObsDatabase(obsdb_path): + os.remove(obsdb_path) + try: + os.remove(f'{obsdb_path}-wal') + os.remove(f'{obsdb_path}-shm') + except Exception: + log.warning(f'unable to fully merge the remote obs database {obsdb_path}') + pass + + + for trajdb_path in glob.glob(os.path.join(node.dirpath,'files','trajectories*.db')): + if self.trajectory_db.mergeTrajDatabase(trajdb_path): + os.remove(trajdb_path) + else: + log.warning(f'unable to fully merge the remote traj database {trajdb_path}') + + i = 0 + remote_trajdir = os.path.join(node.dirpath, 'files', 'trajectories') + if os.path.isdir(remote_trajdir): + for i,traj in enumerate(os.listdir(remote_trajdir)): + if os.path.isdir(os.path.join(remote_trajdir, traj)): + targ_path = os.path.join(self.output_dir, 'trajectories', traj[:4], traj[:6], traj[:8], traj) + src_path = os.path.join(node.dirpath,'files', 'trajectories', traj) + for src_name in os.listdir(src_path): + src_name = os.path.join(src_path, src_name) + if not os.path.isfile(src_name): + log.warning(f'{src_name} missing') + else: + os.makedirs(targ_path, exist_ok=True) + shutil.copy(src_name, targ_path) + shutil.rmtree(src_path,ignore_errors=True) + if i > 0: + log.info(f'moved {i+1} trajectories') + + # if the node was in mode 1 then move any uploaded phase1 solutions + remote_ph1dir = os.path.join(node.dirpath, 'files', 'phase1') + if os.path.isdir(remote_ph1dir) and node.mode==1: + os.makedirs(self.phase1_dir, exist_ok=True) + 
i = 0 + for i, fil in enumerate([x for x in os.listdir(remote_ph1dir) if '.pickle' in x]): + full_name = os.path.join(remote_ph1dir, fil) + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) + + if i > 0: + log.info(f'moved {i+1} new phase 1 solutions from {node.nodename}') + + # if the node was in mode 1 then move any uploaded processed candidates + remote_canddir = os.path.join(node.dirpath, 'files', 'candidates', 'processed') + if os.path.isdir(remote_canddir) and node.mode==1: + i = 0 + targ_dir = os.path.join(self.candidate_dir, 'processed') + for i, fil in enumerate([x for x in os.listdir(remote_canddir) if '.pickle' in x]): + full_name = os.path.join(remote_canddir, fil) + shutil.copy(full_name, targ_dir) + os.remove(full_name) + + if i > 0: + log.info(f'moved {i+1} processed candidates from {node.nodename}') + + # if the node was in mode 2 then move any processed phase1 solutions + remote_ph1dir = os.path.join(node.dirpath, 'files', 'phase1', 'processed') + if os.path.isdir(remote_ph1dir) and node.mode==2: + targ_dir = os.path.join(self.phase1_dir, 'processed') + os.makedirs(targ_dir, exist_ok=True) + i = 0 + for i, fil in enumerate([x for x in os.listdir(remote_ph1dir) if '.pickle' in x]): + full_name = os.path.join(remote_ph1dir, fil) + shutil.copy(full_name, targ_dir) + os.remove(full_name) + + if i > 0: + log.info(f'moved {i+1} processed phase 1 solutions from {node.nodename}') + + return True + + def checkAndRedistribCands(self, wait_time=6, verbose=False): + """ + Check child nodes and + 1) if the stop flag has appeared, move any pending data to prevent it getting stuck + 2) move data if it has been waiting more than wait_time hours, default six + 3) if the node is idle, assign it extra data + + Parameters: + wait_time : time in hours to wait before data is considered stale + + """ + for node in self.RemoteDatahandler.nodes: + if node.nodename == 'localhost' or self.observations_db is None or self.trajectory_db is None: + continue + # if 
the remote node upload path doesn't exist skip it + if not os.path.isdir(os.path.join(node.dirpath,'files')): + continue - # Save the data base - log.info("Saving data base to disk...") - self.db.save() + # if the stop file has appeared, then move any pending candidates or phase1 files + + if os.path.isfile(os.path.join(node.dirpath, 'files','stop')): + files_to_move = glob.glob(os.path.join(node.dirpath, 'files', 'candidates', '*.pickle')) + if len(files_to_move) > 0: + log.info(f'{node.nodename} stopfile has appeared, moving candidates') + for full_name in files_to_move: + shutil.copy(full_name, self.candidate_dir) + os.remove(full_name) + files_to_move = glob.glob(os.path.join(node.dirpath, 'files', 'phase1', '*.pickle')) + if len(files_to_move) > 0: + log.info(f'{node.nodename} stopfile has appeared, moving phase1 files') + for full_name in files_to_move: + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) + else: + # if the stop file isn't present and the nodes are idle, give them something to do + + targ_dir = os.path.join(node.dirpath, 'files', 'candidates') + if len(glob.glob(os.path.join(targ_dir, '*.pickle'))) == 0 and node.mode == MCMODE_PHASE1 and node.capacity !=0: + # the node is waiting for data + log.info(f'{node.nodename} idle, giving it extra candidates') + i = 0 + # limit child capacity to 1000 if it's set to -1 + max_capacity = node.capacity if node.capacity >= 0 else 1000 + for i, full_name in enumerate(glob.glob(os.path.join(self.candidate_dir, '*.pickle'))): + log.info(f'moving {full_name} to {node.nodename}') + shutil.copy(full_name, targ_dir) + os.remove(full_name) + i +=1 + if i >= max_capacity: + break + + targ_dir = os.path.join(node.dirpath, 'files', 'phase1') + if len(glob.glob(os.path.join(targ_dir, '*.pickle'))) == 0 and node.mode == MCMODE_PHASE2 and node.capacity !=0: + # the node is waiting for data + log.info(f'{node.nodename} idle, giving it extra phase1 data') + i = 0 + # limit child capacity to 5000 if it's set to
-1 + max_capacity = node.capacity if node.capacity >= 0 else 5000 + for i, full_name in enumerate(glob.glob(os.path.join(self.phase1_dir, '*.pickle'))): + log.info(f'moving {full_name} to {node.nodename}') + shutil.copy(full_name, targ_dir) + os.remove(full_name) + i +=1 + if i >= max_capacity: + break + + # if the files have been in the node folder for more than wait_time hours, move them + # + refdt = time.time() - wait_time*3600 + log.info(f'moving any stale data assigned to {node.nodename}') + for full_name in glob.glob(os.path.join(node.dirpath, 'files', 'candidates', '*.pickle')): + if os.stat(full_name).st_mtime < refdt: + shutil.copy(full_name, self.candidate_dir) + os.remove(full_name) + for full_name in glob.glob(os.path.join(node.dirpath, 'files', 'phase1', '*.pickle')): + if os.stat(full_name).st_mtime < refdt: + shutil.copy(full_name, self.phase1_dir) + os.remove(full_name) - # Restore the signal functionality - signal.signal(signal.SIGINT, original_signal) + return + def getRemoteData(self, verbose=False): + """ + Used in 'child' mode: Wrapper around the remote data handling function to + download data from the master for local processing. 
+ """ + if not self.RemoteDatahandler: + log.info('remote data handler not initialised') + return False + # collect candidates or phase1 solutions from the master node + if self.mc_mode == MCMODE_PHASE1 or self.mc_mode == MCMODE_BOTH: + status = self.RemoteDatahandler.collectRemoteData('candidates', self.output_dir, verbose=verbose) + elif self.mc_mode == MCMODE_PHASE2: + status = self.RemoteDatahandler.collectRemoteData('phase1', self.output_dir, verbose=verbose) + else: + status = False + return status + + def getCandidateId(self, matched_observations, verbose=False): + """ + given a set of observations, create a candidate ID + + Parameters: + matched_observations: list of observations + + Returns: [string] candidate id + """ + + ref_dt = jd2Date(min([obs.jdt_ref for obs, _, _ in matched_observations]), dt_obj=True, tzinfo=datetime.timezone.utc) + ctry_list = list(set([met_obs.station_code[:2] for _, met_obs, _ in matched_observations])) + ctry_list.sort() + ctries = '_'.join(ctry_list) + cand_id = f'{ref_dt.timestamp():.6f}_{ctries}' + return cand_id + + + def saveCandidates(self, candidate_trajectories, verbose=False): + """ + Save candidates to file by constructing a name, checking if we already processed it and then + calling saveCandOrTraj if needed. The function checkAndAddCand adds to candidates.db so that we can + avoid reprocessing the same candidate on a future pass. + + Parameters: + candidate_trajectories : list of candidates + + """ + num_saved = 0 + for matched_observations in candidate_trajectories: + cand_id = self.getCandidateId(matched_observations) + ref_dt = jd2Date(min([obs.jdt_ref for obs, _, _ in matched_observations]), dt_obj=True, tzinfo=datetime.timezone.utc) + obs_ids = [met_obs.id for _, met_obs, _ in matched_observations] + + # check if the candidate was already found and added to the database.
+ + if self.candidate_db.checkAndAddCand(cand_id, ref_dt.timestamp(), obs_ids, verbose=False): + picklename = f'{cand_id}.pickle' + + if verbose: + log.info(f'Candidate {picklename} contains {len(matched_observations)} observations') + + if self.saveCandOrTraj(matched_observations, picklename, 'candidates', verbose=True): + num_saved += 1 + log.info(f'skipped {len(candidate_trajectories)-num_saved} as marked already-processed') + + log.info("-----------------------") + log.info(f'Saved {num_saved} candidates') + log.info("-----------------------") + + def saveCandOrTraj(self, traj, file_name, savetype='phase1', verbose=True): + """ + Save the candidates (if in candidate-finding mode) or phase 1 trajectories. + If remote data processing is enabled, this function distributes candidates amongst + any nodes that are in the relevant mode. + + Parameters: + traj : The trajectory or candidate to save + file_name : The filename to use + savetype : The type of object we're saving, 'phase1' or 'candidates'. + + """ + if savetype == 'phase1': + save_dir = self.phase1_dir + required_mode = MCMODE_PHASE2 + else: + save_dir = self.candidate_dir + required_mode = MCMODE_PHASE1 + + if self.RemoteDatahandler and self.RemoteDatahandler.mode == 'master': + + # Select a random bucket, check it's not already full, and then save the pickle there. + # Make sure to break out once all buckets have been tested + # Fallback/default is to use the local dir.
+ tested_buckets = [] + bucket_num = -1 + bucket_list = self.RemoteDatahandler.nodes + bucket_list[-1].dirpath = save_dir + + while bucket_num not in tested_buckets: + bucket_num = secrets.randbelow(len(bucket_list)) + bucket = bucket_list[bucket_num] + + # if the child isn't the right mode, or the stop-flag exists, skip it + stop_sts = os.path.isfile(os.path.join(bucket.dirpath, 'files', 'stop')) + + if (bucket.mode != required_mode and bucket.mode != -1) or stop_sts: + tested_buckets.append(bucket_num) + continue + + #set a temporary save-dir name so we can check capacity + if bucket.nodename != 'localhost': + tmp_save_dir = os.path.join(bucket.dirpath, 'files', savetype) + # limit children to 5000 if set to -1 ie unlimited + bucket_capacity = bucket.capacity if bucket.capacity >= 0 else 5000 + else: + # saving to localhost + tmp_save_dir = save_dir + bucket_capacity = bucket.capacity + + os.makedirs(tmp_save_dir, exist_ok=True) + + current_workload = len(glob.glob(os.path.join(tmp_save_dir, '*.pickle'))) + if bucket_capacity < 0 or current_workload < bucket_capacity: + + if tmp_save_dir != save_dir: + # log it if we are saving to a child node, so we can track what got farmed out + log.info(f'saving {file_name} to {tmp_save_dir}') + + # set the save dir if the bucket is usable + save_dir = tmp_save_dir + break + + tested_buckets.append(bucket_num) + + savePickle(traj, save_dir, file_name) + return True @@ -1717,7 +1969,7 @@ def _breakHandler(signum, frame): arg_parser.add_argument('dir_path', type=str, help='Path to the root data directory. Trajectory helper files will be stored here as well.') arg_parser.add_argument('-t', '--maxtoffset', metavar='MAX_TOFFSET', - help='Maximum time offset between the stations. Default is 5 seconds.', type=float, default=10.0) + help='Maximum time offset between the stations. 
Default is 10 seconds.', type=float, default=10.0) arg_parser.add_argument('-s', '--maxstationdist', metavar='MAX_STATION_DIST', help='Maximum distance (km) between stations of paired meteors. Default is 600 km.', type=float, @@ -1776,25 +2028,68 @@ def _breakHandler(signum, frame): help="Use best N stations in the solution (default is use 15 stations).") arg_parser.add_argument('--mcmode', '--mcmode', type=int, default=0, - help="Run just simple soln (1), just monte-carlos (2) or both (0, default).") + help="Operation mode - see readme. For standalone solving either don't set this or set it to 0") + + arg_parser.add_argument('--archivemonths', '--archivemonths', type=int, default=0, + help="Months back to archive old data. Default 0 which means purge don't archive.") arg_parser.add_argument('--maxtrajs', '--maxtrajs', type=int, default=None, - help="Max number of trajectories to reload in each pass when doing the Monte-Carlo phase") + help="Max number of trajectories to reload in each pass when doing phase 1 or Monte-Carlo phase solving") arg_parser.add_argument('--autofreq', '--autofreq', type=int, default=360, help="Minutes to wait between runs in auto-mode") - arg_parser.add_argument('--remotehost', '--remotehost', type=str, default=None, - help="Remote host to collect and return MC phase solutions to. 
Supports internet-distributed processing.") - arg_parser.add_argument('--verbose', '--verbose', help='Verbose logging.', default=False, action="store_true") + arg_parser.add_argument('--addlogsuffix', '--addlogsuffix', help='add a suffix to the log to show what stage it is.', default=False, action="store_true") + # Parse the command line arguments cml_args = arg_parser.parse_args() ############################ - + db_dir = cml_args.dbdir + if db_dir is None: + db_dir = cml_args.dir_path + os.makedirs(db_dir, exist_ok=True) + + # mcmode values + # mcmode = 1 -> load candidates and do simple solutions + # mcmode = 2 -> load simple solns and do MC solutions + # mcmode = 4 -> find candidates only + # mcmode = 7 -> do everything + # mcmode = 0 -> same as mode 7 + # bitwise combinations are permissible so: + # 4+1 will find candidates and then run simple solutions to populate "phase1" + # 1+2 will load candidates from "candidates" and solve them completely + + mcmode = MCMODE_ALL if cml_args.mcmode == 0 else cml_args.mcmode + + + mcmodestr = getMcModeStr(mcmode, 1) + pid_file = None + if mcmodestr: + pid_file = os.path.join(db_dir, f'.{mcmodestr}.pid') + open(pid_file,'w').write(f'{os.getpid()}') + + # signal handler created inline here as it needs access to db_dir + def signal_handler(sig, frame): + signal.signal(sig, signal.SIG_IGN) # ignore additional signals + log.info('======================================') + log.info('CTRL-C pressed, exiting gracefully....') + log.info('======================================') + remote_cfg = os.path.join(db_dir, 'wmpl_remote.cfg') + if os.path.isfile(remote_cfg): + rdh = RemoteDataHandler(remote_cfg) + if rdh and rdh.mode == 'child': + rdh.setStopFlag() + if pid_file and os.path.isfile(pid_file): + os.remove(pid_file) + log.info('DONE') + log.info('======================================') + sys.exit(0) + + signal.signal(signal.SIGINT, signal_handler) ### Init logging - roll over every day ### @@ -1806,8 +2101,7 @@ def _breakHandler(signum,
frame): log_dir = cml_args.dir_path # Create a log dir if it doesn't exist - if not os.path.isdir(log_dir): - os.makedirs(log_dir) + os.makedirs(log_dir, exist_ok=True) # Init the logger #log = logging.getLogger("traj_correlator") @@ -1821,6 +2115,11 @@ def _breakHandler(signum, frame): # Init the file handler timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") log_file = os.path.join(log_dir, f"correlate_rms_{timestamp}.log") + if cml_args.addlogsuffix: + modestr = getMcModeStr(cml_args.mcmode, 1) + if modestr: + log_file = os.path.join(log_dir, f"correlate_rms_{timestamp}_{modestr}.log") + file_handler = logging.handlers.TimedRotatingFileHandler(log_file, when="midnight", backupCount=7) file_handler.setFormatter(log_formatter) log.addHandler(file_handler) @@ -1869,21 +2168,14 @@ def _breakHandler(signum, frame): if cml_args.maxerr is not None: trajectory_constraints.max_arcsec_err = cml_args.maxerr - remotehost = cml_args.remotehost - if cml_args.mcmode !=2 and remotehost is not None: - log.info('remotehost only applicable in mcmode 2') - remotehost = None - + # set the maximum number of trajectories to reprocess when doing the MC uncertainties # set a default of 10 for remote processing and 1000 for local processing - if cml_args.remotehost is not None: - max_trajs = 10 - else: - max_trajs = 1000 + max_trajs = 1000 if cml_args.maxtrajs is not None: max_trajs = int(cml_args.maxtrajs) - if cml_args.mcmode == 2: + if mcmode == MCMODE_PHASE2: log.info(f'Reloading at most {max_trajs} phase1 trajectories.') # Set the number of CPU cores @@ -1893,8 +2185,22 @@ def _breakHandler(signum, frame): trajectory_constraints.mc_cores = cpu_cores log.info("Running using {:d} CPU cores.".format(cpu_cores)) + if mcmode == MCMODE_CANDS: + log.info('Saving Candidates only') + elif mcmode == MCMODE_PHASE1: + log.info('Loading Candidates if needed') + elif mcmode == MCMODE_ALL: + log.info('Full processing mode') + + if cml_args.verbose: + log.info('verbose flag set') + 
verbose = True + else: + verbose = False + # Run processing. If the auto run more is not on, the loop will break after one run previous_start_time = None + while True: # Clock for measuring script time @@ -1947,12 +2253,12 @@ def _breakHandler(signum, frame): # Init the data handle dh = RMSDataHandle( - cml_args.dir_path, dt_range=event_time_range, - db_dir=cml_args.dbdir, output_dir=cml_args.outdir, - mcmode=cml_args.mcmode, max_trajs=max_trajs, remotehost=remotehost, verbose=cml_args.verbose) + cml_args.dir_path, dt_range=event_time_range, db_dir=cml_args.dbdir, output_dir=cml_args.outdir, + mcmode=mcmode, max_trajs=max_trajs, verbose=verbose, archivemonths=cml_args.archivemonths, auto=cml_args.auto, + max_toffset=cml_args.maxtoffset) - # If there is nothing to process, stop, unless we're in mcmode 2 (processing_list is not used in this case) - if not dh.processing_list and cml_args.mcmode < 2: + # If there is nothing to process and we're in Candidate mode, stop + if not dh.processing_list and (mcmode & MCMODE_CANDS): log.info("") log.info("Nothing to process!") log.info("Probably everything is already processed.") @@ -1962,7 +2268,7 @@ def _breakHandler(signum, frame): ### GENERATE DAILY TIME BINS ### - if cml_args.mcmode != 2: + if mcmode != MCMODE_PHASE2: # Find the range of datetimes of all folders (take only those after the year 2000) proc_dir_dts = [entry[3] for entry in dh.processing_list if entry[3] is not None] proc_dir_dts = [dt for dt in proc_dir_dts if dt > datetime.datetime(2000, 1, 1, 0, 0, 0, @@ -1980,14 +2286,19 @@ def _breakHandler(signum, frame): if proc_dir_dts == []: proc_dir_dts=[dt_beg - datetime.timedelta(days=1), dt_end + datetime.timedelta(days=1)] - # Determine the limits of data + # Determine the limits of data - add one day to proc_dir_dt_end because each proc dir holds + # up to one day's worth of detections proc_dir_dt_beg = min(proc_dir_dts) - proc_dir_dt_end = max(proc_dir_dts) + proc_dir_dt_end = max(proc_dir_dts) + 
datetime.timedelta(days=1) + + # in candidate-only mode, we want to write data out frequently so that the solvers can get to work. + # Hence set the bin-size to 6 hours. + bin_length = 0.25 if mcmode == MCMODE_CANDS else 1.0 # Split the processing into daily chunks dt_bins = generateDatetimeBins( proc_dir_dt_beg, proc_dir_dt_end, - bin_days=1, tzinfo=datetime.timezone.utc, reverse=False) + bin_days=bin_length, tzinfo=datetime.timezone.utc, reverse=False) # check if we've created an extra bucket (might happen if requested timeperiod is less than 24h) if event_time_range is not None: @@ -1998,12 +2309,13 @@ def _breakHandler(signum, frame): dt_bins = [(dh.dt_range[0], dh.dt_range[1])] if dh.dt_range is not None: - # there's some data to process - log.info("") - log.info("ALL TIME BINS:") - log.info("----------") - for bin_beg, bin_end in dt_bins: - log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) + # there's some data to process and we're in candidate mode + if mcmode & MCMODE_CANDS: + log.info("") + log.info("ALL TIME BINS:") + log.info("----------") + for bin_beg, bin_end in dt_bins: + log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) ### ### @@ -2012,27 +2324,61 @@ def _breakHandler(signum, frame): # Go through all chunks in time for bin_beg, bin_end in dt_bins: - log.info("") - log.info("PROCESSING TIME BIN:") - log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) - log.info("-----------------------------") - log.info("") + if mcmode != MCMODE_PHASE2: + + # Update the trajectory database, removing any that no longer exist on disk, + # adding any that exist on disk but are missing in the database, and + # removing any duplicates from both disk and database + + dh.updateTrajectoryDatabase(dt_range=(bin_beg, bin_end)) + + if mcmode & MCMODE_CANDS: + log.info("") + log.info("PROCESSING TIME BIN:") + log.info("{:s}, {:s}".format(str(bin_beg), str(bin_end))) + log.info("-----------------------------") + log.info("") - # Load data of unprocessed 
observations - if cml_args.mcmode != 2: dh.unpaired_observations = dh.loadUnpairedObservations(dh.processing_list, dt_range=(bin_beg, bin_end)) - - # refresh list of calculated trajectories from disk - dh.removeDeletedTrajectories() - dh.loadComputedTrajectories(os.path.join(dh.output_dir, OUTPUT_TRAJ_DIR), dt_range=[bin_beg, bin_end]) - if cml_args.mcmode != 2: - dh.removeDuplicateTrajectories(dt_range=[bin_beg, bin_end]) + log.info(f'loaded {len(dh.unpaired_observations)} observations') # Run the trajectory correlator tc = TrajectoryCorrelator(dh, trajectory_constraints, cml_args.velpart, data_in_j2000=True, enableOSM=cml_args.enableOSM) bin_time_range = [bin_beg, bin_end] - tc.run(event_time_range=event_time_range, mcmode=cml_args.mcmode, bin_time_range=bin_time_range) + num_done = tc.run(event_time_range=event_time_range, mcmode=mcmode, bin_time_range=bin_time_range, verbose=verbose) + + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child' and num_done > 0: + log.info('uploading to master node') + # close the databases and upload the data to the master node + if mcmode != MCMODE_PHASE2: + dh.closeTrajectoryDatabase() + dh.closeObservationsDatabase() + + + if dh.RemoteDatahandler.uploadToMaster(dh.output_dir, verbose=verbose): + + # if we successfully uploaded data, truncate the tables here so they are clean for the next run + # otherwise do not truncate it, so we push it next time instead + if mcmode != MCMODE_PHASE2: + dh.trajectory_db = TrajectoryDatabase(dh.db_dir, purge_records=True) + dh.observations_db = ObservationsDatabase(dh.db_dir, purge_records=True) + + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'master': + # move any uploaded data and then check and rebalance any pending cands or phase1s + dh.moveUploadedData(verbose=verbose) + dh.checkAndRedistribCands(wait_time=6, verbose=verbose) + + # If we're in either of these modes, the correlator will have scooped up available data + # from candidates or phase1 folders so no need 
to keep looping. + if mcmode == MCMODE_PHASE1 or mcmode == MCMODE_PHASE2 or mcmode == MCMODE_BOTH: + break + + if mcmode & MCMODE_CANDS: + dh.closeObservationsDatabase() + dh.closeCandidatesDatabase() + dh.closeTrajectoryDatabase() + else: # there were no datasets to process log.info('no data to process yet') @@ -2042,16 +2388,30 @@ def _breakHandler(signum, frame): # Store the previous start time previous_start_time = copy.deepcopy(t1) + + # Break after one loop if auto mode is not on if cml_args.auto is None: + # clear the remote data ready flag to indicate we're shutting down + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child': + dh.RemoteDatahandler.setStopFlag() + if pid_file and os.path.isfile(pid_file): + os.remove(pid_file) break else: - + if dh.observations_db: + dh.closeObservationsDatabase() + if dh.trajectory_db: + dh.closeTrajectoryDatabase() # Otherwise wait to run AUTO_RUN_FREQUENCY hours after the beginning wait_time = (datetime.timedelta(hours=AUTO_RUN_FREQUENCY) - (datetime.datetime.now(datetime.timezone.utc) - t1)).total_seconds() + # remove the remote data stop flag to indicate we're open for business + if dh.RemoteDatahandler and dh.RemoteDatahandler.mode == 'child': + dh.RemoteDatahandler.clearStopFlag() + # Run immediately if the wait time has elapsed if wait_time < 0: continue @@ -2070,4 +2430,4 @@ def _breakHandler(signum, frame): while next_run_time > datetime.datetime.now(datetime.timezone.utc): print("Waiting {:s} to run the trajectory solver... 
".format(str(next_run_time - datetime.datetime.now(datetime.timezone.utc)))) - time.sleep(2) + time.sleep(10) diff --git a/wmpl/Trajectory/Trajectory.py b/wmpl/Trajectory/Trajectory.py index 72cf0f8d..aff7a63e 100644 --- a/wmpl/Trajectory/Trajectory.py +++ b/wmpl/Trajectory/Trajectory.py @@ -17,6 +17,7 @@ from operator import attrgetter import base64 import hashlib +import logging try: import git @@ -64,6 +65,9 @@ def njit(func, *args, **kwargs): # Text size of image legends LEGEND_TEXT_SIZE = 6 +# Grab the logger from the main thread +log = logging.getLogger("traj_correlator") + class ObservedPoints(object): def __init__(self, jdt_ref, meas1, meas2, time_data, lat, lon, ele, meastype, station_id=None, \ @@ -382,7 +386,7 @@ def __init__(self, jdt_ref, meas1, meas2, time_data, lat, lon, ele, meastype, st else: - print('Excluded time range', self.excluded_time, 'is outside the observation times!') + log.info(f'Excluded time range {self.excluded_time} is outside the observation times!') ###################################################################################################### @@ -1961,7 +1965,7 @@ def checkMCTrajectories(mc_results, timing_res=np.inf, geometric_uncert=False): mc_results = [mc_traj for mc_traj in mc_results if (mc_traj.orbit.ra_g is not None) \ and (mc_traj.orbit.dec_g is not None)] - print("{:d} successful MC runs done...".format(len(mc_results))) + log.info(f'{len(mc_results)} successful MC runs done...') return mc_results @@ -1984,7 +1988,7 @@ def _MCTrajSolve(params): i, traj, observations = params - print('Run No.', i + 1) + log.info(f'Run No. 
{i + 1}') traj.run(_mc_run=True, _orig_obs=observations) @@ -2035,7 +2039,7 @@ def monteCarloTrajectory(traj, mc_runs=None, mc_pick_multiplier=1, noise_sigma=1 mc_runs = mc_runs*mc_pick_multiplier - print("Doing", mc_runs, "successful Monte Carlo runs...") + log.info(f"Doing {mc_runs} successful Monte Carlo runs...") # Init the trajectory noise generator @@ -2050,7 +2054,7 @@ def monteCarloTrajectory(traj, mc_runs=None, mc_pick_multiplier=1, noise_sigma=1 # If there are no MC runs which were successful, recompute using geometric uncertainties if len(mc_results) < 2: - print("No successful MC runs, computing geometric uncertanties...") + log.info("No successful MC runs, computing geometric uncertanties...") # Run the MC solutions geometric_uncert = True @@ -2069,7 +2073,7 @@ def monteCarloTrajectory(traj, mc_runs=None, mc_pick_multiplier=1, noise_sigma=1 # Break the function of there are no trajectories to process if len(mc_results) < 2: - print('!!! Not enough good Monte Carlo runs for uncertaintly estimation!') + log.info('!!! 
Not enough good Monte Carlo runs for uncertaintly estimation!') return traj, None @@ -2083,12 +2087,12 @@ def monteCarloTrajectory(traj, mc_runs=None, mc_pick_multiplier=1, noise_sigma=1 # Assign geometric uncertainty flag, if it was changed traj_best.geometric_uncert = geometric_uncert - print('Computing uncertainties...') + log.info('Computing uncertainties...') # Calculate the standard deviation of every trajectory parameter uncertainties = calcMCUncertainties(mc_results, traj_best) - print('Computing covariance matrices...') + log.info('Computing covariance matrices...') # Calculate orbital and inital state vector covariance matrices (angles in degrees) traj_best.orbit_cov, traj_best.state_vect_cov = calcCovMatrices(mc_results) @@ -2553,7 +2557,7 @@ def __init__(self, jdt_ref, output_dir='.', max_toffset=None, meastype=4, verbos self.fixed_time_offsets[station] = float(offset) self.fixed_time_offsets_copy[station] = float(offset) - print("Fixed timing given:", self.fixed_time_offsets) + log.info(f"Fixed timing given: {self.fixed_time_offsets}") self.estimate_timing_vel = False @@ -2836,7 +2840,7 @@ def infillTrajectory(self, meas1, meas2, time_data, lat, lon, ele, station_id=No # Skip the observation if all points were ignored if ignore_list is not None: if np.all(ignore_list): - print('All points from station {:s} are ignored, not using this station in the solution!'.format(station_id)) + log.info(f'All points from station {station_id} are ignored, not using this station in the solution!') # Init a new structure which will contain the observed data from the given site @@ -3079,7 +3083,7 @@ def calcVelocity(self, state_vect, radiant_eci, observations, weights, calc_res= # RuntimeError: Optimal parameters not found: gtol=0.000000 is too small, func(x) is # orthogonal to the columns of the Jacobian to machine precision. 
except RuntimeError: - print("A velocity fit failed with a RuntimeError, skipping this iteration.") + log.info("A velocity fit failed with a RuntimeError, skipping this iteration.") popt = [np.nan] velocities_prev_point.append(popt[0]) @@ -3206,8 +3210,8 @@ def calcAvgVelocityAboveHt(self, observations, bottom_ht, weights): # If there are less than 4 points, don't estimate the initial velocity this way! if len(all_times) < 4: - print('!!! Error, there are less than 4 points for velocity estimation above the given height of {:.2f} km!'.format(bottom_ht/1000)) - print('Using automated velocity estimation with the sliding fit...') + log.info(f'!!! Error, there are less than 4 points for velocity estimation above the given height of {bottom_ht/1000:.2f} km!') + log.info('Using automated velocity estimation with the sliding fit...') return None, None # Fit a line through the time vs. state vector distance data @@ -3294,7 +3298,7 @@ def fitJacchiaLag(self, observations): obs.jacchia_fit = np.abs(obs.jacchia_fit) if self.verbose: - print('Jacchia fit params for station:', obs.station_id, ':', obs.jacchia_fit) + log.info(f'Jacchia fit params for station: {obs.station_id}: {obs.jacchia_fit}') # Get the time and lag points from all sites @@ -3398,9 +3402,7 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T if self.verbose: - print('Initial function evaluation:', timingResiduals(p0, observations, - self.stations_time_dict, - weights=weights)) + log.info(f'Initial function evaluation: {timingResiduals(p0, observations, self.stations_time_dict, weights=weights)}') # Set bounds for timing to +/- given maximum time offset bounds = [] @@ -3410,14 +3412,15 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T ### Try different methods of optimization until it is successful ## - # If there are more than 5 stations, use the advanced L-BFGS-B method by default - if len(self.observations) >= 5: + # If there are more than 
seven stations, use the advanced L-BFGS-B method by default + # - threshold increased from five as modern hardware can handle it fine + if len(self.observations) >= 7: methods = [None] opt_list = ['maxiter'] maxiter_list = [15000] else: - # If there are less than 5, try faster methods first + # If there are less than seven, try faster methods first methods = ['SLSQP', 'TNC', None] opt_list = ['maxiter','maxfun','maxiter'] maxiter_list = [1000, None, 15000] @@ -3437,21 +3440,21 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T self.timing_res = timing_mini.fun if self.verbose: - print('Successful timing optimization with', opt_method) - print("Final function evaluation:", timing_mini.fun) + log.info(f'Successful timing optimization with {opt_method}') + log.info(f'Final function evaluation: {timing_mini.fun}') break else: - print('Unsuccessful timing optimization with', opt_method) + log.info(f'Unsuccessful timing optimization with {opt_method}') ### ### if not timing_mini.success: - print('Timing difference and initial velocity minimization failed with the message:') - print(timing_mini.message) - print('Try increasing the range of time offsets!') + log.info('Timing difference and initial velocity minimization failed with the message:') + log.info(timing_mini.message) + log.info('Try increasing the range of time offsets!') v_init_mini = v_init velocity_fit = np.zeros(2) @@ -3491,7 +3494,7 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T time_diffs[i] = t_diff_copy if self.verbose: - print('STATION ' + str(obs.station_id) + ' TIME OFFSET = ' + str(t_diff_copy) + ' s (fixed offset applied)') + log.info(f'STATION {str(obs.station_id)} TIME OFFSET = {str(t_diff_copy)} s (fixed offset applied)') # Otherwise read the estimated offset else: @@ -3502,7 +3505,7 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T time_diffs[i] = t_diff if self.verbose: - print('STATION ' + 
str(obs.station_id) + ' TIME OFFSET = ' + str(t_diff) + ' s') + log.info(f'STATION {str(obs.station_id)} TIME OFFSET = {str(t_diff)} s') # Skip NaN and inf time offsets @@ -3649,7 +3652,7 @@ def estimateTimingAndVelocity(self, observations, weights, estimate_timing_vel=T if self.verbose: - print('ESTIMATED Vinit: {:.2f} +/- {:.2f} m/s'.format(v_init_mini, vel_stddev)) + log.info(f'ESTIMATED Vinit: {v_init_mini:.2f} +/- {vel_stddev:.2f} m/s') @@ -4147,10 +4150,10 @@ def toJson(self): """ Convert the Trajectory object to a JSON string. """ # Get a list of builtin types - try : - import __builtin__ + if sys.version_info.major < 3: + import __builtin__ builtin_types = [t for t in __builtin__.__dict__.itervalues() if isinstance(t, type)] - except: + else: # Python 3.x import builtins builtin_types = [getattr(builtins, d) for d in dir(builtins) if isinstance(getattr(builtins, d), type)] @@ -4742,7 +4745,7 @@ def _uncer(str_format, std_name, multi=1.0, deg=False): except Exception: pass if verbose: - print(out_str) + log.info(out_str) # Save the report to a file if save_results: @@ -5672,7 +5675,7 @@ def savePlots(self, output_dir, file_name, show_plots=True, ret_figs=False): plt.clf() plt.close() except: - print('OSM plots not available') + log.info('OSM plots not available') pass ###################################################################################################### @@ -6137,7 +6140,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ # Make sure there are at least 2 stations if numStationsNotIgnored(self.observations) < 2: - print('At least 2 sets of measurements from 2 stations are needed to estimate the trajectory!') + log.info('At least 2 sets of measurements from 2 stations are needed to estimate the trajectory!') return None @@ -6190,8 +6193,8 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ plane_intersection = PlaneIntersection(obs1, obs2) if self.verbose: - print('Convergence 
angle between stations', obs1.station_id, 'and', obs2.station_id) - print(' Q =', np.degrees(plane_intersection.conv_angle), 'deg') + log.info(f'Convergence angle between stations {obs1.station_id} and {obs2.station_id}') + log.info(f' Q = {np.degrees(plane_intersection.conv_angle)} deg') self.intersection_list.append(plane_intersection) @@ -6218,7 +6221,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ self.radiant_eq = eci2RaDec(self.avg_radiant) if self.verbose: - print('Multi-Track Weighted IP radiant:', np.degrees(self.radiant_eq)) + log.info(f'Multi-Track Weighted IP radiant: {np.degrees(self.radiant_eq)}') # Choose the intersection with the largest convergence angle as the best solution @@ -6227,7 +6230,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ self.best_conv_inter = max(self.intersection_list, key=attrgetter('conv_angle')) if self.verbose: - print('Best Convergence Angle IP radiant:', np.degrees(self.best_conv_inter.radiant_eq)) + log.info(f'Best Convergence Angle IP radiant: {np.degrees(self.best_conv_inter.radiant_eq)}') # Set the 3D position of the radiant line as the state vector, at the beginning point @@ -6266,18 +6269,18 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ # Print weights if self.verbose: - print('LoS statistical weights:') + log.info('LoS statistical weights:') for obs in self.observations: - print("{:>12s}, {:.3f}".format(obs.station_id, obs.weight)) + log.info(f"{obs.station_id:>12s}, {obs.weight:.3f}") ###################################################################################################### if self.verbose: - print('Intersecting planes solution:', self.state_vect) + log.info(f'Intersecting planes solution: {self.state_vect}') - print('Minimizing angle deviations...') + log.info('Minimizing angle deviations...') ### LEAST SQUARES SOLUTION ### @@ -6291,7 +6294,7 @@ def run(self, _rerun_timing=False, 
_rerun_bad_picks=False, _mc_run=False, _orig_ ) if self.verbose: - print('Initial angle sum:', angle_sum) + log.info(f'Initial angle sum: {angle_sum}') # Set the initial guess for the state vector and the radiant from the intersecting plane solution @@ -6313,8 +6316,8 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ # If the minimization diverged, bound the solution to +/-10% of state vector if np.max(np.abs(minimize_solution.x[:3] - self.state_vect)/self.state_vect) > 0.1: - print('WARNING! Unbounded state vector optimization failed!') - print('Trying bounded minimization to +/-10% of state vector position.') + log.info('WARNING! Unbounded state vector optimization failed!') + log.info('Trying bounded minimization to +/-10% of state vector position.') # Limit the minimization to 10% of original estimation in the state vector bounds = [] @@ -6325,19 +6328,19 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ for val in self.best_conv_inter.radiant_eci: bounds.append(sorted([0.75*val, 1.25*val])) - print('BOUNDS:', bounds) - print('p0:', p0) + log.info(f'BOUNDS: {bounds}') + log.info(f'p0: {p0}') minimize_solution = scipy.optimize.minimize(minimizeAngleCost, p0, args=(self.observations, \ weights, (_rerun_timing and self.gravity_correction), self.gravity_factor, self.v0z), bounds=bounds, method='SLSQP') if self.verbose: - print('Minimization info:') - print(' Message:', minimize_solution.message) - print(' Iterations:', minimize_solution.nit) - print(' Success:', minimize_solution.success) - print(' Final function value:', minimize_solution.fun) + log.info('Minimization info:') + log.info(f' Message: {minimize_solution.message}') + log.info(f' Iterations: {minimize_solution.nit}') + log.info(f' Success: {minimize_solution.success}') + log.info(f' Final function value: {minimize_solution.fun}') # Set the minimization status @@ -6360,13 +6363,13 @@ def run(self, _rerun_timing=False, 
_rerun_bad_picks=False, _mc_run=False, _orig_ self.radiant_eq_mini = eci2RaDec(self.radiant_eci_mini) if self.verbose: - print('Position and radiant LMS solution:') - print(' State vector:', self.state_vect_mini) - print(' Ra', np.degrees(self.radiant_eq_mini[0]), 'Dec:', np.degrees(self.radiant_eq_mini[1])) + log.info('Position and radiant LMS solution:') + log.info(f' State vector: {self.state_vect_mini}') + log.info(f' Ra {np.degrees(self.radiant_eq_mini[0])} Dec: {np.degrees(self.radiant_eq_mini[1])}') else: - print('Angle minimization failed altogether!') + log.info('Angle minimization failed altogether!') # If the solution did not succeed, set the values to intersecting plates solution self.radiant_eci_mini = self.best_conv_inter.radiant_eci @@ -6407,7 +6410,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ if self.verbose and self.estimate_timing_vel: - print('Estimating initial velocity and timing differences...') + log.info('Estimating initial velocity and timing differences...') @@ -6457,6 +6460,10 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ # Calculate lag self.calcLag(self.observations) + if self.verbose: + log.info('timing data entering optimisation') + for obs in self.observations: + log.info(f'{obs.station_id}: {obs.time_data}') # Estimate the timing difference between stations and the initial velocity and update the time ( @@ -6475,7 +6482,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ # If estimating the timing failed, skip any further steps if not self.timing_minimization_successful: - print('unable to minimise timing') + log.warning('unable to minimise timing') return None @@ -6522,10 +6529,10 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ self.observations = [] if self.verbose: - print() - print("---------------------------------------------------------------------------------") - print("Updating 
the solution after the timing estimation...") - print("---------------------------------------------------------------------------------") + log.info("") + log.info("---------------------------------------------------------------------------------") + log.info("Updating the solution after the timing estimation...") + log.info("---------------------------------------------------------------------------------") # Reinitialize the observations with proper timing for obs in temp_observations: @@ -6649,10 +6656,10 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ self.observations = [] if self.verbose: - print() - print("---------------------------------------------------------------------------------") - print("Updating the solution after rejecting", picks_rejected, "bad picks...") - print("---------------------------------------------------------------------------------") + log.info("") + log.info("---------------------------------------------------------------------------------") + log.info(f"Updating the solution after rejecting {picks_rejected} bad picks...") + log.info("---------------------------------------------------------------------------------") # Reinitialize the observations without the bad picks for obs in temp_observations: @@ -6665,7 +6672,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ else: if self.verbose: - print("All picks are within 3 sigma...") + log.info("All picks are within 3 sigma...") else: @@ -6697,7 +6704,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ reference_init=minimize_solution.success, v_init_stddev_direct=self.v_init_stddev) if self.verbose: - print(self.orbit.__repr__(v_init_ht=self.v_init_ht)) + log.info(f'{self.orbit.__repr__(v_init_ht=self.v_init_ht)}') ###################################################################################################### @@ -6767,7 +6774,7 @@ def run(self, _rerun_timing=False, 
_rerun_bad_picks=False, _mc_run=False, _orig_ if self.save_results: if self.verbose: - print('Saving Monte Carlo results...') + log.info('Saving Monte Carlo results...') # Save the picked trajectory structure with Monte Carlo points savePickle(traj_best, mc_output_dir, mc_file_name + '_trajectory.pickle') @@ -6787,7 +6794,7 @@ def run(self, _rerun_timing=False, _rerun_bad_picks=False, _mc_run=False, _orig_ if self.save_results: if self.verbose: - print('Saving results with original picks...') + log.info('Saving results with original picks...') # Save the picked trajectory structure with original points savePickle(self, self.output_dir, self.file_name + '_trajectory.pickle') diff --git a/wmpl/Utils/Math.py b/wmpl/Utils/Math.py index bb6069b5..d916bc28 100644 --- a/wmpl/Utils/Math.py +++ b/wmpl/Utils/Math.py @@ -1113,11 +1113,13 @@ def generateDatetimeBins(dt_beg, dt_end, bin_days=7, utc_hour_break=12, tzinfo=N else: bin_beg = dt_beg + datetime.timedelta(days=i * bin_days) - bin_beg = bin_beg.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) + if bin_days > 0.999: + bin_beg = bin_beg.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) # Generate the bin ending edge bin_end = bin_beg + datetime.timedelta(days=bin_days) - bin_end = bin_end.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) + if bin_days > 0.999: + bin_end = bin_end.replace(hour=int(utc_hour_break), minute=0, second=0, microsecond=0) # Check that the ending bin is not beyond the end dt end_reached = False diff --git a/wmpl/Utils/remoteDataHandling.py b/wmpl/Utils/remoteDataHandling.py index 59f59a19..3ff2e8f6 100644 --- a/wmpl/Utils/remoteDataHandling.py +++ b/wmpl/Utils/remoteDataHandling.py @@ -23,176 +23,375 @@ import os import paramiko import logging -import glob import shutil +import uuid +import time -from wmpl.Utils.OSTools import mkdirP -from wmpl.Utils.Pickling import loadPickle +from configparser import ConfigParser log = 
logging.getLogger("traj_correlator") -def collectRemoteTrajectories(remotehost, max_trajs, output_dir): - """ - Collect trajectory pickles from a remote server for local phase2 (monte-carlo) processing - NB: do NOT use os.path.join here, as it will break on Windows - """ +class RemoteNode(): + def __init__(self, nodename, dirpath, capacity, mode, active=False): + self.nodename = nodename + self.dirpath = dirpath + self.capacity = int(capacity) + self.mode = int(mode) + self.active = active - ftpcli, remote_dir, sshcli = getSFTPConnection(remotehost) - if ftpcli is None: - return - - remote_phase1_dir = os.path.join(remote_dir, 'phase1').replace('\\','/') - - log.info(f'Looking in {remote_phase1_dir} on remote host for up to {max_trajs} trajectories') - - try: - files = ftpcli.listdir(remote_phase1_dir) - files = [f for f in files if '.pickle' in f and 'processing' not in f] - files = files[:max_trajs] - if len(files) == 0: - log.info('no data available at this time') - ftpcli.close() - sshcli.close() - return +class RemoteDataHandler(): + def __init__(self, cfg_file): + self.initialised = False + if not os.path.isfile(cfg_file): + log.warning(f'unable to find {cfg_file}, not enabling remote processing') + return - for trajfile in files: - fullname = os.path.join(remote_phase1_dir, trajfile).replace('\\','/') - localname = os.path.join(output_dir, trajfile) - ftpcli.get(fullname, localname) - ftpcli.rename(fullname, f'{fullname}_processing') - - log.info(f'Obtained {len(files)} trajectories') - - - except Exception as e: - log.warning('Problem with download') - log.info(e) - - ftpcli.close() - sshcli.close() + self.nodenames = None + self.nodes = None + self.capacity = None - return + self.host = None + self.user = None + self.key = None - -def uploadTrajToRemote(remotehost, trajfile, output_dir): - """ - At the end of MC phase, upload the trajectory pickle and report to a remote host for integration - into the solved dataset - """ - - ftpcli, remote_dir, sshcli = 
getSFTPConnection(remotehost) - if ftpcli is None: + self.ssh_client = None + self.sftp_client = None + + cfg = ConfigParser() + cfg.read(cfg_file) + self.mode = cfg['mode']['mode'].lower() + if self.mode not in ['master', 'child']: + log.warning('remote cfg: mode must be master or child, not enabling remote processing') + return + if self.mode == 'master': + if 'children' not in cfg.sections(): + log.warning('remote cfg: children section missing, not enabling remote processing') + return + + # create a list of available nodes, disabling any that are malformed in the config file + self.nodenames = [k for k in cfg['children'].keys()] + self.nodes = [k.split(',') for k in cfg['children'].values()] + self.nodes = [RemoteNode(nn,x[0],x[1],x[2]) for nn,x in zip(self.nodenames,self.nodes) if len(x)==3] + self.nodes.append(RemoteNode('localhost', None, -1, -1)) + activenodes = [n.nodename for n in self.nodes if n.capacity!=0] + log.info(f' using nodes {activenodes}') + else: + # 'child' mode + if 'sftp' not in cfg.sections() or 'key' not in cfg['sftp'] or 'host' not in cfg['sftp'] or 'user' not in cfg['sftp']: + log.warning('remote cfg: sftp user, key or host missing, not enabling remote processing') + return + + self.host = cfg['sftp']['host'] + self.user = cfg['sftp']['user'] + self.key = os.path.normpath(os.path.expanduser(cfg['sftp']['key'])) + if 'port' not in cfg['sftp']: + self.port = 22 + else: + self.port = int(cfg['sftp']['port']) + + self.initialised = True return - - remote_phase2_dir = os.path.join(remote_dir, 'remoteuploads').replace('\\','/') - try: - ftpcli.mkdir(remote_phase2_dir) - except Exception: - pass - - localname = os.path.join(output_dir, trajfile) - remotename = os.path.join(remote_phase2_dir, trajfile).replace('\\','/') - ftpcli.put(localname, remotename) - localname = localname.replace('_trajectory.pickle', '_report.txt') - remotename = remotename.replace('_trajectory.pickle', '_report.txt') - if os.path.isfile(localname): - 
ftpcli.put(localname, remotename) - - ftpcli.close() - sshcli.close() - return - - -def moveRemoteTrajectories(output_dir): - """ - Move remotely processed pickle files to their target location in the trajectories area, - making sure we clean up any previously-calculated trajectory and temporary files - """ + def getSFTPConnection(self, verbose=False): + if not self.initialised: + return False + + if self.sftp_client: + return True + + log.info(f'Connecting to {self.host}:{self.port} as {self.user}....') - phase2_dir = os.path.join(output_dir, 'remoteuploads') + if not os.path.isfile(os.path.expanduser(self.key)): + log.warning(f'ssh keyfile {self.key} missing') + return False + + self.ssh_client = paramiko.SSHClient() + if verbose: + log.info('created paramiko ssh client....') + self.ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + pkey = paramiko.RSAKey.from_private_key_file(self.key) + try: + if verbose: + log.info('connecting....') + self.ssh_client.connect(hostname=self.host, username=self.user, port=self.port, + pkey=pkey, look_for_keys=False, timeout=10) + if verbose: + log.info('connected....') + self.sftp_client = self.ssh_client.open_sftp() + if verbose: + log.info('created client') + return True + + except Exception as e: - if os.path.isdir(phase2_dir): - log.info('Checking for remotely calculated trajectories...') - pickles = glob.glob1(phase2_dir, '*.pickle') + log.warning('sftp connection to remote host failed') + log.warning(e) + self.closeSFTPConnection() + return False + + def closeSFTPConnection(self): + if self.sftp_client: + self.sftp_client.close() + self.sftp_client = None + if self.ssh_client: + self.ssh_client.close() + self.ssh_client = None + return + + def putWithRetry(self, local_name, remname): + for i in range(10): + try: + self.sftp_client.put(local_name, remname) + return True + except Exception: + time.sleep(1) + log.warning(f'upload of {local_name} failed after 10 retries') + return False + + def 
getWithRetry(self, rem_name, local_name): + for i in range(10): + try: + self.sftp_client.get(rem_name, local_name) + return True + except Exception: + time.sleep(1) + log.warning(f'download of {rem_name} failed after 10 retries') + return False + + def renameWithRetry(self, rem_name, new_rem_name): + try: + # if stat succeeds, then the remote file was already moved to processed folder + # in which case we can simply remove the original remote file + self.sftp_client.stat(new_rem_name) + try: + self.sftp_client.remove(rem_name) + return True + except Exception: + log.warning(f'processed copy already exists but unable to remove {rem_name}') + return False + except Exception: + # if stat fails then the processed file doesn't exist so we can safely rename + for i in range(10): + try: + self.sftp_client.rename(rem_name, new_rem_name) + return True + except Exception: + time.sleep(1) + log.warning(f'rename of {rem_name} failed after 10 retries') + return False + + ######################################################## + # functions used by the client nodes + + def collectRemoteData(self, datatype, output_dir, verbose=False): + """ + Collect trajectory or candidate pickles from a remote server for local processing + + parameters: + datatype = 'candidates' or 'phase1' + output_dir = folder to put the pickles into generally dh.output_dir + """ + + if not self.initialised or not self.getSFTPConnection(verbose=verbose): + return False + + for pth in ['files', 'files/candidates', 'files/phase1', 'files/trajectories', + 'files/candidates/processed','files/phase1/processed']: + try: + self.sftp_client.mkdir(pth) + except Exception: + pass + self.sftp_client.chmod(pth, 0o777) + + try: + rem_dir = f'files/{datatype}' + files = self.sftp_client.listdir(rem_dir) + files = [f for f in files if '.pickle' in f and 'processing' not in f] + if len(files) == 0: + log.info('no data available at this time') + self.closeSFTPConnection() + return False + + local_dir = 
os.path.join(output_dir, datatype) + if not os.path.isdir(local_dir): + os.makedirs(local_dir, exist_ok=True) + num_received = 0 + for trajfile in files: + fullname = f'{rem_dir}/{trajfile}' + processed_name = f'{rem_dir}/processed/{trajfile}' + localname = os.path.join(local_dir, trajfile) + if verbose: + log.info(f'downloading {fullname} to {localname}') + + res = self.getWithRetry(fullname, localname) + if res: + num_received += 1 + self.renameWithRetry(fullname, processed_name) + + log.info(f'Obtained {num_received} {"trajectories" if datatype=="phase1" else "candidates"}') + + except Exception as e: + log.warning('Problem with download') + log.info(e) + + self.closeSFTPConnection() + return True + + def uploadToMaster(self, source_dir, verbose=False): + """ + upload the trajectory pickle and report to a remote host for integration + into the solved dataset + + parameters: + source_dir = root folder containing data, generally dh.output_dir + """ + + if not self.initialised or not self.getSFTPConnection(verbose=verbose): + return + + # flag to indicate success. 
Any upload failures will set this to False + success_flag = True + + for pth in ['files', 'files/candidates', 'files/phase1', 'files/trajectories', + 'files/candidates/processed','files/phase1/processed']: + try: + self.sftp_client.mkdir(pth) + self.sftp_client.chmod(pth, 0o777) + except Exception: + pass + + phase1_dir = os.path.join(source_dir, 'phase1') + if os.path.isdir(phase1_dir): + + # upload any phase1 trajectories + i=0 + proc_dir = os.path.join(phase1_dir, 'processed') + os.makedirs(proc_dir, exist_ok=True) + + for fil in os.listdir(phase1_dir): + local_name = os.path.join(phase1_dir, fil) + if os.path.isdir(local_name): + continue + remname = f'files/phase1/{fil}' + + if verbose: + log.info(f'uploading {local_name} to {remname}') + + # If the upload is successful, move the local file to 'processed' + # Otherwise set the success flag to false + + if self.putWithRetry(local_name, remname): + + if os.path.isfile(os.path.join(proc_dir, fil)): + os.remove(os.path.join(proc_dir, fil)) + shutil.move(local_name, proc_dir) + i += 1 + + else: + success_flag = False + + if i > 0: + log.info(f'uploaded {i} phase1 solutions') + + # now upload any data in the 'trajectories' folder, flattening it to make it simpler to handle + i=0 + + traj_dir = os.path.join(source_dir, 'trajectories') + if os.path.isdir(traj_dir): + for (dirpath, dirnames, filenames) in os.walk(traj_dir): + if len(filenames) > 0: - for pick in pickles: - traj = loadPickle(phase2_dir, pick) - phase1_name = traj.pre_mc_longname - traj_dir = f'{output_dir}/trajectories/{phase1_name[:4]}/{phase1_name[:6]}/{phase1_name[:8]}/{phase1_name}' - if os.path.isdir(traj_dir): - shutil.rmtree(traj_dir) - processed_traj_file = os.path.join(output_dir, 'phase1', phase1_name + '_trajectory.pickle_processing') + # flag to indicate whether this specific trajectory upload succeeded + traj_success_flag = True - if os.path.isfile(processed_traj_file): - log.info(f' Moving {phase1_name} to processed folder...') - dst = 
os.path.join(output_dir, 'phase1', 'processed', phase1_name + '_trajectory.pickle') - shutil.copyfile(processed_traj_file, dst) - os.remove(processed_traj_file) + rem_path = f'files/trajectories/{os.path.basename(dirpath)}' + try: + self.sftp_client.mkdir(rem_path) + self.sftp_client.chmod(rem_path, 0o777) + except Exception: + pass - phase2_name = traj.longname - traj_dir = f'{output_dir}/trajectories/{phase2_name[:4]}/{phase2_name[:6]}/{phase2_name[:8]}/{phase2_name}' - mkdirP(traj_dir) - log.info(f' Moving {phase2_name} to {traj_dir}...') - src = os.path.join(phase2_dir, pick) - dst = os.path.join(traj_dir, pick[:15]+'_trajectory.pickle') + # upload all files in the folder. If any upload fails, set the traj success flag to false + for fil in filenames: - shutil.copyfile(src, dst) - os.remove(src) + local_name = os.path.join(dirpath, fil) + rem_file = f'{rem_path}/{fil}' - report_file = src.replace('_trajectory.pickle','_report.txt') - if os.path.isfile(report_file): - dst = dst.replace('_trajectory.pickle','_report.txt') - shutil.copyfile(report_file, dst) - os.remove(report_file) + if verbose: + log.info(f'uploading {local_name} to {rem_file}') - log.info(f'Moved {len(pickles)} trajectories.') + if self.putWithRetry(local_name, rem_file): + if 'pickle' in local_name: + i += 1 + else: + traj_success_flag = False - return + # if this trajectory uploaded, remove the local files + # Otherwise set the overall status to False + if traj_success_flag: + shutil.rmtree(dirpath, ignore_errors=True) + else: + success_flag = traj_success_flag + + if i > 0: + log.info(f'uploaded {i} trajectories') -def getSFTPConnection(remotehost): + # if everything uploaded we can remove the entire 'trajectories' folder + if success_flag: + shutil.rmtree(traj_dir, ignore_errors=True) - hostdets = remotehost.split(':') + # finally the databases - upload these with a random name for uniqueness at the server side + # Again, if any upload fails mark the status False + uuid_str = 
str(uuid.uuid4()) - if len(hostdets) < 2 or '@' not in hostdets[0]: - log.warning(f'{remotehost} malformed, should be user@host:port:/path/to/dataroot') - return None, None, None - - if len(hostdets) == 3: - port = int(hostdets[1]) - remote_data_dir = hostdets[2] + db_success_flag = True + for fname in ['observations', 'trajectories']: + local_name = os.path.join(source_dir, f'{fname}.db') - else: - port = 22 - remote_data_dir = hostdets[1] + if os.path.isfile(local_name): + rem_file = f'files/{fname}-{uuid_str}.db' - user,host = hostdets[0].split('@') - log.info(f'Connecting to {host}....') + if verbose: + log.info(f'uploading {local_name} to {rem_file}') + if not self.putWithRetry(local_name, rem_file): + db_success_flag = False - ssh_client = paramiko.SSHClient() - ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy()) + if db_success_flag: + log.info('uploaded databases') + else: + log.warning('unable to upload at least one of the databases, will retry in next loop') + success_flag = db_success_flag + self.closeSFTPConnection() - if not os.path.isfile(os.path.expanduser('~/.ssh/trajsolver')): - log.warning('ssh keyfile ~/.ssh/trajsolver missing') - ssh_client.close() - return None, None, None + return success_flag - pkey = paramiko.RSAKey.from_private_key_file(os.path.expanduser('~/.ssh/trajsolver')) - try: - ssh_client.connect(hostname=host, username=user, port=port, pkey=pkey, look_for_keys=False) - ftp_client = ssh_client.open_sftp() - return ftp_client, remote_data_dir, ssh_client - - except Exception as e: - - log.warning('sftp connection to remote host failed') - log.warning(e) - ssh_client.close() - - return None, None, None + def setStopFlag(self, verbose=False): + if not self.initialised or not self.getSFTPConnection(): + return + try: + readyfile = os.path.join(os.getenv('TMP', default='/tmp'),'stop') + open(readyfile,'w').write('stop') + self.sftp_client.put(readyfile, 'files/stop') + except Exception: + log.warning('unable to set stop 
flag, master will not continue to assign data')
+        time.sleep(2)
+        self.closeSFTPConnection()
+        log.info('set stop flag')
+        return
+
+    def clearStopFlag(self, verbose=False):
+        if not self.initialised or not self.getSFTPConnection():
+            return
+        try:
+            self.sftp_client.remove('files/stop')
+            log.info('removed stop flag')
+        except:
+            pass
+        self.closeSFTPConnection()
+        return
diff --git a/wmpl_remote.cfg.sample b/wmpl_remote.cfg.sample
new file mode 100644
index 00000000..da8c6cc4
--- /dev/null
+++ b/wmpl_remote.cfg.sample
@@ -0,0 +1,31 @@
+# Configuration file for WMPL distributed processing.
+# Rename to `wmpl_remote.cfg` and place in the data directory.
+
+[mode]
+# if mode is 'master' then [children] lists the child processing nodes and capacity of each
+# if mode is 'child' then the [sftp] section says how the child will connect to the parent.
+# Each node must have its own copy of this file and each child must have its own credentials
+
+mode = child
+
+# details of the child nodes. Each line must have three values separated by commas
+# * folder that each node will use. These must map to each node's sftp user's homedir.
+# * capacity - number of candidates or trajectories to solve. Allows loadbalancing
+# * operation mode - 0: disabled, 1: solving candidates, 2: monte-carlo phase
+# the node names can be used to differentiate children
+
+[children]
+node1 = c:/temp/wmpl/node1,200,1
+node2 = c:/temp/wmpl/node2,400,1
+node3 = c:/temp/wmpl/node3,0,2
+node4 = c:/temp/wmpl/node4,0,2
+node5 =
+
+
+[sftp]
+# sftp login details for client to connect to the parent when running in 'child' mode
+# if the port is nonstandard (ie not 22) then uncomment and set as required
+host = testserver.somedomain.com
+user = node1
+key = ~/.ssh/somekey
+#port=2222
\ No newline at end of file
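A minimal sketch of how the sample `wmpl_remote.cfg` above might be read, assuming Python's standard `configparser`. The helper name `parse_remote_cfg` and the layout of the returned dictionaries are hypothetical and are not taken from the WMPL code. Entries in `[children]` without three comma-separated values (such as the empty `node5 =`) are skipped, and the port falls back to 22 while the `port` line stays commented out.

```python
import configparser

# Trimmed copy of the sample configuration shown in the diff above
SAMPLE = """
[mode]
mode = child

[children]
node1 = c:/temp/wmpl/node1,200,1
node2 = c:/temp/wmpl/node2,400,1
node3 = c:/temp/wmpl/node3,0,2
node5 =

[sftp]
host = testserver.somedomain.com
user = node1
key = ~/.ssh/somekey
#port=2222
"""


def parse_remote_cfg(text):
    """Hypothetical helper: return (mode, children, sftp) parsed from config text."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)

    # 'master' or 'child'; the default used here is an assumption
    mode = cfg.get('mode', 'mode', fallback='child').strip().lower()

    children = {}
    if cfg.has_section('children'):
        for name, value in cfg.items('children'):
            parts = [p.strip() for p in value.split(',')]
            if len(parts) != 3:
                # skip empty or malformed entries such as "node5 ="
                continue
            folder, capacity, opmode = parts
            children[name] = {
                'folder': folder,
                'capacity': int(capacity),   # number of items the node may take
                'mode': int(opmode),         # 0 disabled, 1 candidates, 2 monte-carlo
            }

    sftp = {}
    if cfg.has_section('sftp'):
        sftp = {
            'host': cfg.get('sftp', 'host', fallback=None),
            'user': cfg.get('sftp', 'user', fallback=None),
            'key': cfg.get('sftp', 'key', fallback=None),
            'port': cfg.getint('sftp', 'port', fallback=22),
        }
    return mode, children, sftp


mode, children, sftp = parse_remote_cfg(SAMPLE)
```

Returning plain dictionaries keeps the sketch independent of any WMPL class; a real handler would probably also expand `~` in the key path with `os.path.expanduser` before using it.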