From 54faaefa03302814251ae130997ce297de97f8b0 Mon Sep 17 00:00:00 2001
From: mano8
Date: Sun, 29 Dec 2024 15:50:17 +0100
Subject: [PATCH 1/2] Improve Readme

---
 README.md | 143 +++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 93 insertions(+), 50 deletions(-)

diff --git a/README.md b/README.md
index 8cff82b..98acd4f 100644
--- a/README.md
+++ b/README.md
@@ -40,20 +40,36 @@ The `emon_fina` module facilitates the analysis and processing of time-series da
 
 #### Features
 
- - Data Reading: Efficiently read data from PhpFina file formats.
- - Time-Series Analysis: Compute daily statistics such as min, max, mean, and more.
- - Filtering: Validate and filter data based on custom thresholds.
- - Utilities: Timestamp manipulation and interval computation tools.
+- Data Reading: Efficiently read data from PhpFina file formats.
+- Time-Series Analysis: Compute daily statistics such as min, max, mean, and more.
+- Filtering: Validate and filter data based on custom thresholds.
+- Utilities: Timestamp manipulation and interval computation tools.
 
-#### Usage Example:
+#### PhpFina File Structure
 
-The examples below demonstrate how to retrieve and analyze data from PhpFina timeseries .dat files. For additional examples, refer to the [`emon_fina` Jupiter NoteBook](https://github.com/vemonitor/emon_tools/blob/main/notebook/emon_fina.ipynb).
+PhpFina is a lightweight binary file format used by EmonCMS for storing time-series data. Each PhpFina feed consists of two files:
+
+1. `.dat` File: Contains the actual time-series data values, stored as binary floats. Each value corresponds to a specific timestamp based on the feed's start time and interval.
+
+2. `.meta` File: Contains metadata about the feed. Its structure includes:
+   - **Offset 0-7**: Reserved for future use or ignored by the library.
+   - **Offset 8-15**: Contains two 4-byte little-endian integers:
+     - `interval`: The time interval (in seconds) between consecutive data points.
+     - `start_time`: The Unix timestamp of the first data point.
+   - Computed Values:
+     - `npoints`: The total number of data points, calculated as `data_size // 4` (where each data point is 4 bytes).
+     - `end_time`: Computed as `start_time + npoints * interval - interval`.
+
+#### Usage Examples:
+
+The examples below demonstrate how to retrieve and analyze data from PhpFina time-series `.dat` files. For additional examples, refer to the [`emon_fina`](https://github.com/vemonitor/emon_tools/blob/main/notebook/emon_fina.ipynb) Jupyter Notebook.
 
-Every PhpFina timeseries feed engine is acompagned with `.meta` file who contain meta values of actual status of `.dat` file. Meta data is readed on initialize objects
 
 ##### Retrieving data
 
-`FinaData` initialization:
+###### 1. Initialize `FinaData`:
+
+This initializes the `FinaData` class, allowing you to interact with the time-series data files:
 
 ```python
 from emon_tools.emon_fina import FinaData
@@ -64,12 +80,26 @@ fdf = FinaData(
 )
 ```
 
-Values output can be set as:
-> In above example we get 8 days (8 * 24 * 3600) from meta `time_start` value.
+Access metadata of the `.meta` file:
+
+```python
+print(fdf.meta)
+# Example Output:
+# {
+#     "interval": 10,
+#     "start_time": 1575981140,
+#     "npoints": 4551863,
+#     "end_time": 1621499760
+# }
+```
+
+###### 2. Retrieve Values:
+
+Retrieve specific ranges of data values from the `.dat` file based on time intervals or date ranges.
 
-1. 1D numpy array by timestamp
+1. 1D NumPy Array by time window:
 
-Retrieve data values from the Fina data file for a specified time window.
+Extract values starting from a specific timestamp and within a time window:
 
 ```python
 values = fdf.get_fina_values(
@@ -79,9 +109,9 @@ values = fdf.get_fina_values(
 )
 ```
 
-2. 1D numpy array by srting datetime
+2. 1D NumPy Array by datetime interval:
 
-Retrieve values from the Fina data file based on a specified date range.
+Extract values within a specific date range:
 
 ```python
 ts = fdf.get_fina_values_by_date(
@@ -91,9 +121,9 @@ ts = fdf.get_fina_values_by_date(
 )
 ```
 
-3. 2D TimeSeries numpy array by timestamp
+3. 2D Time-Series NumPy Array by time window:
 
-Retrieve a 2D time series array of timestamps and values from the Fina data file.
+Retrieve a 2D array containing timestamps and corresponding values:
 
 ```python
 ts = fdf.get_fina_time_series(
@@ -103,11 +133,9 @@ ts = fdf.get_fina_time_series(
 )
 ```
 
+4. 2D Time-Series NumPy Array by datetime interval:
 
-
-
-4. 2D TimeSeries numpy array by srting datetime
-
-Retrieve a 2D time series array of timestamps and values for a specific date range.
+Retrieve a 2D array of timestamps and values for a specific date range:
 
 ```python
 ts = fdf.get_fina_time_series_by_date(
@@ -117,7 +145,9 @@ ts = fdf.get_fina_time_series_by_date(
 )
 ```
 
-5. pandas DataFrame TimeSeries
+5. Pandas DataFrame Time-Series:
+
+Convert time-series data into a Pandas DataFrame for easier manipulation:
 
 `FinaDataFrame` initialization:
 
@@ -128,31 +158,38 @@ fdf = FinaDataFrame(
     feed_id=1,
     data_dir="/path/to/phpfina/files
 )
-```
 
-Retrieve time series data within a specified time window
-and return it as a Pandas DataFrame.
-
-```python
 ts = fdf.get_fina_df_time_series(
-    start=fr.meta.start_time,
+    start=fdf.meta.start_time,
     step=10,
     window=8 * 24 * 3600
 )
-```
 
-Retrieve time series data by specifying a date range and convert it to a Pandas DataFrame.
+# Or by date range
 
-```python
 ts = fdf.get_fina_time_series_by_date(
     start_date='2019-12-12 00:00:00',
     end_date='2019-12-13 00:00:00',
     step=10
 )
 ```
 
-And optionaly ploted dirrectly.
-`FinaDataFrame` initialization:
+Access metadata of the `.meta` file:
+
+```python
+print(fdf.meta)
+# Example Output:
+# {
+#     "interval": 10,
+#     "start_time": 1575981140,
+#     "npoints": 4551863,
+#     "end_time": 1621499760
+# }
+```
+
+###### 3. Plotting Data:
+
+Visualize the retrieved time-series data:
 
 ```python
 from emon_tools.fina_plot import PlotData
@@ -162,7 +199,9 @@ PlotData.plot(data=ts)
 
 ##### Compute Daily Statistics
 
-`FinaDataFrame` initialization:
+###### 1. Initialize `FinaStats`:
+
+This initializes the `FinaStats` class for statistical computations:
 
 ```python
 from emon_tools.emon_fina import FinaStats
@@ -174,20 +213,22 @@ stats = FinaStats(
 )
 ```
 
-Once initialized, you can access the metadata of the PhpFina `.meta` file. For example, a file with `feed_id=1` might return:
+Access metadata of the `.meta` file:
 
 ```python
-stats.meta
-    {
-        "interval": 10,
-        "start_time": 1575981140,
-        "npoints": 4551863,
-        "end_time": 1621499760
-    }
+print(stats.meta)
+# Example Output:
+# {
+#     "interval": 10,
+#     "start_time": 1575981140,
+#     "npoints": 4551863,
+#     "end_time": 1621499760
+# }
 ```
 
-On grabing phpfina timeseries feed engine, missed data points are set as Nan values,
-We can get file integrity daily statistics to compute real and total values of phpfina `.dat` file
+###### 2. Integrity Statistics:
+
+Analyze the integrity of the `.dat` file by computing the presence of valid and missing data:
 
 ```python
 # Compute daily statistics
@@ -196,7 +237,9 @@ daily_stats = stats.get_stats(stats_type=StatsType.INTEGRITY)
 
 
 
-Or we can get daily values statistics from your phpfina timeseries feed engine file
+###### 3. Value Statistics:
+
+Compute daily statistics (e.g., min, max, mean) for data values:
 
 ```python
 # Compute daily statistics
@@ -205,8 +248,9 @@ daily_stats = stats.get_stats(stats_type=StatsType.VALUES)
 
 
 
-Phpfina timeseries feed engine file can contain bad data, in this case we can limit values from statistics without bad values.
-Here statistics are calculated only with values between -50 and 50.
+###### 4. Filtered Value Statistics:
+
+Restrict statistical calculations to a specific value range:
 
 ```python
 # Compute daily statistics
@@ -219,8 +263,9 @@ daily_stats = stats.get_stats(
 
 
 
-You can limit daily statistics from desired window, by setting `start_time` and/or `steps_window` properties.
-In above example we get daily stats values for 8 days from timestamp value 1575981140
+###### 5. Windowed Statistics:
+Limit statistics to a specific time window:
+
 ```python
 # Compute daily statistics
 daily_stats = stats.get_stats(
@@ -249,5 +294,3 @@ Contributions are welcome! To contribute:
 ## License
 
 This project is licensed under the MIT License. See LICENSE for more details.
-
-....
\ No newline at end of file

From 998ba57ca8646d8c73db27e671e379090e8eed54 Mon Sep 17 00:00:00 2001
From: mano8
Date: Sun, 29 Dec 2024 16:01:35 +0100
Subject: [PATCH 2/2] Improve code style

---
 emon_tools/fina_plot.py        | 8 +++++++-
 emon_tools/fina_time_series.py | 3 ++-
 tests/test_fina_reader.py      | 4 +++-
 tests/test_fina_stats.py       | 2 +-
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/emon_tools/fina_plot.py b/emon_tools/fina_plot.py
index fee48d6..dc6df3b 100644
--- a/emon_tools/fina_plot.py
+++ b/emon_tools/fina_plot.py
@@ -127,7 +127,13 @@ def plot_values(data: Union[np.ndarray, pd.DataFrame]):
         plt.subplot(1, 1, 1)
         if isinstance(data, np.ndarray):
             plt.plot(data[:, 0], data[:, 1], label="Fina Stats Values")
-            plt.fill_between(data[:, 0], data[:, 1], data[:, 3], alpha=0.2, label='Daily values min-max range')
+            plt.fill_between(
+                data[:, 0],
+                data[:, 1],
+                data[:, 3],
+                alpha=0.2,
+                label='Daily values min-max range'
+            )
             plt.plot(data[:, 0], data[:, 2], label='Daily mean values', lw=2)
         else:
             plt.fill_between(
diff --git a/emon_tools/fina_time_series.py b/emon_tools/fina_time_series.py
index e5c9cdb..745b128 100644
--- a/emon_tools/fina_time_series.py
+++ b/emon_tools/fina_time_series.py
@@ -5,7 +5,8 @@ classes to work with Pandas DataFrames for Fina data results.
 
 
 Classes:
-    - FinaDataFrame: Extends FinaData with methods to convert time-series data into Pandas DataFrames.
+    - FinaDataFrame:
+        Extends FinaData with methods to convert time-series data into Pandas DataFrames.
     - FinaDfStats: Extends FinaStats to compute and return statistics as Pandas DataFrames.
 
 Dependencies:
diff --git a/tests/test_fina_reader.py b/tests/test_fina_reader.py
index 34b8679..2f43f65 100644
--- a/tests/test_fina_reader.py
+++ b/tests/test_fina_reader.py
@@ -301,5 +301,7 @@ def test_sanitize_path_outside_directory(self, valid_fina_reader):
         Test _sanitize_path with a filename that attempts to access outside the allowed directory.
         """
         filename = "../outside_file.dat"
-        with pytest.raises(ValueError, match="Attempt to access files outside the allowed directory."):
+        with pytest.raises(
+                ValueError,
+                match="Attempt to access files outside the allowed directory."):
             valid_fina_reader._sanitize_path(filename)
diff --git a/tests/test_fina_stats.py b/tests/test_fina_stats.py
index 0796ee9..7f46200 100644
--- a/tests/test_fina_stats.py
+++ b/tests/test_fina_stats.py
@@ -17,7 +17,7 @@ def mock_reader(self):
         """Fixture to mock the FinaReader."""
         mock_reader = MagicMock()
         # Two days of data at 10-second intervals
-        npoints = (3600 * 24 * 2)
+        npoints = 3600 * 24 * 2
         start_time = Utils.get_start_day(1575981140)
         mock_reader.read_meta.return_value = MetaData(
             start_time=start_time,
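
Note on the `.meta` layout documented in PATCH 1/2: the sketch below illustrates how the four metadata fields could be derived, assuming the layout is exactly as that README hunk states (bytes 0-7 skipped, two 4-byte little-endian unsigned integers at offsets 8-15, and `npoints` taken from the `.dat` file size). The helper name `read_fina_meta` and its two path arguments are hypothetical; this is not the `emon_tools` implementation.

```python
import os
import struct


def read_fina_meta(meta_path: str, dat_path: str) -> dict:
    """Hypothetical helper: derive PhpFina metadata from a .meta/.dat pair."""
    with open(meta_path, "rb") as handle:
        header = handle.read(16)
    if len(header) < 16:
        raise ValueError("meta file is shorter than 16 bytes")

    # Offsets 0-7 are skipped; offsets 8-15 hold interval, then start_time.
    interval, start_time = struct.unpack("<2I", header[8:16])

    # Each data point is assumed to be a 4-byte float in the .dat file.
    npoints = os.path.getsize(dat_path) // 4

    # Timestamp of the last stored point, as given in the README formula.
    end_time = start_time + npoints * interval - interval

    return {
        "interval": interval,
        "start_time": start_time,
        "npoints": npoints,
        "end_time": end_time,
    }
```

With the README's example values (`interval=10`, `start_time=1575981140`, `npoints=4551863`), the same formula gives `end_time = 1621499760`, which matches the sample output shown in the patch.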