emon_fina

Eli Serra edited this page Mar 24, 2025 · 5 revisions

The emon_fina module facilitates the analysis and processing of time-series data, particularly data stored in the PhpFina file format.

Installation

The emon_fina module is part of the emon-tools package, which can be installed as a whole or with only the dependencies a given use case needs. The available installation options are:

  1. Install the Complete emon-tools Package

To install the entire emon-tools package along with all dependencies, run the following command:

pip install emon-tools["all"]

Included Dependencies:

  • aiohttp
  • numpy
  • pandas
  • matplotlib

Tip: This is the best option if you plan to use all features, including data manipulation and visualization.

  2. Default Installation for emon_fina

If you only need the basic functionalities of emon_fina, you can install it with its minimal dependencies:

pip install emon-tools["fina"]

Included Dependencies:

  • numpy

Tip: Choose this option if you need a lightweight installation with basic numeric computation capabilities.

  3. For DataFrame and Time-Series Support

To enable advanced time-series analysis with pandas DataFrames, use the following command:

pip install emon-tools["fina, time_series"]

Included Dependencies:

  • numpy
  • pandas

Tip: Use this option if you work extensively with pandas for data manipulation and analysis.

  4. For Plotting and Visualization

If you need to visualize time-series data using matplotlib, install the following:

pip install emon-tools["fina, plot"]

Included Dependencies:

  • numpy
  • pandas
  • matplotlib

Features

  • Data Reading: Efficiently read data from PhpFina file formats.
  • Time-Series Analysis: Compute daily statistics such as min, max, mean, and more.
  • Filtering: Validate and filter data based on custom thresholds.
  • Utilities: Timestamp manipulation and interval computation tools.

PhpFina File Structure

PhpFina is a lightweight binary file format used by EmonCMS for storing time-series data. Each PhpFina feed consists of two files:

  1. .dat File: Contains the actual time-series data values, stored as binary floats. Each value corresponds to a specific timestamp based on the feed's start time and interval.

  2. .meta File: Contains metadata about the feed. Its structure includes:

  • Offset 0-7: Reserved for future use or ignored by the library.
  • Offset 8-15: Contains two 4-byte little-endian integers:
    • interval: The time interval (in seconds) between consecutive data points.
    • start_time: The Unix timestamp of the first data point.
  • Computed Values:
    • npoints: The total number of data points, calculated as data_size // 4 (where each data point is 4 bytes).
    • end_time: Computed as start_time + npoints * interval - interval.
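The layout described above can be read with the standard library alone. The following is a minimal sketch, not the emon_fina implementation: the function name read_fina_meta is hypothetical, and it assumes that the two integers are unsigned and that .dat values are 32-bit little-endian floats. It builds a small synthetic feed in a temporary directory so that it runs without real data.

```python
import os
import struct
import tempfile


def read_fina_meta(meta_path, dat_path):
    """Read interval and start_time, then derive npoints and end_time."""
    with open(meta_path, "rb") as fh:
        fh.seek(8)  # skip offsets 0-7, which are not used here
        # Assumption: two unsigned 4-byte little-endian integers.
        interval, start_time = struct.unpack("<2I", fh.read(8))
    size = os.path.getsize(dat_path)
    npoints = size // 4  # each data point is 4 bytes
    end_time = start_time + npoints * interval - interval
    return {
        "interval": interval,
        "start_time": start_time,
        "npoints": npoints,
        "end_time": end_time,
        "size": size,
    }


# Synthetic feed: 8 ignored bytes, then interval=10 and a start timestamp.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = os.path.join(tmp, "1.meta")
    dat_path = os.path.join(tmp, "1.dat")
    with open(meta_path, "wb") as fh:
        fh.write(struct.pack("<2I", 0, 0))
        fh.write(struct.pack("<2I", 10, 1575981140))
    with open(dat_path, "wb") as fh:
        # Assumption: values are stored as 32-bit little-endian floats.
        fh.write(struct.pack("<3f", 1.0, 2.0, 3.0))
    meta = read_fina_meta(meta_path, dat_path)

print(meta)
```

With three data points and a 10-second interval, the last point falls 20 seconds after start_time, which is what the end_time formula yields.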

Usage Examples:

The examples below demonstrate how to retrieve and analyze data from PhpFina time-series .dat files. For additional examples, refer to the emon_fina Jupyter Notebook.

Retrieving data

1. Initialize FinaData:

This initializes the FinaData class, allowing you to interact with the time-series data files:

from emon_tools.emon_fina import FinaData

fdf = FinaData(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

Access metadata of the .meta file:

print(fdf.meta)
# Example Output:
# {
#     "interval": 10,
#     "start_time": 1575981140,
#     "npoints": 4551863,
#     "end_time": 1621499760,
#     "size": 18207452
# }
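Because points are evenly spaced, the metadata is enough to convert between a point's position in the file and its timestamp. The sketch below illustrates that arithmetic with the example values shown above; timestamp_at and index_at are hypothetical helper names, not part of the emon_fina API.

```python
# Metadata values copied from the example output above.
meta = {"interval": 10, "start_time": 1575981140, "npoints": 4551863}


def timestamp_at(index, meta):
    """Return the Unix timestamp of the data point at a given position."""
    return meta["start_time"] + index * meta["interval"]


def index_at(timestamp, meta):
    """Return the position of the data point covering a given timestamp."""
    index = (timestamp - meta["start_time"]) // meta["interval"]
    if not 0 <= index < meta["npoints"]:
        raise IndexError("timestamp is outside the feed")
    return index


last_timestamp = timestamp_at(meta["npoints"] - 1, meta)
print(last_timestamp)              # 1621499760, the end_time of the example
print(index_at(1576108800, meta))  # 12766
```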

2. Retrieve Values:

Extract data within defined time ranges from .dat files, accommodating various base recording intervals and output options. The format of the retrieved data depends on three factors:

  1. The base interval at which data is recorded in the file
  2. The interval selected for data retrieval
  3. The chosen output type

These factors determine whether the data is presented in its raw form, aggregated, or processed in other ways to meet specific analysis needs.
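To make the relationship between the base interval and the retrieval interval concrete, here is one possible way to aggregate raw points into larger steps. This is an illustrative sketch only: it assumes averaging that skips missing (NaN) points, which may differ from what emon_fina does internally, and average_by_interval is a hypothetical name.

```python
import math


def average_by_interval(values, base_interval, time_interval):
    """Average consecutive raw points into steps of time_interval seconds."""
    if time_interval % base_interval != 0:
        raise ValueError("time_interval must be a multiple of base_interval")
    step = time_interval // base_interval  # raw points per output point
    averaged = []
    for start in range(0, len(values), step):
        # Keep only the valid points of the current step.
        chunk = [v for v in values[start:start + step] if not math.isnan(v)]
        averaged.append(sum(chunk) / len(chunk) if chunk else math.nan)
    return averaged


# Six points recorded every 10 s, retrieved at a 20 s interval.
raw = [1.0, 3.0, math.nan, 5.0, math.nan, math.nan]
averaged = average_by_interval(raw, base_interval=10, time_interval=20)
print(averaged)  # [2.0, 5.0, nan]
```

When the retrieval interval equals the base interval, step is 1 and the values come back unchanged, which corresponds to the raw form mentioned above.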

  1. 1D NumPy Array by time window:

Extract values starting from a specific timestamp and within a time window:

from emon_tools.emon_fina import FinaData
from emon_tools.emon_fina.fina_models import FinaByTimeParamsModel
from emon_tools.emon_fina.emon_fina import OutputType

fdf = FinaData(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

values = fdf.get_fina_values(
    props=FinaByTimeParamsModel(
        start_time=fdf.meta.start_time,
        time_window=8 * 24 * 3600,
        time_interval=10,
        output_type=OutputType.VALUES
    )
)

  2. 1D NumPy Array by datetime interval:

Extract values within a specific date range:

from emon_tools.emon_fina import FinaData
from emon_tools.emon_fina.fina_models import FinaByDateRangeParamsModel
from emon_tools.emon_fina.emon_fina import OutputType

fdf = FinaData(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

values = fdf.get_data_by_date_range(
    props=FinaByDateRangeParamsModel(
        start_date='2019-12-12 00:00:00',
        end_date='2019-12-13 00:00:00',
        time_interval=3600,
        output_type=OutputType.VALUES
    )
)
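
A date range is equivalent to a start timestamp plus a time window. The sketch below shows that conversion for the dates used above, assuming the strings are interpreted as UTC; how emon_fina handles time zones is not stated here, and to_utc_timestamp is a hypothetical helper.

```python
from datetime import datetime, timezone


def to_utc_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' as UTC and return a Unix timestamp."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


start_time = to_utc_timestamp("2019-12-12 00:00:00")
end_time = to_utc_timestamp("2019-12-13 00:00:00")
time_window = end_time - start_time
time_interval = 3600

print(start_time)                    # 1576108800
print(time_window)                   # 86400
print(time_window // time_interval)  # 24 hourly points in one day
```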

  3. 2D Time-Series NumPy Array by time window:

Retrieve a 2D array containing timestamps and corresponding values:

from emon_tools.emon_fina import FinaData
from emon_tools.emon_fina.fina_models import FinaByTimeParamsModel
from emon_tools.emon_fina.emon_fina import OutputType

fdf = FinaData(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

ts = fdf.get_fina_values(
    props=FinaByTimeParamsModel(
        start_time=fdf.meta.start_time,
        time_window=8 * 24 * 3600,
        time_interval=10,
        output_type=OutputType.TIME_SERIES
    )
)

  4. 2D Time-Series NumPy Array by datetime interval:

Retrieve a 2D array of timestamps and values for a specific date range:

from emon_tools.emon_fina import FinaData
from emon_tools.emon_fina.fina_models import FinaByDateRangeParamsModel
from emon_tools.emon_fina.emon_fina import OutputType

fdf = FinaData(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

ts = fdf.get_data_by_date_range(
    props=FinaByDateRangeParamsModel(
        start_date='2019-12-12 00:00:00',
        end_date='2019-12-13 00:00:00',
        time_interval=10,
        output_type=OutputType.TIME_SERIES
    )
)

  5. Pandas DataFrame Time-Series:

Convert time-series data into a Pandas DataFrame for easier manipulation:

FinaDataFrame initialization:

from emon_tools.emon_fina.fina_time_series import FinaDataFrame
from emon_tools.emon_fina.fina_models import (
    FinaByDateRangeParamsModel,
    FinaByTimeParamsModel,
)
from emon_tools.emon_fina.emon_fina import OutputType

fdf = FinaDataFrame(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

ts = fdf.get_df_data(
    props=FinaByTimeParamsModel(
        start_time=fdf.meta.start_time,
        time_window=0,  # Get all values
        time_interval=86400,
        output_type=OutputType.TIME_SERIES
    )
)

# Or by date_range

ts = fdf.get_df_data_by_date_range(
    props=FinaByDateRangeParamsModel(
        start_date='2019-12-12 00:00:00',
        end_date='2019-12-13 00:00:00',
        time_interval=10,
        output_type=OutputType.TIME_SERIES
    )
)

Access metadata of the .meta file:

print(fdf.meta)
# Example Output:
# {
#     "interval": 10,
#     "start_time": 1575981140,
#     "npoints": 4551863,
#     "end_time": 1621499760,
#     "size": 18207452
# }

3. Plotting Data:

Visualize the retrieved time-series data:

from emon_tools.emon_fina.fina_plot import PlotData

PlotData.plot(data=ts)

Compute Daily Statistics

1. Retrieve file integrity statistics:

Analyze the integrity of the .dat file by computing the presence of valid and missing data:

from emon_tools.emon_fina.fina_time_series import FinaDataFrame
from emon_tools.emon_fina.emon_fina import OutputType
from emon_tools.emon_fina.fina_plot import PlotStats

# Initialize FinaDataFrame
fdf = FinaDataFrame(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

# Get the file integrity statistics DataFrame
df = fdf.get_fina_values(
    start_time=fdf.meta.start_time,
    time_interval=86400,
    time_window=0,  # Get all values
    output_type=OutputType.INTEGRITY
)

# Plot file integrity statistics

PlotStats.plot_integrity(
    data=df
)
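
Conceptually, an integrity report counts how many points in each day hold a real value and how many are missing. The following sketch illustrates the idea with plain Python, assuming missing points are stored as NaN and days are aligned on UTC midnight; the function name and the nb_finite / nb_nan keys are hypothetical and not the columns returned by OutputType.INTEGRITY.

```python
import math


def daily_integrity(values, start_time, interval, day=86400):
    """Count valid and missing (NaN) points for each UTC day."""
    stats = {}
    for index, value in enumerate(values):
        timestamp = start_time + index * interval
        day_start = timestamp - timestamp % day  # midnight of that day
        entry = stats.setdefault(day_start, {"nb_finite": 0, "nb_nan": 0})
        if math.isnan(value):
            entry["nb_nan"] += 1
        else:
            entry["nb_finite"] += 1
    return stats


# Six points recorded every 6 hours, starting 2019-12-12 00:00:00 UTC.
values = [1.0, math.nan, 2.0, 3.0, math.nan, math.nan]
integrity = daily_integrity(values, start_time=1576108800, interval=21600)
print(integrity)
```

The first day holds four points (three valid, one missing) and the second day holds the remaining two, both missing.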

2. Value Statistics:

Compute daily statistics (e.g., min, max, mean) for data values:

from emon_tools.emon_fina.fina_time_series import FinaDataFrame
from emon_tools.emon_fina.emon_fina import OutputType
from emon_tools.emon_fina.fina_plot import PlotStats

# Initialize FinaDataFrame
fdf = FinaDataFrame(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

# Get the daily statistics DataFrame
df = fdf.get_fina_values(
    start_time=fdf.meta.start_time,
    time_interval=86400,
    time_window=0,  # Get all values
    output_type=OutputType.TIME_SERIES_MIN_MAX
)

# Plot daily statistics
PlotStats.plot_values(
    data=df
)
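
As an illustration of what daily statistics mean for a feed, the sketch below groups points by UTC day and computes min, mean, and max over the valid values. It is a simplified stand-in rather than the library's algorithm: daily_stats is a hypothetical name, and skipping NaN values is an assumption.

```python
import math


def daily_stats(values, start_time, interval, day=86400):
    """Compute min, mean and max of the valid points of each UTC day."""
    buckets = {}
    for index, value in enumerate(values):
        timestamp = start_time + index * interval
        day_start = timestamp - timestamp % day
        bucket = buckets.setdefault(day_start, [])
        if not math.isnan(value):  # skip missing points
            bucket.append(value)
    stats = {}
    for day_start, bucket in buckets.items():
        if bucket:
            stats[day_start] = {
                "min": min(bucket),
                "mean": sum(bucket) / len(bucket),
                "max": max(bucket),
            }
        else:
            # A day without any valid point has no statistics.
            stats[day_start] = {"min": math.nan, "mean": math.nan, "max": math.nan}
    return stats


# Six points recorded every 6 hours, starting 2019-12-12 00:00:00 UTC.
values = [1.0, math.nan, 2.0, 6.0, 4.0, math.nan]
stats = daily_stats(values, start_time=1576108800, interval=21600)
print(stats[1576108800])  # {'min': 1.0, 'mean': 3.0, 'max': 6.0}
print(stats[1576195200])  # {'min': 4.0, 'mean': 4.0, 'max': 4.0}
```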

3. Filtered Value Statistics:

Restrict statistical calculations to a specific value range:

from emon_tools.emon_fina.fina_time_series import FinaDataFrame
from emon_tools.emon_fina.emon_fina import OutputType
from emon_tools.emon_fina.fina_plot import PlotStats

# Initialize FinaDataFrame
fdf = FinaDataFrame(
    file_name="1",
    data_dir="/path/to/phpfina/files"
)

# Get the daily statistics DataFrame, ignoring values outside [-50, 50]
df = fdf.get_fina_values(
    start_time=fdf.meta.start_time,
    time_interval=86400,
    time_window=0,  # Get all values
    max_value=50,
    min_value=-50,
    output_type=OutputType.TIME_SERIES_MIN_MAX
)

# Plot daily statistics
PlotStats.plot_values(
    data=df
)
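
The effect of min_value and max_value can be pictured as a pre-filter applied before the statistics are computed. The sketch below assumes out-of-range values are treated as missing (NaN) rather than clamped; that behavior is an assumption, and apply_thresholds is a hypothetical helper, not an emon_fina function.

```python
import math


def apply_thresholds(values, min_value=None, max_value=None):
    """Replace values outside [min_value, max_value] with NaN."""
    filtered = []
    for value in values:
        too_low = min_value is not None and value < min_value
        too_high = max_value is not None and value > max_value
        filtered.append(math.nan if too_low or too_high else value)
    return filtered


raw = [12.5, -80.0, 49.9, 120.0, math.nan]
filtered = apply_thresholds(raw, min_value=-50, max_value=50)
kept = [v for v in filtered if not math.isnan(v)]

print(kept)                  # [12.5, 49.9]
print(min(kept), max(kept))  # 12.5 49.9
```

Filtering first keeps sensor glitches such as -80.0 or 120.0 from distorting the daily min and max.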