@Hannibal404
Contributor

This change adds a new PMDA (Performance Metrics Domain Agent) for Reliable Datagram Sockets (RDS). It exports key metrics including connection information, socket and connection statistics, and details of send, receive, and retransmit queues for performance analysis using Performance Co-Pilot (PCP).

This PMDA is intended to aid in diagnosing network-related issues on systems using RDS over InfiniBand or TCP.

Replaces #2230

This commit adds a new PMDA (Performance Metrics Domain Agent) for
Reliable Datagram Sockets (RDS). It exports key metrics including
connection information, socket and connection statistics, and details
of send, receive, and retransmit queues for performance analysis using
Performance Co-Pilot (PCP).

This PMDA is intended to aid in diagnosing network-related issues
on systems using RDS over InfiniBand or TCP.

Signed-off-by: Mohith Kumar Thummaluru <mohith.k.kumar.thummaluru@oracle.com>
Add manpage for rds pmda and address some linting issues

Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com>
@natoscott
Member

Install fails for me after building rpm packages with:

[pcpqa@fedora rds]$ sudo ./Install 
Traceback (most recent call last):
  File "/var/lib/pcp/pmdas/rds/pmdards.python", line 25, in <module>
    from modules.rds_ping import rds_ping_all_avlbl_dest
ModuleNotFoundError: No module named 'modules.rds_ping'

I expect it relates to the .python file extensions, and the more dynamic import mechanism used by pmdabcc might be more what you're after here.

Unrelated to this, the new QA test .out file contains several errors that shouldn't be there (relating to 'unknown metric name'). Because Install fails for me, I've not been able to observe that second issue locally to advise further (it's definitely wrong, I just don't know why).

Signed-off-by: Pradyumn Rahar <pradyumn.rahar@oracle.com>
@Hannibal404
Contributor Author

Added symlinks for the module files to fix the import errors.

The QA output had 'unknown metric name' errors because the InfiniBand-specific metrics were requested on a machine without InfiniBand. Updated.

@natoscott
Member

@Hannibal404 thanks for the updates, but I'm still seeing issues. The test fails because rds Install fails in the same way as before...

[pcpqa@fedora rds]$ sudo ./Install 
Traceback (most recent call last):
  File "/var/lib/pcp/pmdas/rds/pmdards.python", line 50, in <module>
    from modules.rds_ping import rds_ping_all_avlbl_dest
ModuleNotFoundError: No module named 'modules.rds_ping'
Arrgh! failed to create /var/lib/pcp/pmdas/rds/domain.h.python from /var/lib/pcp/pmdas/rds/pmdards.python

I think you may need something more like this code from pmdabcc:

    def init_modules(self):
        """ Initialize modules """
        self.log("Initializing modules:")

        # For packaging, allow both .python and .py suffixed files
        cwd = os.getcwd()
        pmdadir = PCP.pmGetConfig('PCP_PMDASADM_DIR') + '/' + self.read_name()
        for root, _, filenames in os.walk(pmdadir):
            os.chdir(root)
            for filename in fnmatch.filter(filenames, '*.python'):
                if filename in ('pmdabcc.python', 'domain.h.python', 'pmns.python'):
                    continue
                pyf = filename[:-4]
                if not os.path.exists(pyf):
                    os.symlink(filename, pyf)
            os.chdir(pmdadir)
        os.chdir(cwd)

        import pmdautil # pylint: disable=import-outside-toplevel
        self.proc_helper = pmdautil.ProcMon(self.log, self.err)
        for module in self.modules:
            self.log(module)
            try:
                mod = importlib.import_module('modules.%s' % self.modules[module][MODULE])
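
For reference, the same idea can be tried outside a PMDA. The following is only a rough standalone sketch, not code from pmdabcc or the rds PMDA: `link_python_modules` and `load_module` are hypothetical helper names, and it joins paths rather than calling `os.chdir` as the excerpt above does. It creates a `.py` symlink next to each `.python` file so the regular import machinery can find it, then imports the module by name at runtime.

```python
import fnmatch
import importlib
import os
import sys


def link_python_modules(pmdadir, skip=()):
    """Create a NAME.py symlink beside each NAME.python file under pmdadir.

    Hypothetical helper: returns the list of symlink paths it created.
    """
    created = []
    for root, _, filenames in os.walk(pmdadir):
        for filename in fnmatch.filter(filenames, '*.python'):
            if filename in skip:
                continue
            # 'rds_ping.python'[:-4] == 'rds_ping.py'
            link = os.path.join(root, filename[:-4])
            if not os.path.lexists(link):
                # Relative target, resolved against the link's own directory
                os.symlink(filename, link)
                created.append(link)
    return created


def load_module(pmdadir, name, package='modules'):
    """Import <package>.<name> from pmdadir once the symlinks exist."""
    if pmdadir not in sys.path:
        sys.path.insert(0, pmdadir)
    # Make sure the import system notices files created after startup
    importlib.invalidate_caches()
    return importlib.import_module('%s.%s' % (package, name))
```

Assuming the PMDA directory is passed in as `pmdadir`, a call such as `link_python_modules(pmdadir, skip=('pmdards.python',))` followed by `load_module(pmdadir, 'rds_ping')` would replace the top-level `from modules.rds_ping import ...` line that fails in the traceback.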
