Update JS functions to use document.querySelector for shorter queries #11

@jasondilworth56

I was just looking at this repo to use it for an upcoming project and was about to extend it to find phone numbers. Before doing that, I noticed that the current way of scraping an email address (and similar fields) is relatively verbose. Is there any interest in updating it to use querySelector?

As an example, that'd change this:

email = self.browser.execute_script(
    "return (function(){try{for (i in document.getElementsByClassName('pv-contact-info__contact-type')){ "
    "let el = document.getElementsByClassName('pv-contact-info__contact-type')[i]; if("
    "el.className.includes( 'ci-email')){ return el.children[2].children[0].innerText; } }} catch(e){"
    "return '';}})()")

to this:

email = self.browser.execute_script(
    "return (function(){try{return document.querySelector('.pv-contact-info__contact-type.ci-email a')"
    ".innerText;} catch(e){return '';}})()")

I think this is likely to be more readable in the long term.

Happy to take it on if there's interest in that, as well as adding some extra methods on Scraper for other information available in the Contact Info tab.
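
To make the "extra methods" idea a bit more concrete, here is a rough sketch of how they could share one querySelector-based helper. The helper names (`build_text_query`, `scrape_contact_field`) are hypothetical and not part of the repo, and the `ci-phone` class in the usage note is an assumption about the page markup that I have not verified. `browser` is assumed to be anything exposing Selenium's `execute_script`.

```python
import json


def build_text_query(css_selector):
    """Return a JS snippet yielding the innerText of the first match, or ''."""
    # json.dumps produces a safely quoted JS string literal for the selector.
    return (
        "return (function(){"
        "var el = document.querySelector(%s);"
        "return el ? el.innerText.trim() : '';"
        "})()" % json.dumps(css_selector)
    )


def scrape_contact_field(browser, contact_class, child_selector="a"):
    """Read one field from the Contact Info tab.

    contact_class is the modifier class on the contact-type section,
    e.g. 'ci-email' as in the example above.
    """
    selector = ".pv-contact-info__contact-type.%s %s" % (
        contact_class,
        child_selector,
    )
    return browser.execute_script(build_text_query(selector))
```

With something like this, the email lookup becomes `scrape_contact_field(self.browser, 'ci-email')`, and a phone lookup might be `scrape_contact_field(self.browser, 'ci-phone', 'span')` if that class and element turn out to match the real markup. Checking the element for null replaces the try/catch, so a missing field still returns an empty string.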
