Make it possible to specify the parser for BeautifulSoup4 #39

@fumiya5863

Description

If lxml is installed, BeautifulSoup4 uses it as the default parser whenever no parser is named, so it would be better to let the caller specify the parser depending on the situation.

doc = BeautifulSoup(html)

The call above does not pass a parser, so BeautifulSoup4 falls back to its default choice.

As a result, the parser in use depends on the environment, which can lead to problems such as the one reported in #37.

As a solution, I think it would be a good idea to add a new parser argument to the following signature, so the caller can select the parser:

def __init__(self, url=None, html=None, scrape=False, **kwargs):
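
A minimal sketch of what that could look like, not the project's actual code. The class name `Page`, the `_make_soup` helper, the `doc` attribute, and the `"html.parser"` default are all assumptions made for illustration; only the other `__init__` arguments come from the signature above. The parser name is passed through BeautifulSoup's `features` argument.

```python
class Page:
    """Hypothetical stand-in for the class whose __init__ is quoted above."""

    def __init__(self, url=None, html=None, scrape=False,
                 parser="html.parser", **kwargs):
        self.url = url
        self.scrape = scrape
        # Assumed new argument: the name of the BeautifulSoup parser to use,
        # e.g. "html.parser", "lxml", or "html5lib".
        self.parser = parser
        self.doc = None
        if html is not None:
            self.doc = self._make_soup(html)

    def _make_soup(self, html):
        # Imported here so this sketch can be loaded without bs4 installed;
        # real code would normally import it at module level.
        from bs4 import BeautifulSoup

        # Naming the parser explicitly keeps the result independent of
        # whether lxml happens to be installed.
        return BeautifulSoup(html, features=self.parser)
```

With this shape, `Page(html=html, parser="lxml")` would opt in to lxml, while `Page(html=html)` would always use the standard library parser. Defaulting to `parser=None` instead would keep the current environment-dependent behaviour for existing callers.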
