Progress report 2 comments #1

@djvill

Looks like you've made great progress! I'm really pleased to see that you seem to have pushed through some of the head-spinning "where do I start first?" challenges.

Can you share your full script? That would help me give you the best advice possible!

Re: Problem 1 and Problem 2: these are really the same problem, in a way. That is, once you've created a mega-function, it's just a matter of iterating over all the HTML pages to get all the data you need. Magic!

Let's say for the sake of argument that mnogopuno(), isa_iraova(), etc., each output a single element (that is, a length-1 vector). Then your mega-function could look something like this:

BCS_Dialect <- function(x) {
  dat <- input_data(x)
  tibble(html = x,
         mnogopuno = mnogopuno(dat),
         isa_iraova = isa_iraova(dat)
         # [...]
         )
}
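To make the pattern concrete, here's a minimal sketch you can run without touching the network. Everything inside input_data(), mnogopuno(), and isa_iraova() below is placeholder logic I made up for illustration (your real versions will look different), and toy_pages is a hypothetical stand-in for the HTML pages:

```r
library(tibble)

# Hypothetical stand-in for the real pages: two tiny "documents"
toy_pages <- c(page_a = "mnogo ljudi i mnogo posla",
               page_b = "puno ljudi")

# Placeholder versions of your functions (assumptions, not your real code)
input_data <- function(x) strsplit(toy_pages[[x]], " ")[[1]]  # split page text into words
mnogopuno  <- function(dat) sum(dat == "mnogo")               # returns one number
isa_iraova <- function(dat) sum(grepl("ova", dat))            # returns one number

# The mega-function: one input in, one single-row tibble out
BCS_Dialect <- function(x) {
  dat <- input_data(x)
  tibble(html = x,
         mnogopuno = mnogopuno(dat),
         isa_iraova = isa_iraova(dat))
}

demo_row <- BCS_Dialect("page_a")
```

The important part is the shape of the result: every call gives back exactly one row with the same column names, which is what makes the rows easy to stack later.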

If your functions don't all output a single element, then you'll need to get a little more creative. For example, if mnogopuno() outputs a 2-element vector, you could do...

BCS_Dialect <- function(x) {
  dat <- input_data(x)
  mp <- mnogopuno(dat)
  tibble(html = x,
         mnogo = mp[1],
         puno = mp[2],
         isa_iraova = isa_iraova(dat)
         # [...]
         )
}
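One variation you might consider, sketched here under the assumption that you're free to change what mnogopuno() returns: have it return a *named* vector, and pull the pieces out by name rather than by position. That way a reordering inside mnogopuno() can't silently swap your columns. As before, toy_pages and the function bodies are hypothetical placeholders:

```r
library(tibble)

# Hypothetical stand-in for a real page
toy_pages <- c(page_a = "mnogo ljudi i puno posla i mnogo kave")

# Placeholder versions of your functions (assumptions, not your real code)
input_data <- function(x) strsplit(toy_pages[[x]], " ")[[1]]
mnogopuno  <- function(dat) {
  c(mnogo = sum(dat == "mnogo"),   # named 2-element vector
    puno  = sum(dat == "puno"))
}

BCS_Dialect <- function(x) {
  dat <- input_data(x)
  mp <- mnogopuno(dat)
  stopifnot(length(mp) == 2)       # fail early if the output shape changes
  tibble(html = x,
         mnogo = mp[["mnogo"]],    # index by name, not position
         puno  = mp[["puno"]])
}

demo_row <- BCS_Dialect("page_a")
```

The stopifnot() line is optional, but it turns a confusing downstream error into an immediate, obvious one.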

Once you've gotten your mega-function into shape, Problem #1 is easy. All you need is a vector of URLs, which you can pipe into purrr::map_dfr() with your mega-function to create a mega-dataframe:

c("https://hr.wikipedia.org/wiki/Kosovo", "https://sr.wikipedia.org/wiki/%D0%9A%D0%BE%D1%81%D0%BE%D0%B2%D0%BE", [...]) %>% map_dfr(BCS_Dialect)
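Here's a small runnable sketch of that last step. The BCS_Dialect() below is a hypothetical stub that only records the URL and its character count (it doesn't download anything), so you can see how the rows stack up before plugging in the real thing. One note: in purrr 1.0.0 and later, map_dfr() is marked as superseded in favor of map() followed by list_rbind(); both are shown, and they should give you the same rows.

```r
library(purrr)
library(tibble)

# Hypothetical stub standing in for your real mega-function
BCS_Dialect <- function(x) {
  tibble(html = x,
         n_chars = nchar(x))   # placeholder column, not a real dialect feature
}

urls <- c("https://hr.wikipedia.org/wiki/Kosovo",
          "https://sr.wikipedia.org/wiki/%D0%9A%D0%BE%D1%81%D0%BE%D0%B2%D0%BE")

# One row per URL, stacked into a single data frame (needs dplyr installed)
all_pages <- urls %>% map_dfr(BCS_Dialect)

# Equivalent approach for purrr >= 1.0.0
all_pages_alt <- urls %>% map(BCS_Dialect) %>% list_rbind()
```

Either way, you end up with one row per URL and one column per feature.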
