`process_data()` throws an error with 2024 data

`process_data()` throws an error when run against a directory containing the downloaded March 2024 data.

## Reproducible Example

```
library(fcall)
# Download March 2024 data
download_data(year = 2024, month = 3, dest = "data-raw/2024-03")
# Process data
processed_data <- process_data("data-raw/2024-03")
```
The call to `process_data()` fails with the error:
```
Error in `map2()`:
ℹ️ In index: 28.
ℹ️ With name: RCR7.
Caused by error in `scan()`:
! line 61 did not have 535 elements
```

## Error Details

The error is caused by missing rows in the `RCR7_Q202403_G20240508.TXT` file.
As described in [Scenario 3](https://ketchbrookanalytics.github.io/fcall/articles/data-structure.html#scenario-3-single_multiple_single), the `RCR7` file expects, for each institution in the data file:
- a row that contains comma-separated values for variables that belong to the first set of *single-occurrence* variables
- a row for each class of the `code` variable with comma-separated values of *multiple-occurrence* variables
- a row that contains comma-separated values of the remaining *single-occurrence* variables

In particular, there are some institutions that have missing entries for `code` class `2000` (i.e., some variables do not have a row that corresponds to the "Risk Weight Factor" for that variable).

Our current approach assumes that the `RCR7` data published by FCA has a row for each `RegCapCode` (for each multiple-occurrence variable) for each institution. In fact, the text *"THERE IS ONE OCCURENCE FOR EACH RegCapCode VALUE"* appears at the bottom of the `D_RCR7.TXT` file itself.
 
This missing `2000` code for some variables (for certain institutions) is causing `process_data()` to fail.
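To locate the affected rows, a minimal sketch along the following lines may help. It assumes the code is the first comma-separated field on each row, with no leading whitespace or quoting; `find_missing_2000()` is a hypothetical helper written for this issue and is not part of `{fcall}`.

```r
# Hypothetical helper (not part of {fcall}): return the line numbers of rows
# that start with code 1900 but are not immediately followed by a 2000 row.
find_missing_2000 <- function(lines) {
  # Assumption: the code is the first comma-separated field on each row
  codes <- sub(",.*$", "", lines)
  next_codes <- c(codes[-1], NA_character_)
  which(codes == "1900" & (is.na(next_codes) | next_codes != "2000"))
}

# Toy input: the second "1900" row has no "2000" row after it
toy <- c("1900,1,2", "2000,3,4", "1900,5,6", "9999,7,8")
find_missing_2000(toy)

# Against the real file (path taken from the reproducible example):
# lines <- readLines("data-raw/2024-03/RCR7_Q202403_G20240508.TXT")
# find_missing_2000(lines)
```

The returned line numbers can then be compared against the line reported by `scan()` in the error message.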

## Possible Workarounds

There are several options for working around this error:

1. Avoid processing the `RCR7` file by removing `D_RCR7.TXT` and `RCR7_Q202403_G20240508.TXT` from the directory the data was downloaded into (i.e., the `dir` argument of `process_data()`).
2. Leverage `process_metadata_file()` and `process_data_file()` to process the non-`RCR7` files you are interested in.
For example, the code below shows how to process only the `RCB` data:
```
RCB_metadata <- fcall::process_metadata_file(file = "data-raw/2024-03/D_RCB.TXT")
RCB_data <- fcall::process_data_file(
  file = "data-raw/2024-03/RCB_Q202403_G20240508.TXT",
  metadata = RCB_metadata,
  dict = RCB__INV_CODE
)
```
Remember that available `dict`s are stored as internal `{fcall}` datasets.

3. Manually add the missing lines to `RCR7_Q202403_G20240508.TXT` (this assumes all values for this code are zero).
You can add `2000,,,,,,,,,,,,,,,,,,` below each instance of a row that starts with `1900` that is not followed by a row that starts with `2000`.
4. Replace the `RCR7_Q202403_G20240508.TXT` file in the directory the data was downloaded into (i.e., the `dir` argument of `process_data()`) with the file attached below, which already applies the changes described in option 3 above.

[RCR7_Q202403_G20240508.TXT](https://github.com/ketchbrookanalytics/fcall/files/15367129/RCR7_Q202403_G20240508.TXT)
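The manual edit described in option 3 could also be scripted. The sketch below is one possible approach, not the method used to produce the attached file: `add_missing_2000()` is a hypothetical helper (not part of `{fcall}`), the default filler row is the one quoted in option 3, and the code is assumed to be the first comma-separated field on each row.

```r
# Hypothetical helper (not part of {fcall}): insert a filler "2000" row after
# every "1900" row that is not immediately followed by a "2000" row.
add_missing_2000 <- function(lines, filler = "2000,,,,,,,,,,,,,,,,,,") {
  # Assumption: the code is the first comma-separated field on each row
  codes <- sub(",.*$", "", lines)
  next_codes <- c(codes[-1], NA_character_)
  needs_row <- codes == "1900" & (is.na(next_codes) | next_codes != "2000")

  # Repeat each flagged line once, then overwrite the repeat with the filler
  idx <- rep(seq_along(lines), times = 1 + needs_row)
  out <- lines[idx]
  out[duplicated(idx)] <- filler
  out
}

# Toy input: the second "1900" row has no "2000" row after it
toy <- c("1900,1,2", "2000,3,4", "1900,5,6", "9999,7,8")
add_missing_2000(toy, filler = "2000,,")

# Against the real file, keeping a backup of the original first:
# path <- "data-raw/2024-03/RCR7_Q202403_G20240508.TXT"
# file.copy(path, paste0(path, ".bak"))
# writeLines(add_missing_2000(readLines(path)), path)
```

It is worth checking the filler row against a neighbouring `1900` row in the file before writing, since the number of fields must match what `process_data()` expects.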
