@aquasync

I was trying to read a cell value in Excel that contained Unicode data. I found that strings containing non-ASCII characters come back as an empty string:

(s<-excel$ActiveWorkbook()$ActiveSheet()$Range('A1')$Value())
[1] ""
Encoding(s)
[1] "unknown"

This pull request changes FromBstr/AsBstr to use UTF-8, so that the same call now works as expected:

(s<-excel$ActiveWorkbook()$ActiveSheet()$Range('A1')$Value())
[1] "1234³"
Encoding(s)
[1] "UTF-8"
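
As a rough illustration of the conversion involved (not the actual C++ in FromBstr/AsBstr), the sketch below round-trips the example cell value between UTF-16LE, which is how a COM BSTR stores its characters, and UTF-8. The function names `bstr_bytes_to_utf8` and `utf8_to_bstr_bytes` are hypothetical and exist only for this example; the BSTR length prefix and terminator are ignored.

```python
def bstr_bytes_to_utf8(raw: bytes) -> bytes:
    """Decode UTF-16LE code units (BSTR payload) and re-encode as UTF-8."""
    return raw.decode("utf-16-le").encode("utf-8")


def utf8_to_bstr_bytes(data: bytes) -> bytes:
    """Reverse direction: UTF-8 bytes back to UTF-16LE code units."""
    return data.decode("utf-8").encode("utf-16-le")


# The example cell value from this PR; the superscript three is U+00B3.
cell = "1234³"

# What the BSTR payload would hold: 5 code units, 2 bytes each.
raw = cell.encode("utf-16-le")

# What the R string would hold once marked as UTF-8:
# four ASCII bytes plus the two-byte sequence C2 B3.
utf8 = bstr_bytes_to_utf8(raw)

# An ASCII-only conversion cannot represent U+00B3, which is one way
# non-ASCII data can get lost in an encoding-unaware code path.
lossy = cell.encode("ascii", errors="replace")

print(raw.hex(" "))
print(utf8.hex(" "))
print(lossy)
```

The round trip is lossless in both directions, which is the property the read and write paths in this PR both rely on.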

An alternative workaround that may suit others is to use win32com.client via reticulate, but I didn't want to require a Python installation:

library(reticulate)
excel2 = import('win32com.client')$Dispatch('excel.application')
excel2$ActiveWorkbook$ActiveSheet$Range('A1')$Value
[1] "1234³"

I've also verified that writing Unicode values works with these changes.

Note that there are still a number of places where we use conversions that are not encoding-aware, primarily around dispatch (method, property, and type names, etc.), which I presume is fine. The focus here was on errors and values.

Let me know if there are any changes you'd want to see for this to be accepted. Thanks!
