
Conversation

@groodt commented Aug 27, 2020

Adds cross-compilation for Scala 2.11 and Scala 2.12

TODO:

  • How should this be integrated for publishing to Bintray and Maven Central?
  • Should we upgrade scalatest all the way to 3.2.2? We've upgraded most of the others. Decision: stay on 3.0.8.
  • Fix formatting so the diff is smaller. Done.

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.maven.version>3.2.2</scala.maven.version>
<scala.binary.version>2.11</scala.binary.version>
@groodt:

Removed because the cross-compilation plugin requires / recommends that these are not defaulted and are only set through Profiles.

I was able to select a Profile inside IntelliJ and the project compiled in my IDE.
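For reference, a minimal sketch of what such profiles might look like in the POM — the scala-2.11 profile id matches the deploy command discussed later in this thread; the scala-2.12 id and the exact Scala patch versions are assumptions for illustration:

<profiles>
  <profile>
    <id>scala-2.11</id>
    <properties>
      <scala.binary.version>2.11</scala.binary.version>
      <scala.version>2.11.12</scala.version>
    </properties>
  </profile>
  <profile>
    <id>scala-2.12</id>
    <properties>
      <scala.binary.version>2.12</scala.binary.version>
      <scala.version>2.12.12</scala.version>
    </properties>
  </profile>
</profiles>

Activating one of these profiles (for example, mvn compile -P scala-2.12) resolves ${scala.binary.version} and ${scala.version} wherever the POM references them.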

<modelVersion>4.0.0</modelVersion>
<groupId>com.linkedin.sparktfrecord</groupId>
<artifactId>spark-tfrecord_2.11</artifactId>
<artifactId>spark-tfrecord_${scala.binary.version}</artifactId>
@groodt:

This is now possible due to the cross-compilation plugin.

<recompileMode>incremental</recompileMode>
<useZincServer>true</useZincServer>
<scalaVersion>${scala.compiler.version}</scalaVersion>
<scalaVersion>${scala.version}</scalaVersion>
@groodt:

The plugin appears to expect this property name (${scala.version} rather than ${scala.compiler.version}).
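For context, a rough sketch of where this setting typically sits — the plugin coordinates below are assumed (net.alchim31.maven:scala-maven-plugin is the usual provider of scalaVersion and recompileMode) rather than taken from this PR:

<plugin>
  <!-- assumed coordinates for the Scala compiler plugin -->
  <groupId>net.alchim31.maven</groupId>
  <artifactId>scala-maven-plugin</artifactId>
  <configuration>
    <!-- the scala.version property is supplied by whichever profile is active -->
    <scalaVersion>${scala.version}</scalaVersion>
    <recompileMode>incremental</recompileMode>
  </configuration>
</plugin>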



trait BaseSuite extends WordSpecLike with Matchers with BeforeAndAfterAll
trait BaseSuite extends AnyWordSpecLike with Matchers with BeforeAndAfterAll
@groodt:

New ScalaTest version.

import matchers.should._
import org.scalatest.wordspec.AnyWordSpecLike

class TFRecordDeserializerTest extends AnyWordSpecLike with Matchers {
@groodt:

New ScalaTest version.

I think my IDE reformatted things. Let me know if there are specific formatting options I should use and I can revert.

@groodt groodt changed the title [WIP] - Scala cross-compilation Scala cross-compilation Aug 27, 2020
@groodt groodt changed the title Scala cross-compilation Scala 2.11 and Scala 2.12 cross-compilation Aug 27, 2020
@groodt groodt marked this pull request as ready for review August 27, 2020 04:25
@junshi15

Thanks @groodt!
Spark 2.4 + Scala 2.12 seems to work for Spark-Tensorflow-Connector: tensorflow/ecosystem#155
I see minor differences in some versions:
  • scala 2.12.10 vs 2.12.12
  • scalatest 3.0.8 vs 3.2.2
Not sure if these caused the error.

@junshi15

TensorFlowInferSchema.scala is almost identical between the two. So it is likely due to the version difference. I will take a close look later.

@groodt commented Aug 27, 2020

> TensorFlowInferSchema.scala is almost identical between the two. So it is likely due to the version difference. I will take a close look later.

Might have just been my poor implementation of scalatest. Maybe I had too many classes in scope or something. I found the upgrade a bit baffling. I've upgraded to 3.0.8 by following the other repo and it is now working! 🍾

@junshi15

Great!
I see a lot of diffs are due to formatting change. Let's limit this PR to cross-compilation. Style changes can come as a separate PR, if it is necessary.

@groodt commented Aug 27, 2020

> Great!
> I see a lot of diffs are due to formatting change. Let's limit this PR to cross-compilation. Style changes can come as a separate PR, if it is necessary.

Done.

With scalatest:3.0.8 it seems that no code changes are required. I'm not sure if there is any benefit in trying to upgrade to 3.1.x or 3.2.x. WDYT?

I'm also beginning to wonder how much value there is in cross-building for Spark 2.4.x with Scala 2.11 and 2.12, given that it doesn't seem like we'll be able to cross-build for Spark 3.x with Scala 2.11 and 2.12. Maybe for Spark 3.x there would be a need to cross-build for Scala 2.12 and 2.13 someday though. Do you think Spark 3.x would probably need a different branch? Do you think there is still value in this PR to cross-build Spark 2.4.x with Scala 2.11 and 2.12, even though I'm not sure how many people might be using the Spark 2.4.x and Scala 2.12 combination? WDYT?

@junshi15 commented Aug 28, 2020

> Done.

> With scalatest:3.0.8 it seems that no code changes are required. I'm not sure if there is any benefit in trying to upgrade to 3.1.x or 3.2.x. WDYT?

> I'm also beginning to wonder how much value there is in cross-building for Spark 2.4.x with Scala 2.11 and 2.12, given that it doesn't seem like we'll be able to cross-build for Spark 3.x with Scala 2.11 and 2.12. Maybe for Spark 3.x there would be a need to cross-build for Scala 2.12 and 2.13 someday though. Do you think Spark 3.x would probably need a different branch? Do you think there is still value in this PR to cross-build Spark 2.4.x with Scala 2.11 and 2.12, even though I'm not sure how many people might be using the Spark 2.4.x and Scala 2.12 combination? WDYT?

I will keep scalatest at 3.0.8. There is no need to upgrade it.

Spark 3.x will work with Scala 2.12 only. Spark 2.4.x will likely be built with Scala 2.11. If these are the two configurations we want to support, do we still need the cross-building plug-in?

@groodt commented Aug 28, 2020

> Spark 3.x will work with Scala 2.12 only. Spark 2.4.x will likely be built with Scala 2.11. If these are the two configurations we want to support, do we still need the cross-building plug-in?

This relates to my point as well. Spark 2.4.x does support Scala 2.12 and is built and released on Maven for both 2.11 and 2.12. So we could still do that if there is demand. Not sure what the plans at LinkedIn might be. I'm not sure if AWS or EMR will plan on supporting Scala 2.12 for Spark 2.4.x. They did release EMR 6.0.0 with Spark 2.4.4 and Scala 2.12, but I think they're going to go straight to Spark 3.0.0 for EMR 6.1.0. It's really up to you.

My main interest personally would be a Spark 3.0.0 and Scala 2.12 release on bintray and maven central.

@junshi15

> This relates to my point as well. Spark 2.4.x does support Scala 2.12 and is built and released on Maven for both 2.11 and 2.12. So we could still do that if there is demand. Not sure what the plans at LinkedIn might be. I'm not sure if AWS or EMR will plan on supporting Scala 2.12 for Spark 2.4.x. They did release EMR 6.0.0 with Spark 2.4.4 and Scala 2.12, but I think they're going to go straight to Spark 3.0.0 for EMR 6.1.0. It's really up to you.
>
> My main interest personally would be a Spark 3.0.0 and Scala 2.12 release on bintray and maven central.

At LinkedIn, we are not planning to do cross-building. Spark 2.3 is with Scala 2.11 and Spark 3.0 will have to be with Scala 2.12. But I think this PR is useful if there is demand for Spark 2.4 with Scala 2.12 in the future. It also saves us from creating another branch.

You listed a task:
"How should this be integrated for publishing to Bintray and Maven Central?"
What's your plan? Currently, publishing is done manually by running "mvn deploy" locally. Does this still work with this PR, or do you plan to automate it? I am fine with manual publishing for now.

@groodt commented Aug 31, 2020

> At LinkedIn, we are not planning to do cross-building. Spark 2.3 is with Scala 2.11 and Spark 3.0 will have to be with Scala 2.12. But I think this PR is useful if there is demand for Spark 2.4 with Scala 2.12 in the future. It also saves us from creating another branch.

Ok, I'm happy to continue with this then. I think we might still want or need a different branch in the future if we want to compile and release a version that targets Spark 3.0.

> What's your plan? Currently, publishing is done manually by running "mvn deploy" locally. Does this still work with this PR, or do you plan to automate it? I am fine with manual publishing for now.

I presume that in the past, you would run the following to upload your releases?

mvn deploy -P ossrh,bintray

The following should now work:

mvn deploy -P scala-2.11,ossrh,bintray
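Presumably the Scala 2.12 artifact would be published the same way via the matching profile (assuming a scala-2.12 profile mirrors scala-2.11; that profile id is my assumption):

mvn deploy -P scala-2.12,ossrh,bintray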

Do you have a way to test those?

@junshi15

> Do you have a way to test those?

Yes, let me clone your branch and test it out locally. With your latest change to the README, your PR is complete, right?

@groodt commented Aug 31, 2020

> Yes, let me clone your branch and test it out locally. With your latest change to the README, your PR is complete, right?

Yes, I think so.

@groodt commented Sep 1, 2020

@junshi15 Any news on whether this approach works and would be acceptable?

@junshi15 commented Sep 2, 2020

> @junshi15 Any news on whether this approach works and would be acceptable?

Sorry, I have not tested it yet due to some urgent tasks at work. The RB looks fine. I will do a local test tomorrow. If things check out fine, we should be able to merge it tomorrow.

@groodt commented Sep 2, 2020

Ok, thanks!

@junshi15 junshi15 merged commit c1fbbf1 into linkedin:master Sep 2, 2020
@groodt groodt deleted the scala-cross-compile branch September 2, 2020 22:54