Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#25: create class for partitions #45

Merged
merged 19 commits into from
Sep 18, 2023

Conversation

seldon851
Copy link
Contributor

@seldon851 seldon851 commented Jun 28, 2023

  • It's now possible to create an AtumContext of sub-partitions
  • AtumAgent now a class and companion object of the class to allow referencing
  • AtumContext now behaves somewhat like a singleton for given AtumPartitions - as a result it's mutable
  • Rename Measurement to Measure
  • Simplified AtumPartitions

Closes #25
Closes #33

@seldon851 seldon851 requested a review from benedeki June 28, 2023 11:55
import java.time.LocalDate
import scala.collection.immutable.ListMap

case class AtumPartitions(name: String, reportDate: LocalDate, partitions: Partitions = ListMap()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name and reportDate are surplus. There are no specific keys always present, so these while to be expected to be common, should be part of the partitons key.

import java.time.LocalDate
import scala.collection.immutable.ListMap

case class AtumPartitions(name: String, reportDate: LocalDate, partitions: Partitions = ListMap()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is kind of a key toe AtumContext definitely should be a member of the class.

* @return
*/
def createAtumContext(atumPartitions: AtumPartitions): AtumContext = {
AtumContext()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would need more logic. According to my insight, the returned value should be a kind of singleton of AtumContext. For same atumPartitions the same AtumContext needs to be returned.
From that POV I am even considering making AtumContext mutable in the regard of Measures. What do you think guys? @seldon851, @Zejnilovic, @dk1844 , @lsulak, @jakipatryk, @yruslan

@seldon851 seldon851 requested a review from benedeki June 30, 2023 12:58
@seldon851 seldon851 marked this pull request as ready for review June 30, 2023 14:19
@github-actions
Copy link

github-actions bot commented Aug 21, 2023

JaCoCo agent module code coverage report - spark:2 - scala 2.12.12

File Coverage [66.28%]
MeasureResult.scala 100% 🍏
AtumAgent.scala 81.82% 🍏
AtumContext.scala 76.03%
Measure.scala 54.61%
Total Project Coverage 65.64%

@github-actions
Copy link

JaCoCo server module code coverage report - scala 2.12.12

There is no coverage information present for the Files changed

Total Project Coverage 12.45%

)

/**
* Sends a single `MeasureResult` to the AtumService API. With not AtumContext involve.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Sends a single `MeasureResult` to the AtumService API. With not AtumContext involve.
* Sends a single `MeasureResult` to the AtumService API. Without AtumContext involve.

lsulak and others added 2 commits August 22, 2023 12:19
* `AtumAgent` now a class and companion object of the class to allow referencing
* `AtumContext` now behaves somewhat like a singleton for given `AtumPartitions` - as a result it's mutable
* Rename `Measurement` to `Measure`
* Simplified `AtumPartitions`
Copy link
Collaborator

@lsulak lsulak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM so far, no big issues, just minor

val result = MeasureResult(measure, measure.function(df))
AtumAgent.publish(checkpointName, atumContext, result)

executeMeasures(checkpointName, atumContext.measurements)
executeMeasures(checkpointName, atumContext.measures)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line can be removed, it's doing the same as the whole foreach basically

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will removed, but as said, this needs reword. createCheckpoint and send it as an "atomic" steo.

* @param measure the measure to be calculated
* @return
*/
def executeMeasure(checkpointName: String, measure: Measurement): DataFrame = {

def executeMeasure(checkpointName: String, measure: Measure): DataFrame = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that execute should do only executing and publishing should do publishing. And it can be all put together in some other operation.

I definitely wouldn't expect from 'executeMeasure' to actually also send data to the server

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these could and should be eventually completely removed. Thee;s no scenario where just particular measure are recorded for a checkpoint.

@@ -78,11 +109,11 @@ object AtumContext {
* @return
*/
def createCheckpoint(checkpointName: String)(implicit atumContext: AtumContext): DataFrame = {
atumContext.measurements.foreach { measure =>
atumContext.measures.foreach { measure =>
val result = MeasureResult(measure, measure.function(df))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here you can call executeMeasure function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it will (probably) go away, can this stay? 😉

Copy link
Collaborator

@lsulak lsulak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Obviously the code is a bit chaotic with the current implementation versus what we want to achieve - we can build something on top of this, but rework a lot. Let's merge it and deal with it later (but soon)

@benedeki benedeki merged commit aec3c76 into master Sep 18, 2023
4 of 5 checks passed
@benedeki benedeki deleted the feature/25create-class-4-segmentation-2 branch September 18, 2023 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants