
WeeklyTelcon_20190625

Geoffrey Paulsen edited this page Jul 25, 2023 · 2 revisions

Open MPI Weekly Telecon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees (on Web-ex)

  • Brendan Cunningham (Intel)
  • David Bernholdt
  • Edgar Gabriel (UT)
  • Geoff Paulsen (IBM)
  • Howard Pritchard (LANL)
  • Jeff Squyres (Cisco)
  • Josh Hursey (IBM)
  • Joshua Ladd (Mellanox)
  • Ralph Castain (Intel)
  • Thomas Naughton
  • Todd Kordenbrock

not there today (I keep this for easy cut-n-paste for future notes)

  • Akshay Venkatesh (nVidia)
  • Aravind Gopalakrishnan (Intel)
  • Arm (UTK)
  • Artem Polyakov
  • Brandon Yates (Intel)
  • Brian Barrett
  • Dan Topa (LANL)
  • Geoffroy Vallee
  • George Bosilca
  • Jake Hemstad
  • Matias Cabral
  • Matthew Dosanjh
  • Michael Heinz (Intel) - Introducing Brandon
  • Nathan Hjelm
  • Noah Evans (Sandia)
  • Peter Gottesman (Cisco)
  • Xin Zhao
  • mohan

Agenda/New Business

Introduce Brandon Yates And Brendan Cunningham (Intel)

  • Will cover while Michael Heinz is gone for 8 weeks.

Next Face to face

  • May not need Fall face-to-face.
  • Could do a more limited audience fall PRTE/ORTE meeting (might change the dates)
    • PRTE vs ORTE - Need to decide and then start acting.
    • submodule is a preferred approach, but just in prototype.
  • Jeff will set a reminder for new agenda items for fall.
    • Next week either schedule or discuss a more focused meeting (either online or in person)

Issue 6640: ucx btl / UCT compiler issue

  • Howard will work on this over the weekend.
  • Drive a v4.0.2 (along with Vader fixes)
  • Please update README for compilation section
  • Coded to non-stable API, so need to check version, because future versions won't work.
  • Please put in a comment about why checking version and not functionality.
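
A minimal sketch of the version-gating idea above, written in Python for illustration only (the real check would live in configury). The 1.6 bound and the "compile anyways" opt-in come from the proposal recorded later in these notes; the function names, the `allow_untested` flag, and the return shape are hypothetical.

```python
# Hypothetical sketch: decide whether to build a UCX-dependent component.
# Why a version check and not a functionality check: the component is coded
# against a non-stable API, so a configure-time probe can pass and the
# build can still fail to compile against a newer release.

TESTED_VERSION = (1, 6)  # from the proposal in these notes


def parse_version(text):
    """Turn '1.6.0' or '1.6.0-rc1' into a tuple of ints such as (1, 6, 0)."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def should_build(ucx_version, allow_untested=False):
    """Return (build?, reason) for the given UCX version string."""
    major_minor = parse_version(ucx_version)[:2]
    if major_minor < TESTED_VERSION:
        return False, "UCX is older than the tested lower bound"
    if major_minor > TESTED_VERSION and not allow_untested:
        return False, "UCX is newer than tested; opt in to compile anyway"
    return True, "UCX version accepted"


for candidate in ("1.5.2", "1.6.0", "1.7.0"):
    print(candidate, should_build(candidate))
```

Passing `allow_untested=True` models the proposed opt-in for newer UCX versions; the default keeps an untested version from reaching the compiler.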

MVAPICH presented a paper of MVAPICH on Summit and Sierra.

  • David Bernholdt - will post paper to devel mailing list.

Host Ordering fix to v3.0.x, v3.1.x, 4.0.x https://github.com/open-mpi/ompi/issues/6501

  • Need someone to work on this.

Infrastructure

Email and website transition

  • No update.

Ticket lifecycle bot

OLD Submodule Topic

  • Jeff proposed to just do hwloc first, to get everyone familiar with submodules.
    • Still requires CI to prevent common mistakes.
  • Background - how to build the OMPI stack, moving to PRRTE. PMIx then becomes the infrastructure, but PRRTE needs to be able to stand alone.
  • Proposal to use submodules to implement
  • Concerns that we need to coordinate with the Allinea DDT and TotalView tools.
    • Reason that this doesn't work is that there's no MPIR interface in PRRTE.
    • So we need to either get PMIx interface into the tools and remove support for MPIR
    • Ripping out an interface the tools depend on, but we can not wait for them to catch up.
  • Concerns about using submodules:
    • One OPAL would move off to its own repo, and we'd have a reference.
      • A bot would watch that, and then it would file a PR, and a human would merge.
      • We MAY want to automate this at some point, but manually first.
    • Issue, someone locally makes a change to a submodule and commits locally, then bumps their parent repo's reference to point to that local change. If they push THAT, then other users won't have that submodule change.
      • CI catches this case, where someone accidentally pushes a submodule change
    • Other challenge is someone doesn't rev submodule reference until right before a release.
    • For release branches, they should really point to a submodule release also.
  • New directory structure, will cause a lot of configury work.
    • Brian did some ugly prototyping in an hour, but not too bad.
  • How would this work for install?
    • just use --prefix and let each submodule install to the right place.
    • --enable-debug across multiple projects then it's going to be a bit of a pain.
    • Since similar lineage for each of these projects, then similar configure flags for each component.
  • Figure 2 of document shows:
    • external->opal->HEAD prrte->HEAD pmix->HEAD libevent->v2.1.8-stable release hwloc->v2.0.3 release
    • opal depends on libevent. pmix depends on hwloc
    • How do we ensure that the dependencies are "compatible"?
    • If everyone has the same jenkins driving them to update. Issues should be transient.
  • PRRTE doesn't bundle libevent, and hwloc. So OMPI is only owners of bundling.
    • PRRTE only uses external
  • Don't have the "keep in sync" issue for anything but PMIX and OPAL.
  • OMPI currently uses HWLOC directly. Treematch code uses hwloc directly.
    • Most of code today doesn't use hwloc... just goes through pmix.
  • Two versions of OPAL one for OMPI and one for PRRTE?
    • How do we ensure those are not incompatible?
    • Answer: Test a lot.
  • With submodules, if a patch touches two pieces, you have to push the lower-level change first, wait for it to be accepted (to get the hash, and for CI to finish), then update the higher-level patch before pushing that.
  • Remember due to linkers, we need to keep OPAL as stable ABI.
  • Are we going to have official opal "releases", or just have everyone track master?
    • Yes want to do release branches of opal. And cut them at the same time.
    • This will make cherry-picking on release branches a bit tricky
    • Fix that spans both ompi/opal will be complicated.
    • Brian will update document
  • Will there be a separate OPAL VERSION file?
    • Yes, and this is why release branches should be cut in both repos.
  • What to do about PMIx and PRRTE ? Do they get their own release branches?
    • That just triples the work, and doesn't mean we're converging on opal
    • No, just version the dependencies, and submodules will
  • If anyone has a problem with submodules fundamentally, please speak up now.
    • Just normal knee jerk reaction, but it looks like with good CI we can manage the risks.
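
The two CI concerns raised above (a parent repo pinning a submodule commit that was never pushed, and release branches pinning something other than a submodule release) can be sketched as a pure check. This is an illustration only: the function name and data shapes are hypothetical, and a real job would have to collect the inputs itself, for example from `git submodule status` and `git branch -r --contains <sha>`.

```python
# Hypothetical sketch of a CI safety net for submodule pins.

def check_submodule_pins(pins, published, release_commits=None):
    """Return a list of problems; an empty list means the pins look fine.

    pins:            {submodule name: commit SHA pinned by the parent repo}
    published:       {submodule name: set of SHAs reachable on the public remote}
    release_commits: optional {submodule name: set of SHAs tagged as releases};
                     pass it only when checking a release branch.
    """
    problems = []
    for name, sha in sorted(pins.items()):
        if sha not in published.get(name, set()):
            # Someone committed locally in the submodule and bumped the
            # parent reference without pushing the submodule change.
            problems.append(f"{name}: pinned commit {sha} is not on the public remote")
        elif release_commits is not None and sha not in release_commits.get(name, set()):
            problems.append(f"{name}: release branch pins {sha}, which is not a tagged release")
    return problems


# Made-up SHAs: hwloc is pinned correctly, opal points at an unpushed commit.
pins = {"hwloc": "aaa111", "opal": "bbb222"}
published = {"hwloc": {"aaa111"}, "opal": {"ccc333"}}
for problem in check_submodule_pins(pins, published):
    print(problem)
```

Keeping the check free of git calls makes it easy to unit test; the git plumbing that feeds it is a separate concern.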

Minutes

Review v3.0.x Milestones v3.0.4

  • No new updates. A few more PRs went in.
  • Waiting for PMIx update.
  • Waiting for vader/atomic audit

Review v3.1.x Milestones v3.1.4

  • No new updates. A few more PRs went in.
  • Waiting for vader/atomic audit
  • Waiting for New PMIx 2.x release to be embedded into v3.1.x

Review v4.0.x Milestones v4.0.2

  • 2nd Put issue PR 6568 (Vader deadlocking with 4MB transfers)

  • Vector Datatype https://github.com/open-mpi/ompi/issues/5540

    • If you're using complicated data types for real things, it's important.
    • Should it be back ported to release branches? Perhaps not, since only one customer has hit.
    • Not a blocker for v4.0.1
    • Fixed by PR 6695
  • New Datatype work https://github.com/open-mpi/ompi/pull/6695

    • Need review from Giles.
    • Natalie is doing some specific testing.
    • No ABI changes. It's safe to go back to v4.0.x
    • File IO is way better. Because this merged memcpys together.
    • Really just a risk management question if it should go back to v4.0.x
  • PR Waiting on George 6634

    • Not enough, because lots of fragments, causing deadlock.
    • Put protocol is not fixed yet.
    • This could go in by itself, but not enough (solves two different problems)
    • Still waiting on Issue 6568
  • https://github.com/open-mpi/ompi/issues/6568 - put protocol has lost its pipelining.

    • Right now only shows in vader, because all others prefer get protocol.
    • Vader generates a bunch of 32K frags, so a 4MB transfer overwhelms vader.
    • Does NOT occur with single copy like CMA or KNEM.
  • Waiting for new PMIx update later this week.

  • UCT compiler error with latest UCX (Issue 6640)

    • Closed.
    • Should drive a v4.0.2
    • This issue isn't just btl_uct, it would also affect PML, SPML, and OSHMEM components.
    • We should check which versions OMPI v4.0.0 was tested with and set a lower bound
    • Artem will double check that we're not breaking backwards compatibility
  • For the btl_uct - Jeff and Nathan discussed some more.

    • Current PR doesn't quite do what we discussed.
    • Want to check both lower and upper versioning of UCX in configury.
    • Right now we have a configure test that passes, but fails to compile.
    • Using undocumented UCT APIs.
    • Question: UCT/UCS and UCX releases are not synchronized.
      • Proposal was a lower bound of 1.6. If we're higher than 1.6, then add a configure option to "compile anyways" with higher UCX versions.
      • Whatever we do, it's not just a btl_uct issue.
    • On the one hand, we can't write tests for every API.
    • On the other hand, it's bad to run configure and get a compiler error.
    • configure checks are just as bad to end customers as compiler errors.
    • Brian proposed that we step back and we need to
    • uct calls are hard to find because grep hits too many "struct"
    • lots of UCS calls, but UCS calls are part of public API (example ucs_status)
      • Valid question, we need to double check, but UCS calls might be part of the public API.
    • UCT - ucx transfer
    • UCS - ucx service
    • Artem will ask mellanox to look at the PML, SPML, and OSC, and ensure they're not using any non-public APIs (no calls to uct layer, or private ucs)
  • Why do we have both btl_uct and PML for ucx?

    • BTL_UCT is a pretty good non-vendor solution.
    • There is value in having both vendor and non-vendor components
      • Part of the goal of Open MPI is to give the research side agency to try different approaches.
    • Agree, but what we've done is chosen to ship production software that relies on unstable APIs (UCT).
  • Some thoughts about feeding back btl_uct approach to ucx for them to fix.

    • Not possible in short term because they have different goals.
  • For v4.0.x, have to add some configury to compile or not compile 'by default'

  • Issue 6607 might want to get into a v4.0.2

    • Jeff's not sure if there's a real problem or not (assume not)
    • Close it.
  • Will need an update of PMIx v3.1.x - Need a new RC.

    • Josh Hursey will post a PR to OMPI v4.0.x when it's ready.
    • MPIR hangs - event timing
  • Vader Blocking Issue 6568 - Needs to be fixed on v4.0.x - Blocker

    • we thought we fixed in Issue 6258 - not MAC specific
    • 6258 fix didn't completely fix 6568
    • Blocker 6655 vader issue with optimize builds - much more concerning
      • If we
    • George identified the problem behind the vader issue.
  • Artem observed an issue with v4.0.x with high number of nodes (>256 rsh doesn't work)

    • Sounds like the fixes Ralph mentioned were ported, but still seeing these issues.
    • Mellanox is tracking it down. Set of mca params to make it work.
    • Geoff - ensure we PRed this to v4.0.x - 2 fixes involved: the routed framework and the other one, init map
  • Someone should do a roundup of vader issues in last few months, and make sure they've been ported to correct

    • Geoff and Howard will see if anything was missed in vader for v4.0.x in last few months.
    • At least one that Jeff looked at (noted on ticket) was only a v4.0.x
  • PR6651 - btl/vader - Merged.

    • Jeff will check to see if needs this in v3.0.x and v3.1.x
  • PR6652 - Discussion of taking new functionality in a release branch.

    • Really should be bug-fixes. Sometimes, a little liberal with definition of "bug"
    • We've added some items missed in standard as a "bugfix" before, but this is not that situation.
    • Sounds like we should push-back against this in the release branch
  • PR6508 - started to fix host ordering, but quite large, and not complete.

    • Fixing this brings in quite a bit of other things.
    • Problem with this is that it's a significant patch
    • Ralph won't have time in next few months.
    • Interested in having this all the way back to v3.0.x
    • We're asking for help on this.
    • Artem - This sounds like >256 issue.
      • if host list is sorted, don't see this issue.
      • Ralph says there is a fix in master, but only affects -host.
      • Regex has this problem. v4.0.x has the fixed version of regex.
      • Mellanox is investigating
      • workaround is avoid tree-based spawn, and some other rsh parameter.
    • In master, the compression has a specific 256 boundary.
  • PR6556 and 6621 should go to the release branches.

  • George sees regular deadlocks on vader for apps that send >2GB

    • Issue Number we need some help on this.
  • PR6625 - Discussed if we want to take the pain of this PR.

    • Good PR to mop up removal.
    • In favor of cleanup, but nervous about changing the values of non-related enums and constants.
    • We were in favor of cleanup on master.
    • Jeff went back to notes, and found that wiki describes we want to keep --enable-mpi1-compatibility
    • Want to get rid of C++ for OMPI v5.0
    • But want to keep OMPI v5.0 the same as v4.x as far as --enable-mpi1-compatibility.
    • Geoff will do this work on master to keep --enable-mpi1-compatibility
    • Could turn on some runtime annoyance factor (opal_show_help) can be disabled via mca parameter.
    • Geoff will implement. LB and UB would be hard. Maybe in type-init.
    • Reach out to George.
  • Good reminder that we now need to be careful about OPAL's ABI.
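
As a rough illustration of the put-protocol problem tracked in Issue 6568 above, the fragment count for a single transfer can be worked out directly. This is arithmetic only, not vader's actual protocol: it assumes "32K" and "4MB" are binary units and that every fragment is full-size, and the helper name is hypothetical.

```python
# Hypothetical helper: how many fixed-size fragments a transfer needs.

KIB = 1024
MIB = 1024 * KIB


def fragment_count(message_bytes, fragment_bytes):
    """Number of fragments, rounding up for a partial last fragment."""
    return -(-message_bytes // fragment_bytes)


print(fragment_count(4 * MIB, 32 * KIB))  # 128
```

Under those assumptions, one 4MB put turns into 128 fragments, which is consistent with the note that the problem does not show up with single-copy mechanisms such as CMA or KNEM.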

v5.0.0

  • When do we get rid of 32bit?

  • Still don't have any release manager.

    • Need to identify someone in next few months.
    • Would be nice to
    • Ralph is volunteering
    • Brian
    • Traditionally have one academic and one industry rep as release manager.
  • Still have one fundamental issue, do we do ORTE/PRRTE change for v5.0 or v6.0?

  • Schedule: Even if we want to do ORTE/PRRTE change NOW, it wouldn't get out until fall.

  • Meaning v6.0 wouldn't get out until Summer of next year.

  • Schedule: May 2020 is Ralph's retirement.

  • If we do ORTE/PRRTE change in Open MPI v5.0 Fall of 2019, then we'll have more time from Ralph before he retires.

  • When will the MPI v4.0 standard be passed?

    • Next meeting is theoretically the last meeting, then 3 more meetings.
    • But one thing we WANT (Big Count) is not ready. so talking 5 meetings,
    • So possibly Sept 2020 (w/Big Count), but maybe May 2020 (without Big Count)
    • Don't need to couple our ORTE/PRRTE with MPI 4.0 standard
  • ORTE/PRRTE change does depend on new CI and submodule changes.

  • Submodule and new CI can be done before ORTE/PRRTE changes, and is in good shape.

    • Jeff, Brian and Howard have been discussing.
      • Need CI improvements first for safety-net.
  • Moving our Website to AWS

    • The SSL certificate that University of Michigan bought us expires in June.
    • Will get new certificate from Amazon.
    • Email relay is changing from HostGator to an AWS service.
    • Shouldn't affect Documentation initiative.
    • AWS admin isn't too complicated.
    • UPDATE HostGator is now gone.
    • Now hosted at AWS, SSL certificate is no longer needed.
  • Discussion of schedule depends on scope discussion

    • if we want to separate Orte out for that? Would be a bit past summer.
    • Giles has a prototype of PRRTE replacing ORTE
  • Want to open up release-manager elections.

    • Now that we're delaying, will decide at face2face.
  • A v4.1 from master is now a possibility

    • If we instead do a v4.1, some things we'd need fixed on master.
  • will discuss more at face to face.

  • Brian and Ralph are meeting on the 18th

  • Ralph is putting out a doodle to discuss

Dependencies

PMIx Update

  • PMIx v3.1.3 is ready to release.
    • Like to put a tarball in ompi's v4.0.x for integrated test
  • PMIx v2.2 update could be ready soon after that.
    • Doesn't have MPIR fix.
    • Missing something else. - Ralph will audit.

ORTE/PRRTE

  • Take a look at Giles's PRRTE work. He may have done SOME of that. He should have done that all in PRRTE layer, maybe just some MPI layer work remains.
    • PR6339 - he's closed, and re-opened a new branch to look at.
    • Howard reviewed PR6339, and likes everything that Giles did, so abandoned his branch
    • This is a good approach, and gets something running, but it's not complete

MTT

  • IBM still has 10% failure rate and build issue. Please fix!!!
  • AWS - Scale testing not sure of status of that.

Back to 2019 WeeklyTelcon-2019
