WeeklyTelcon_20211102
Geoffrey Paulsen edited this page Nov 8, 2021
·
1 revision
Oops Not recorded today. :(
- Akshay Venkatesh (NVIDIA)
- Artem Polyakov (NVIDIA)
- Aurelien Bouteiller (UTK)
- Austen Lauria (IBM)
- Brandon Yates (Intel)
- Brendan Cunningham (Cornelis Networks)
- Brian Barrett (AWS) - Welcome Back!
- Charles Shereda (LLNL)
- Christoph Niethammer (HLRS)
- David Bernholdt (ORNL)
- Edgar Gabriel (UH)
- Erik Zeiske (HPE)
- Geoffrey Paulsen (IBM)
- Geoffroy Vallee (ARM)
- George Bosilca (UTK)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (NVIDIA)
- Howard Pritchard (LANL)
- Jeff Squyres (Cisco)
- Joseph Schuchart (HLRS)
- Josh Hursey (IBM)
- Joshua Ladd (NVIDIA)
- Marisa Roman (Cornelis Networks)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Matthew Dosanjh (Sandia)
- Michael Heinz (Cornelis Networks)
- Nathan Hjelm (Google)
- Noah Evans (Sandia)
- Raghu Raja
- Ralph Castain (Intel)
- Sam Gutierrez (LANL)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Sriraj Paul (Intel)
- Thomas Naughton (ORNL)
- Todd Kordenbrock (Sandia)
- Tomislav Janjusic (NVIDIA)
- William Zhang (AWS)
- Xin Zhao (NVIDIA)
- Do the Fortran fixes affect the API (i.e., are they needed for v5.0.0)?
- PR https://github.com/open-mpi/ompi/pull/9259
- Think that 9367 addresses the issue with 9259.
- and PR https://github.com/open-mpi/ompi/pull/9367
- Question: should we close either 9259 or 9367? Should we move them both to Draft for now and wait on FORTRAN community?
- Closed both and opened https://github.com/open-mpi/ompi/issues/9484
- Fortran Module, not f08
- Thou shalt not
- Mark Allen (IBM) will look at it.
- We've been technically wrong (but correct)
- Not an ABI issue.
- Unless someone complains, just keep to v5.0.x
- PR https://github.com/open-mpi/ompi/pull/9259
- Howard has been implementing MPI_Isendrecv and MPI_Isendrecv_replace.
- ULFM might not have looked closely enough at how this was defined in the standard.
- What if send completed, but the recv failed?
- Not hard to code, just not well defined. Let the forum discuss.
- Schedule: Pushed to Nov. for 4.0.7
- Thursday we'll build 4.0.7 rc2
- Adding the ireduce_scatter 2GB silent-wrong-answer bug to NEWS.
- Schedule:
- Another RC probably tomorrow
- Cisco MTT just got their testing back online this morning.
- RHEL7 - Let's Encrypt certificate expired last month.
- Will just run v4.1.x to try to get a good run in.
- All running
- Howard added a new osc test named "empty" that includes ompi.h, but doesn't use it.
- Some test harness fix PRs ready to merge.
- Schedule: rc2 went out yesterday.
- https://github.com/open-mpi/ompi/issues/9540 might be ready on v5.0.x
- 8 PRs open.
- PR 9594 - Fixes some BTL issues (against master) will take a few days to review.
- Issue #9554 Jeff asked about Partitions support going to v5.0 or not?
- Matthew is interested
- PR #9495 TCP Onesided for master.
- Tommy's still pushing on UCX Onesided.
- PR 9576 - Ralph filed a ticket about building packages externally.
- Working with fedora packagers. Will be a v5.0.x
- Might need some back and forth with PMIx. The way he updated PMIx might need massive change to OMPI.
- Ball is somewhat in Jeff's court.
- Across OMPI/PMIx/PRRTE - Just need to
- MPI Info stuff that Joseph and Howard are working on.
- Marking a few MPI_ calls as deprecated.
- Never mind: since we're not MPI 4.0 compliant yet, DON'T mark them as deprecated yet.
- No additional discussion.
- Documentation
- Got a change in Sphinx tools needed. Not sure if there's a release yet.
- This fixes outputting issues in manpages.
- Process to update FAQ is to talk to Jeff or Harumi.
- For any changes to the README or FAQ, let them know so they can make the same changes in the NEW docs.
- For now, make changes in ompi-www and README as usual and let them know.
- Issue 9501 regression, needs to be fixed or reverted.
- No test for building from tarball, ensure we don't need pandoc.
- GitHub Project of critical v5.0.x issues: https://github.com/open-mpi/ompi/projects/3
- Issue #8983 If we partially disable OSC/TCP BTL - Not breaking MPI compliance, just breaking One-sided performance badly.
- https://github.com/open-mpi/ompi/pull/8984
- https://github.com/open-mpi/ompi/issues/7830
- Users could fall back to using UCX or OFI, and not the BTLs.
- But that's a different can-of-worms
- Brian will take a look at issue.
- Described the approach for rc1 on Sept 23: disable any functionality that is a blocker to allow for the rc.
- Worried that blockers might not be fixed in time, so will put in code to issue an error at runtime to prevent getting into those paths, and document it heavily.
- Time and Date of BOF Nov 16 @ 12:15pm US Eastern Time.
- Was accepted for Open MPI
- Our Hybrid BoF will be mostly VIRTUAL BoF
- George may be there in person for the tutorial (though other tutorials will be fully virtual)
- Birds of a Feather will be Virtual.
- George sent out an email to Amazon, Cisco, IBM, NVIDIA
- Where do we drop slides? Jeff will send again. Deadline T-minus 1-week.
- Google Slides - Due Tuesday Nov 9th.
- Focus on v5.0
- Reviewed and Approved against master: https://github.com/open-mpi/ompi/pulls?q=is%3Apr+is%3Aopen+base%3Amaster+review%3Aapproved
- PR #9502 Joseph put up a PR for extensions. Curious about extensions in release branches.
- Brian will work some stuff in MCA code.
- Would like to see his proposal released in v5.0.x, but extension API might change over time.
- Extensions are ON by default, therefore changing them can break/change the ABI.
- API is not locked in, so best thing to do would be to NOT have this extension on by default.
- Awaiting Review: https://github.com/open-mpi/ompi/pulls?q=is%3Apr+is%3Aopen+base%3Amaster+review%3Anone
- Most reviewers are NOT
- No update
- Don't do the old system, use this new system for v5.0.0
- No discussion
- Open MPI 4.0 API Compliance GitHub Project: https://github.com/open-mpi/ompi/projects/2
- Joseph says we're not dropping Info Keys as we SHOULD under MPI 4.0.
- Can't make it work easily for Comms because it would need to go down into the PMLs.
- Issue #9555
- Do we want this in OMPI v5.0.0?
- It'd be nice, because it's going to change behavior.
- But it might also be bad because it's a change in behavior (if users depend on MPI 3.1 behavior)
- But it wasn't specified in MPI 3.1, so maybe whatever we do is okay.
- Jeff's going to review PR 9246
- Howard will review 7985
- Need to decide what to do with 8057
- Sessions branch, don't want to merge into master until possibly v5.0.1 gets out.
- It will complicate things in finalize/initialize code.
- Looking okay.
- Looks like something was wrong with MTT.
- That machine just got upgraded.
- Install fail is kinda weird.
- No discussion.