Meeting 2017 01
Tomislav Janjusic edited this page Jan 6, 2024 · 1 revision
- Start: 9am, Jan 24th
- Finish: 1pm, Jan 26th
- Attendance fee: NONE
- Tuesday
- Cisco Building 6, 375 E Tasman Dr, Milpitas, CA 95035
- https://goo.gl/maps/WEnxXAatcRA2
- Pissarro conference room, 4th floor
- Wednesday
- Cisco Building I (capital letter I), 285 W Tasman Dr, San Jose, CA 95134
- https://goo.gl/maps/yp53M5nKBsQ2
- Kodiak conference room, 2nd floor
- Thursday
- Cisco Building 2, 3800 Zanker Road, San Jose, California 95134
- https://goo.gl/maps/z3NBLPVFRWr
- Eger conference room, 3rd floor
Webex connectivity to the meeting rooms is possible, but the dial-in information is not posted on this public wiki page. Please contact Jeff Squyres if you'd like to attend remotely.
Please add your name to the wiki list below if you are coming to the meeting:
- Ralph Castain (Intel)
- George Bosilca (UTK)
- Shinji Sumimoto (Fujitsu)
- Howard Pritchard (LANL)
- Geoff Paulsen (IBM)
- Jeff Squyres (Cisco)
- Andrew Friedley (Intel)
- Annu Dasari (Intel)
- Artem Polyakov (Mellanox)
- Joshua Ladd (Mellanox)
- Nathan Hjelm (LANL)
- Matias Cabral (Intel)
- Geoffroy Vallee (ORNL)
- David Bernholdt (ORNL)
- Brian Barrett (AWS)
- Sylvain Jeaugey (NVIDIA)
- Chris Chambreau (LLNL)
- ***** Jeff has registered up to this point. If additional people sign up, please let Jeff Squyres know so that he can get you a Cisco badge and signed up on the guest wifi. Thanks!
Attending Remotely:
- Josh Hursey - IBM (Available from 6:30am-3pm Pacific) (Added a ☎️ icon next to the items I'd like to call in for, if possible)
- Murali Emani - Livermore
- Local attendees: download the Cisco Proximity app
- Cisco webex plugin for Chrome: update now!
- ☎️ v3.x planning
- Select release managers
- ☎️ OpenMP + OMPI
- Can we use PMIx to coordinate binding between the layers and the RM?
- Coordinate meeting for broader audience, point people to GoogleGroup mailing list
- Per https://www.mail-archive.com/devel@lists.open-mpi.org/msg19874.html:
- Are the defaults hostile to running by default on RoCE?
- Is the btl openib code bit-rotted / defunct? Should we amend the docs and disable/remove the code?
- ☎️ Continuation of `-prot` forward-adaptation proposal from IBM (which got morphed into `-net` discussions in February and August 2016 meetings)
- Previous meeting notes indicate that the schizo framework could be useful here
- Initial proposal: see point 20 in https://github.com/open-mpi/ompi/wiki/Meeting-2016-02
- Discussion: search for `-net` in the Feb meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-2016-02-Minutes
- Further discussion: search for `-net` in the Aug meeting minutes: https://github.com/open-mpi/ompi/wiki/Meeting-Minutes-2016-08
- ☎️ Update: IBM features coming upstream
- ☎️ Check hetero-node detection
- Check nightly tarball generation logic
- Ralph/Nathan: debug static ports, DVM at scale, remaining race conditions in OOB
- ☎️ OMPI v2.1 planning
- PMIx v1.2.1 or v2.0? [telecon: v1.2.1]
- Auto-set RTE barriers off when async modex?
- ORTE-related scaling updates
- Jeff/Howard: issue / PR cleanup
- Let's review all old issues and PRs.
- For those that are still relevant, let's assign milestones.
- And let's close those that are no longer relevant.
- ☎️ Revisit `--host` and `--hostfile` behavior. See PR #1353
- ☎️ Performance regression in MTLs. See Issue #2644
- ☎️ Discuss work remaining to do for Issue #2151 ("'nonblocking3' BVT test fails")
- Discussion started at SC'16, but no progress since then.
- ☎️ Overview of new DOE Exascale (ECP) project focusing on Open MPI
- ☎️ Exposing PMIx functionality
- Martin Schulz proposes MPI_T wrapper - do we want to pursue it?
- ☎️ Memory footprint reduction
- Critical on large-scale, complex architectures
- HWLOC topology tree is a driver - can we devise strategies for not holding this object in memory for the entire execution?
- See this PR for details and possible solutions
- Can we do this for v2.x?
- Should we allow variadic macros? They're part of C99.
- E.g., OBJ_NEW to allow constructors with arguments.
- Moving verbosity, enable, priority, (and other?) parameters into the MCA super class to be shared across all MCA components?
- Do we still care about 32 bit on x86 platforms?
- I.e., should we disable it in configure?
- Only asking just to trim cases that we don't care about -- there's no specific technical reason to kill / keep 32 bit on x86.
- Jeff/Ralph: finish SPI onboarding, SPI logo on web site?
- ☎️ Fujitsu Status Update
- BTL replacement with UCX and libfabric?
- ☎️ `btl`s are opened even if `pml` and `osc` components will not use them.
- Can we devise a way to avoid loading the `btl`s if they are not going to be used in a run?
- This is a bit of a chicken-and-egg problem due to component selection...
- MTL thread support
- info requested on OMPI threading model
- what level of thread support will OMPI shoot for?
- ☎️ v3.x planning
- Select release managers
- When do we branch?
- Default settings for v3.x
- async modex = ON?
- no RTE barrier on init = ON?
- What if any upgrade path for existing applications? Recompile / relink applications? Command line differences?
- If no need to rev major version, should we?
- Binary / API backwards compatibility testing? per commit?
- ☎️ Performance improvements from master over v2.x? Where do we need to work?
- ☎️ Performance Regression Monitoring options (e.g. a dashboard?)
- OMPIO improvements - MPI persistent request extensions to allow picking up newer ROMIO (IBM?)
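
For the variadic macro question above (OBJ_NEW with constructor arguments), here is a minimal sketch of what a C99 variadic macro could look like in this role. It is not Open MPI's actual `OBJ_NEW` and does not use the OPAL class system: `demo_buffer_t`, `DEMO_NEW`, `DEMO_RELEASE`, and the `type##_construct` naming convention are all hypothetical names invented for illustration. One caveat worth noting for the discussion: strict C99 requires at least one argument in place of the `...`, so this form only covers constructors that take arguments.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical class for illustration only; not an Open MPI type. */
typedef struct demo_buffer_t {
    size_t         capacity;
    unsigned char *data;
} demo_buffer_t;

/* Constructor that takes arguments. Returns 0 on success, -1 on failure. */
int demo_buffer_t_construct(demo_buffer_t *buf, size_t capacity, int fill)
{
    buf->data = malloc(capacity);
    if (NULL == buf->data) {
        return -1;
    }
    memset(buf->data, fill, capacity);
    buf->capacity = capacity;
    return 0;
}

/* Destructor: releases what the constructor allocated. */
void demo_buffer_t_destruct(demo_buffer_t *buf)
{
    free(buf->data);
}

/* C99 variadic macro (hypothetical): allocate a zeroed object of `type` and
 * forward the remaining arguments to type##_construct(). If allocation or
 * construction fails, `obj` is left NULL. */
#define DEMO_NEW(obj, type, ...)                                          \
    do {                                                                  \
        (obj) = (type *) calloc(1, sizeof(type));                         \
        if (NULL != (obj) && 0 != type##_construct((obj), __VA_ARGS__)) { \
            free(obj);                                                    \
            (obj) = NULL;                                                 \
        }                                                                 \
    } while (0)

/* Matching release macro (hypothetical): destruct, free, and reset to NULL. */
#define DEMO_RELEASE(obj, type)       \
    do {                              \
        if (NULL != (obj)) {          \
            type##_destruct(obj);     \
            free(obj);                \
            (obj) = NULL;             \
        }                             \
    } while (0)

/* Usage: build a 16-byte buffer filled with 0xAB, check it, release it.
 * Returns 0 if everything behaved as expected, 1 otherwise. */
int demo_example(void)
{
    demo_buffer_t *buf = NULL;
    int ok;

    DEMO_NEW(buf, demo_buffer_t, 16, 0xAB);
    if (NULL == buf) {
        return 1;
    }
    ok = (16 == buf->capacity) && (0xAB == buf->data[0]) && (0xAB == buf->data[15]);
    DEMO_RELEASE(buf, demo_buffer_t);
    return (ok && NULL == buf) ? 0 : 1;
}
```

Calling `demo_example()` from a `main()` exercises the whole path. Whether a real change would keep a separate zero-argument macro alongside a variadic one is exactly the kind of design question the agenda item raises.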