HPX V1.2.0 (Nov 12, 2018)
General changes
Here are some of the main highlights and changes for this release:
Thanks to the work of our Google Summer of Code student, Nikunj Gupta, we now have a new implementation of `hpx_main.hpp` on supported platforms (Linux, BSD and MacOS). This is intended to be a less fragile drop-in replacement for the old implementation relying on preprocessor macros. The new implementation does not require changes if you are using CMake or pkg-config. The old behaviour can be restored by setting `HPX_WITH_DYNAMIC_HPX_MAIN=OFF` during CMake configuration. The implementation on Windows is unchanged.
We have added functionality to allow passing scheduling hints to our schedulers. These will allow us to create executors that, for example, target a specific NUMA domain or allow for HPX threads to be pinned to a particular worker thread.
We have significantly improved the performance of our futures implementation by making the shared state atomic.
We have replaced Boostbook by Sphinx for our documentation. This means the documentation is easier to navigate with built-in search and table of contents. We have also added a quick start section and restructured the documentation to be easier to follow for new users.
We have added a new value for the `--hpx:threads` command line option. It is now possible to use `cores` to tell HPX to only use one worker thread per core, unlike the existing value `all`, which uses one worker thread per processing unit (a processing unit can be a hyperthread if hyperthreads are available). The default value of `--hpx:threads` has also been changed to `cores` as this leads to better performance in most cases.
All command line options can now be passed alongside configuration options when initializing HPX. This means that some options that were previously only available on the command line can now be set as configuration options.
HPXMP is a portable, scalable, and flexible application programming interface using the OpenMP specification that supports multi-platform shared memory multiprocessing programming in C and C++. HPXMP can be enabled within HPX by setting `HPX_WITH_HPXMP=ON` during CMake configuration.
Two new performance counters were added for measuring the time spent doing background work. `/threads/time/background-work-duration` returns the time spent doing background work on a given thread or locality, while `/threads/time/background-overhead` returns the fraction of time spent doing background work with respect to the overall time spent running the scheduler. The new performance counters are disabled by default and can be turned on by setting `HPX_WITH_BACKGROUND_THREAD_COUNTERS=ON` during CMake configuration.
The idling behaviour of HPX has been tweaked to allow for faster idling. This is useful in interactive applications where the HPX worker threads may not have work all the time. This behaviour can be tweaked and turned off as before with `HPX_WITH_THREAD_MANAGER_IDLE_BACKOFF=OFF` during CMake configuration.
It is now possible to register callback functions for HPX worker thread events. Callbacks can be registered for starting and stopping worker threads, and for when errors occur.
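As a rough illustration of how the two background-work counters described above relate to each other, the overhead can be read as the background work duration divided by the overall scheduler time. The sketch below is not HPX code: the function name, the nanosecond unit, and the treatment of a zero total are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helper (hypothetical, not an HPX API): fraction of the overall
// scheduler time that was spent doing background work.
// Both arguments are assumed to be durations in the same unit (here: ns).
double background_overhead(std::uint64_t background_ns, std::uint64_t total_ns)
{
    // Background work is part of the overall scheduler time.
    assert(background_ns <= total_ns);

    // Assumption for this sketch: report zero overhead if nothing ran yet.
    if (total_ns == 0)
        return 0.0;

    return static_cast<double>(background_ns) /
        static_cast<double>(total_ns);
}
```

For instance, 250 ns of background work out of 1000 ns of scheduler time gives a fraction of 0.25 under these assumptions.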
Breaking changes
The implementation of `hpx_main.hpp` has changed. If you are using custom Makefiles you will need to make changes. Please see the documentation on using Makefiles for more details.
The default value of `--hpx:threads` has changed from `all` to `cores`. The new value `cores` only starts one worker thread per core.
We have dropped support for Boost 1.56 and 1.57. The minimal version of Boost we now test is 1.58.
Our `boost::format`-based formatting implementation has been revised and replaced with a custom implementation. This changes the formatting syntax and requires changes if you are relying on `hpx::util::format` or `hpx::util::format_to`. The pull request for this change contains more information: PR #3266.
The following deprecated options have now been completely removed: `HPX_WITH_ASYNC_FUNCTION_COMPATIBILITY`, `HPX_WITH_LOCAL_DATAFLOW`, `HPX_WITH_GENERIC_EXECUTION_POLICY`, `HPX_WITH_BOOST_CHRONO_COMPATIBILITY`, `HPX_WITH_EXECUTOR_COMPATIBILITY`, `HPX_WITH_EXECUTION_POLICY_COMPATIBILITY`, and `HPX_WITH_TRANSFORM_REDUCE_COMPATIBILITY`.
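To make the effect of the changed `--hpx:threads` default more concrete, the following sketch computes how many worker threads each setting would request on a given machine. It is a simplified model and not HPX's implementation: `machine_topology` and `worker_threads_for` are hypothetical names, and the model assumes every core has the same number of processing units.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical description of a machine; not an HPX type.
struct machine_topology
{
    std::size_t cores;           // number of physical cores
    std::size_t pus_per_core;    // processing units (e.g. hyperthreads) per core
};

// Illustrative helper (not part of HPX): number of worker threads that the
// given --hpx:threads value would request on machine m.
std::size_t worker_threads_for(
    std::string const& option, machine_topology const& m)
{
    assert(m.cores > 0 && m.pus_per_core > 0);

    if (option == "cores")
        return m.cores;    // one worker thread per core (new default)

    if (option == "all")
        return m.cores * m.pus_per_core;    // one per processing unit

    // Anything else is treated as an explicit thread count, e.g. "4".
    return std::stoul(option);
}
```

On a hypothetical machine with 8 cores and 2 hyperthreads per core, `cores` yields 8 worker threads in this model, whereas the previous default `all` yields 16.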
Closed issues
Issue #3538 - numa handling incorrect for hwloc 2
Issue #3533 - Cmake version 3.5.1 does not work (git ff26b35 2018-11-06)
Issue #3526 - Failed building hpx-1.2.0-rc1 on Ubuntu16.04 x86-64 Virtualbox VM
Issue #3512 - Build on aarch64 fails
Issue #3475 - HPX fails to link if the MPI parcelport is enabled
Issue #3462 - CMake configuration shows a minor and inconsequential failure to create a symlink
Issue #3461 - Compilation Problems with the most recent Clang
Issue #3460 - Deadlock when create_partitioner fails (assertion fails) in debug mode
Issue #3455 - HPX build failing with HWLOC errors on POWER8 with hwloc 1.8
Issue #3438 - HPX no longer builds on IBM POWER8
Issue #3426 - hpx build failed on MacOS
Issue #3424 - CircleCI builds broken for forked repositories
Issue #3422 - Benchmarks in tests.performance.local are not run nightly
Issue #3408 - CMake Targets for HPX
Issue #3399 - processing unit out of bounds
Issue #3395 - Floating point bug in hpx/runtime/threads/policies/scheduler_base.hpp
Issue #3378 - compile error with lcos::communicator
Issue #3376 - Failed to build HPX with APEX using clang
Issue #3366 - Adapted Safe_Object example fails for --hpx:threads > 1
Issue #3360 - Segmentation fault when passing component id as parameter
Issue #3358 - HPX runtime hangs after multiple (~thousands) start-stop sequences
Issue #3352 - Support TCP provider in libfabric ParcelPort
Issue #3342 - undefined reference to __atomic_load_16
Issue #3339 - setting command line options/flags from init cfg is not obvious
Issue #3325 - AGAS migrates components prematurely
Issue #3321 - hpx bad_parameter handling is awful
Issue #3318 - Benchmarks fail to build with C++11
Issue #3304 - hpx::threads::run_as_hpx_thread does not properly handle exceptions
Issue #3300 - Setting pu step or offset results in no threads in default pool
Issue #3297 - Crash with APEX when running Phylanx lra_csv with > 1 thread
Issue #3296 - Building HPX with APEX configuration gives compiler warnings
Issue #3290 - make tests failing at hello_world_component
Issue #3285 - possible compilation error when “using namespace std;” is defined before including “hpx” headers files
Issue #3280 - HPX fails on OSX
Issue #3272 - CircleCI does not upload generated docker image any more
Issue #3270 - Error when compiling CUDA examples
Issue #3267 - `tests.unit.host_.block_allocator` fails occasionally
Issue #3264 - Possible move to Sphinx for documentation
Issue #3263 - Documentation improvements
Issue #3259 - `set_parcel_write_handler` test fails occasionally
Issue #3258 - Links to source code in documentation are broken
Issue #3247 - Rare `tests.unit.host_.block_allocator` test failure on 1.1.0-rc1
Issue #3244 - Slowing down and speeding up an interval_timer
Issue #3215 - Cannot build both tests and examples on MSVC with pseudo-dependencies enabled
Issue #3195 - Unnecessary customization point route causing performance penalty
Issue #3088 - A strange thing in parallel::sort.
Issue #2650 - libfabric support for passive endpoints
Issue #1205 - TSS is broken
Closed pull requests
PR #3542 - Fix numa lookup from pu when using hwloc 2.x
PR #3541 - Fixing the build system of the MPI parcelport
PR #3540 - Updating HPX people section
PR #3539 - Splitting test to avoid OOM on CircleCI
PR #3537 - Fix guided exec
PR #3536 - Updating grants which support the LSU team
PR #3535 - Fix hiding of docker credentials
PR #3534 - Fixing #3533
PR #3532 - fixing minor doc typo --hpx:print-counter-at arg
PR #3530 - Changing APEX default tag to v2.1.0
PR #3529 - Remove leftover security options and documentation
PR #3528 - Fix hwloc version check
PR #3524 - Do not build guided pool examples with older GCC compilers
PR #3523 - Fix logging regression
PR #3522 - Fix more warnings
PR #3521 - Fixing argument handling in induction and reduction clauses for parallel::for_loop
PR #3520 - Remove docs symlink and versioned docs folders
PR #3519 - hpxMP release
PR #3518 - Change all steps to use new docker image on CircleCI
PR #3516 - Drop usage of deprecated facilities removed in C++17
PR #3515 - Remove remaining uses of Boost.TypeTraits
PR #3513 - Fixing a CMake problem when trying to use libfabric
PR #3508 - Remove memory_block component
PR #3507 - Propagating the MPI compile definitions to all relevant targets
PR #3503 - Update documentation colors and logo
PR #3502 - Fix bogus `throws` bindings in scheduled_thread_pool_impl
PR #3501 - Split parallel::remove_if tests to avoid OOM on CircleCI
PR #3500 - Support NONAMEPREFIX in add_hpx_library()
PR #3497 - Note that cuda support requires cmake 3.9
PR #3495 - Fixing dataflow
PR #3493 - Remove deprecated options for 1.2.0 part 2
PR #3492 - Add CUDA_LINK_LIBRARIES_KEYWORD to allow PRIVATE keyword in linkage t…
PR #3491 - Changing Base docker image
PR #3490 - Don’t create tasks immediately with hpx::apply
PR #3489 - Remove deprecated options for 1.2.0
PR #3488 - Revert “Use BUILD_INTERFACE generator expression to fix cmake flag exports”
PR #3487 - Revert “Fixing type attribute warning for transfer_action”
PR #3485 - Use BUILD_INTERFACE generator expression to fix cmake flag exports
PR #3483 - Fixing type attribute warning for transfer_action
PR #3481 - Remove unused variables
PR #3480 - Towards a more lightweight transfer action
PR #3479 - Fix FLAGS - Use correct version of target_compile_options
PR #3478 - Making sure the application’s exit code is properly propagated back to the OS
PR #3476 - Don’t print docker credentials as part of the environment.
PR #3473 - Fixing invalid cmake code if no jemalloc prefix was given
PR #3472 - Attempting to work around recent clang test compilation failures
PR #3471 - Enable jemalloc on windows
PR #3470 - Updates readme
PR #3468 - Avoid hang if there is an exception thrown during startup
PR #3467 - Add compiler specific fallthrough attributes if C++17 attribute is not available
PR #3466 - - bugfix : fix compilation with llvm-7.0
PR #3465 - This patch adds various optimizations extracted from the thread_local_allocator work
PR #3464 - Check for forked repos in CircleCI docker push step
PR #3463 - - cmake : create the parent directory before symlinking
PR #3459 - Remove unused/incomplete functionality from util/logging
PR #3458 - Fix a problem with scope of CMAKE_CXX_FLAGS and hpx_add_compile_flag
PR #3457 - Fixing more size_t -> int16_t (and similar) warnings
PR #3456 - Add #ifdefs to topology.cpp to support old hwloc versions again
PR #3454 - Fixing warnings related to silent conversion of size_t -> int16_t
PR #3451 - Add examples as unit tests
PR #3450 - Constexpr-fying bind and other functional facilities
PR #3446 - Fix some thread suspension timeouts
PR #3445 - Fix various warnings
PR #3443 - Only enable service pool config options if pools are enabled
PR #3441 - Fix missing closing brackets in documentation
PR #3439 - Use correct MPI CXX libraries for MPI parcelport
PR #3436 - Add projection function to find_* (and fix very bad bug)
PR #3435 - Fixing 1205
PR #3434 - Fix threads cores
PR #3433 - Add Heise Online to release announcement list
PR #3432 - Don’t track task dependencies for distributed runs
PR #3431 - Circle CI setting changes for hpxMP
PR #3430 - Fix unused params warning
PR #3429 - One thread per core
PR #3428 - This suppresses a deprecation warning that is being issued by MSVC 19.15.26726
PR #3427 - Fixes #3426
PR #3425 - Use source cache and workspace between job steps on CircleCI
PR #3421 - Add CDash timing output to future overhead test (for graphs)
PR #3420 - Add guided_pool_executor
PR #3419 - Fix typo in CircleCI config
PR #3418 - Add sphinx documentation
PR #3415 - Scheduler NUMA hint and shared priority scheduler
PR #3414 - Adding step to synchronize the APEX release
PR #3413 - Fixing multiple defines of APEX_HAVE_HPX
PR #3412 - Fixes linking with libhpx_wrap error with BSD and Windows based systems
PR #3410 - Fix typo in CMakeLists.txt
PR #3409 - Fix brackets and indentation in existing_performance_counters.qbk
PR #3407 - Fix unused param and extra ; warnings emitted by gcc 8.x
PR #3406 - Adding thread local allocator and use it for future shared states
PR #3405 - Adding DHPX_HAVE_THREAD_LOCAL_STORAGE=ON to builds
PR #3404 - fixing multiple definition of main() in linux
PR #3402 - Allow debug option to be enabled only for Linux systems with dynamic main on
PR #3401 - Fix cuda_future_helper.h when compiling with C++11
PR #3400 - Fix floating point exception scheduler_base idle backoff
PR #3398 - Atomic future state
PR #3397 - Fixing code for older gcc versions
PR #3396 - Allowing to register thread event functions (start/stop/error)
PR #3394 - Fix small mistake in primary_namespace_server.cpp
PR #3393 - Explicitly instantiate configured schedulers
PR #3392 - Add performance counters background overhead and background work duration
PR #3391 - Adapt integration of HPXMP to latest build system changes
PR #3390 - Make AGAS measurements optional
PR #3389 - Fix deadlock during shutdown
PR #3388 - Add several functionalities allowing to optimize synchronous action invocation
PR #3387 - Add cmake option to opt out of fail-compile tests
PR #3386 - Adding support for boost::container::small_vector to dataflow
PR #3385 - Adds Debug option for hpx initializing from main
PR #3384 - This hopefully fixes two tests that occasionally fail
PR #3383 - Making sure thread local storage is enabled for hpxMP
PR #3382 - Fix usage of HPX_CAPTURE together with default value capture [=]
PR #3381 - Replace undefined instantiations of uniform_int_distribution
PR #3380 - Add missing semicolons to uses of HPX_COMPILER_FENCE
PR #3379 - Fixing #3378
PR #3377 - Adding build system support to integrate hpxmp into hpx at the user’s machine
PR #3375 - Replacing wrapper for __libc_start_main with main
PR #3374 - Adds hpx_wrap to HPX_LINK_LIBRARIES which links only when specified.
PR #3373 - Forcing cache settings in HPXConfig.cmake to guarantee updated values
PR #3372 - Fix some more c++11 build problems
PR #3371 - Adds HPX_LINKER_FLAGS to HPX applications without editing their source codes
PR #3370 - util::format: add type_specifier<> specializations for %s and %ls
PR #3369 - Adding configuration option to allow explicit disable of the new hpx_main feature on Linux
PR #3368 - Updates doc with recent hpx_wrap implementation
PR #3367 - Adds Mac OS implementation to hpx_main.hpp
PR #3365 - Fix order of hpx libs in HPX_CONF_LIBRARIES.
PR #3363 - Apex fixing null wrapper
PR #3361 - Making sure all parcels get destroyed on an HPX thread (TCP pp)
PR #3359 - Feature/improveerrorforcompiler
PR #3357 - Static/dynamic executable implementation
PR #3355 - Reverting changes introduced by #3283 as those make applications hang
PR #3354 - Add external dependencies to HPX_LIBRARY_DIR
PR #3353 - Fix libfabric tcp
PR #3351 - Move obsolete header to tests directory.
PR #3350 - Renaming two functions to avoid problem described in #3285
PR #3349 - Make idle backoff exponential with maximum sleep time
PR #3347 - Replace simple_component* with component* in the Documentation
PR #3346 - Fix CMakeLists.txt example in quick start
PR #3345 - Fix automatic setting of HPX_MORE_THAN_64_THREADS
PR #3344 - Reduce amount of information printed for unknown command line options
PR #3343 - Safeguard HPX against destruction in global contexts
PR #3341 - Allowing for all command line options to be used as configuration settings
PR #3340 - Always convert inspect results to JUnit XML
PR #3336 - Only run docker push on master on CircleCI
PR #3335 - Update description of hpx.os_threads config parameter.
PR #3334 - Making sure early logging settings don’t get mixed with others
PR #3333 - Update CMake links and versions in documentation
PR #3332 - Add notes on target suffixes to CMake documentation
PR #3331 - Add quickstart section to documentation
PR #3330 - Rename resource_partitioner test to avoid conflicts with pseudodependencies
PR #3328 - Making sure object is pinned while executing actions, even if action returns a future
PR #3327 - Add missing std::forward to tuple.hpp
PR #3326 - Make sure logging is up and running while modules are being discovered.
PR #3324 - Replace C++14 overload of std::equal with C++11 code.
PR #3323 - Fix a missing apex thread data (wrapper) initialization
PR #3320 - Adding support for -std=c++2a (define HPX_WITH_CXX2A=On)
PR #3319 - Replacing C++14 feature with equivalent C++11 code
PR #3317 - Fix compilation with VS 15.7.1 and /std:c++latest
PR #3316 - Fix includes for 1d_stencil_*_omp examples
PR #3314 - Remove some unused parameter warnings
PR #3313 - Fix pu-step and pu-offset command line options
PR #3312 - Add conversion of inspect reports to JUnit XML
PR #3311 - Fix escaping of closing braces in format specification syntax
PR #3310 - Don’t overwrite user settings with defaults in registration database
PR #3309 - Fixing potential stack overflow for dataflow
PR #3308 - This updates the .clang-format configuration file to utilize newer features
PR #3306 - Marking migratable objects in their gid to allow not handling migration in AGAS
PR #3305 - Add proper exception handling to run_as_hpx_thread
PR #3303 - Changed std::rand to a better inbuilt PRNG Generator
PR #3302 - All non-migratable (simple) components now encode their lva and component type in their gid
PR #3301 - Add nullptr_t overloads to resource partitioner
PR #3298 - Apex task wrapper memory bug
PR #3295 - Fix mistakes after merge of CircleCI config
PR #3294 - Fix partitioned vector include in partitioned_vector_find tests
PR #3293 - Adding emplace support to promise and make_ready_future
PR #3292 - Add new cuda kernel synchronization with hpx::future demo
PR #3291 - Fixes #3290
PR #3289 - Fixing Docker image creation
PR #3288 - Avoid allocating shared state for wait_all
PR #3287 - Fixing /scheduler/utilization/instantaneous performance counter
PR #3286 - dataflow() and future::then() use sync policy where possible
PR #3284 - Background thread can use relaxed atomics to manipulate thread state
PR #3283 - Do not unwrap ready future
PR #3282 - Fix virtual method override warnings in static schedulers
PR #3281 - Disable set_area_membind_nodeset for OSX
PR #3279 - Add two variations to the future_overhead benchmark
PR #3278 - Fix circleci workspace
PR #3277 - Support external plugins
PR #3276 - Fix missing parenthesis in hello_compute.cu.
PR #3274 - Reinit counters synchronously in reinit_counters test
PR #3273 - Splitting tests to avoid compiler OOM
PR #3271 - Remove leftover code from context_generic_context.hpp
PR #3269 - Fix bulk_construct with count = 0
PR #3268 - Replace constexpr with HPX_CXX14_CONSTEXPR and HPX_CONSTEXPR
PR #3266 - Replace boost::format with custom sprintf-based implementation
PR #3265 - Split parallel tests on CircleCI
PR #3262 - Making sure documentation correctly links to source files
PR #3261 - Apex refactoring fix rebind
PR #3260 - Isolate performance counter parser into a separate TU
PR #3256 - Post 1.1.0 version bumps
PR #3254 - Adding trait for actions allowing to make runtime decision on whether to execute it directly
PR #3253 - Bump minimal supported Boost to 1.58.0
PR #3251 - Adds new feature: changing interval used in interval_timer (issue 3244)
PR #3239 - Changing std::rand() to a better inbuilt PRNG generator.
PR #3234 - Disable background thread when networking is off
PR #3232 - Clean up suspension tests
PR #3230 - Add optional scheduler mode parameter to create_thread_pool function
PR #3228 - Allow suspension also on static schedulers
PR #3163 - libfabric parcelport w/o HPX_PARCELPORT_LIBFABRIC_ENDPOINT_RDM
PR #3036 - Switching to CircleCI 2.0