Publications - William McDoniel

Peer Reviewed Conference Publications

  1. A Timer-Augmented Cost Function for Load Balanced DSMC
    13th International Meeting on High Performance Computing for Computational Science (VECPAR 18), Lecture Notes in Computer Science, Volume 11333, September 2019.
        author    = "William McDoniel and Paolo Bientinesi",
        title     = "A Timer-Augmented Cost Function for Load Balanced DSMC",
        booktitle = "13th International Meeting on High Performance Computing for Computational Science (VECPAR 18)",
        year      = 2019,
        volume    = 11333,
        series    = "Lecture Notes in Computer Science",
        month     = sep,
        url       = ""
    Due to a hard dependency between time steps, large-scale simulations of gas using the Direct Simulation Monte Carlo (DSMC) method proceed at the pace of the slowest processor. Scalability is therefore achievable only by ensuring that the work done each time step is as evenly apportioned among the processors as possible. Furthermore, as the simulated system evolves, the load shifts, and thus this load-balancing typically needs to be performed multiple times over the course of a simulation. Common methods generally use either crude performance models or processor-level timers. We combine both to create a timer-augmented cost function which both converges quickly and yields well-balanced processor decompositions. When compared to a particle-based performance model alone, our method achieves 2x speedup at steady-state on up to 1024 processors for a test case consisting of a Mach 9 argon jet impacting a solid wall.
  2. LAMMPS' PPPM Long-Range Solver for the Second Generation Xeon Phi
    High Performance Computing. ISC 2017, Volume 10266, pp. 61-78, Springer, June 2017.
    Lecture Notes in Computer Science.
        author    = "William McDoniel and Markus Höhnerbach and Rodrigo Canales and {Ahmed E.} Ismail and Paolo Bientinesi",
        title     = "LAMMPS' PPPM Long-Range Solver for the Second Generation Xeon Phi",
        booktitle = "High Performance Computing. ISC 2017.",
        year      = 2017,
        volume    = 10266,
        pages     = "61--78",
        address   = "Cham",
        month     = jun,
        publisher = "Springer",
        note      = "Lecture Notes in Computer Science",
        url       = ""
    Molecular Dynamics is an important tool for computational biologists, chemists, and materials scientists, consuming a sizable amount of supercomputing resources. Many of the investigated systems contain charged particles, which can only be simulated accurately using a long-range solver, such as PPPM. We extend the popular LAMMPS molecular dynamics code with an implementation of PPPM particularly suitable for the second generation Intel Xeon Phi. Our main target is the optimization of computational kernels by means of vectorization, and we observe speedups in these kernels of up to 12x. These improvements carry over to LAMMPS users, with overall speedups ranging between 2x and 3x, without requiring users to retune input parameters. Furthermore, our optimizations make it easier for users to determine optimal input parameters for attaining top performance.