I'm currently working on an R&D effort at ATOS BULL on burst buffers for supercomputer file systems (ATOS BULL Data Management Tool [PDF]). I am also getting involved in the SAGE2 H2020 European project.
I was previously a fellow at CERN working on the R&D of the future data acquisition system (DAQ) of LHCb in collaboration with Intel. The challenge of this project is building a DAQ without any hardware trigger, a first for such a large detector.
Removing the first-level hardware trigger implies handling 40 Tb/s in the DAQ. To manage this high throughput we studied high-speed fabrics coming from the HPC field: Intel Omni-Path and Mellanox InfiniBand, both running at 100 Gb/s.
The final system will need to scale to 500 nodes to deliver the required bandwidth, assuming 80 Gb/s per node. After being aggregated on those 500 nodes, the data will be dispatched to 4000 nodes for the trigger decision before reaching long-term storage for physics analysis.
As a side project during this post-doc, I derived a NUMA profiling tool from MALT, now also open-sourced: NUMAPROF (NUMA Profiler).
I did my PhD from 2010 to 2014 at the CEA, working on memory management for supercomputers. I mainly developed a parallel memory allocator for NUMA architectures. I also worked on a prototype kernel patch to avoid the cost of memory zeroing on first page touch.
This work was supervised by Marc Pérache (CEA) and William Jalby (Université de Versailles Saint-Quentin-en-Yvelines); many thanks to both of them for leading me on this path, and especially to Marc for the day-to-day supervision.
I'm mostly interested in:
As side projects I am also looking at:
During my PhD and first postdoc I had the chance to teach as a teaching assistant:
My main topic is memory management, which means looking at what happens inside a single server. I worked on 16-processor (128-core) Bull BCS servers for a large part of my PhD, and I also ran many tests on Intel's former Xeon Phi, now Knights Landing, with 64 cores and 256 hyperthreads.
I also ran some codes at large scale on available supercomputers and acquired a bit of knowledge on the topic: