I’m Mike Papka, a computer scientist and big data enthusiast at Argonne National Laboratory, where I’m also the director of the Leadership Computing Facility - home to the world’s fifth-fastest supercomputer. AMA!
My laboratory leadership roles and research interests relate to high-performance computing in support of scientific discovery. Argonne National Laboratory (www.anl.gov) is one of the national laboratories under the U.S. Department of Energy (www.energy.gov). I’ll be answering your questions live, starting at 1 pm CST.
VERIFICATION:
http://www.flickr.com/photos/argonne/9470867437/in/set-72157632753426538
http://www.anl.gov/contributors/michael-papka
UPDATE: It's been a lot of fun. I need to run now. Thank you for the questions and comments! I will log back in periodically to answer more questions. - Mike
MichaelPapka (9 karma)
Answer to Q3: We use a bunch of tools, mostly developed by ALCF, to monitor the state of the machines. The application developers themselves use both homegrown and commercial tools to develop their codes and optimize the performance of their applications.
MichaelPapka (8 karma)
Answer to Q1: We don't really use big data to perform research. Rather, we see resources like Mira, combined with scientific codes (programs), as instruments that generate big data. The big data is the output of the simulations, which the scientists mine for insight into the scientific questions they are trying to answer.
MichaelPapka (8 karma)
Answer to Q2: As I oversee the Leadership Computing Facility here, we want to squeeze as much performance out of our systems as possible to enable our scientific user community to make breakthroughs. Today's large systems get the computational power needed to solve the most pressing problems from a very large number of compute cores. The challenge is for the scientific codes to use all of those cores efficiently.
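A rough way to see why that's hard: Amdahl's law says the serial fraction of a code caps its speedup no matter how many cores you add. A minimal sketch of the arithmetic (illustrative only, not an ALCF code):

```python
# Amdahl's law: speedup is capped by the fraction of work that stays serial.
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when parallel_fraction of the work can use all cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

for p in (0.95, 0.99, 0.999):
    print(f"{p:.1%} parallel -> {amdahl_speedup(p, 786_432):,.0f}x speedup")
```

Even a 99.9% parallel code tops out near a 1,000x speedup on a machine with hundreds of thousands of cores, which is why so much effort goes into eliminating serial bottlenecks.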
DasBIscuits (6 karma)
Hi Mike, I'll be getting an interview there in October. Any chance I could meet you for a high five? Thanks for taking your time to come to reddit.
AaronSwartzWasAnHero (3 karma)
Can your computer work out what 9999999999999999999999999999 x 999999999999999999999999999 is?
MichaelPapka (7 karma)
It isn’t a race to the finish line for bragging rights. Supercomputing is technology that is essential to DOE and Office of Science missions. It’s good for science, it’s good for national security, and it represents a nation’s investment in its intellectual prowess.
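For what it's worth, that particular product doesn't need a supercomputer; any language with arbitrary-precision integers can do it on a laptop. A minimal sketch in Python:

```python
# The two numbers from the question above; Python ints have arbitrary precision,
# so there is no overflow to worry about.
a = 9999999999999999999999999999
b = 999999999999999999999999999
print(a * b)
```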
MichaelPapka (2 karma)
A lot of the work happening on the computational resources is ongoing, but we are seeing exciting results such as http://www.alcf.anl.gov/projects/computing-dark-universe and http://www.alcf.anl.gov/projects/high-fidelity-simulation-complex-suspension-flow-practical-rheometry-2
NorbitGorbit (1 karma)
any safeguards in place to prevent users doing frivolous things with spare computing cycles? / what's the most frivolous thing you've seen done?
MichaelPapka (1 karma)
Time on Argonne leadership computing resources is awarded through three competitive programs: the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program http://www.doeleadershipcomputing.org/, the ASCR Leadership Computing Challenge program http://science.energy.gov/ascr/facilities/alcc/, and the Director’s Discretionary program https://www.alcf.anl.gov/directors-discretionary-dd-program. All project proposals go through the peer review process. Typical awards are hundreds of thousands to millions of core-hours per project. Using this approach we are confident that the time is used wisely.
fireballs619 (1 karma)
Can you give a brief overview of the large scale computing systems that Argonne has in place? I know for a time it was the fastest in the world, but I don't know if this is still true.
How long do you think it will be before the developments and innovations that go into creating such a computing system reach the consumer market place?
Also, how is that coffee shop in TCS? Any good?
MichaelPapka (2 karma)
Answer to Q1: Argonne's current largest machine is Mira (http://www.youtube.com/watch?v=pAKEFYLQQdk). Its theoretical peak performance is 10 petaflops. It has 786,432 compute cores and 768 terabytes of memory, all housed in 48 computing racks that weigh 2 tons each. It's water cooled. To give you an idea of Mira's computational power, it can carry out 10 quadrillion calculations per second. It debuted as the third-fastest machine in the world in 2012; our current position is fifth-fastest.
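As a back-of-envelope check on those numbers (my own arithmetic, not an official spec sheet):

```python
# Peak performance spread across Mira's cores.
peak_flops = 10e15              # 10 petaflops = 10 quadrillion calculations/sec
cores = 786_432                 # 48 racks x 1,024 nodes x 16 cores per node
print(f"~{peak_flops / cores / 1e9:.1f} GFLOPS per core")   # ~12.7 GFLOPS
```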
fireballs619 (1 karma)
Is the function of the 768 TB of memory comparable to the function that RAM plays in the every day PC?
In other words, does this computer have 768 TB of RAM?
MichaelPapka (1 karma)
Answer to Q2: A lot of the components in the current generation of systems, like the many-core processor, are already commodity. Other parts, like the 5-D torus interconnect, give us the scalability we need but don't necessarily translate into the consumer space. The number of cores per processor will continue to go up; the challenge is for the application programmer to use those cores effectively and efficiently.
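To give a flavor of what a torus topology buys: each node has only two direct links per dimension, yet any other node is a handful of hops away because every dimension wraps around into a ring. A toy sketch (made-up dimensions, not Mira's actual network map):

```python
# Minimal hop count between two nodes in a 5-D torus.
def torus_hops(a, b, dims):
    """Each dimension is a ring, so traffic can go either way around it."""
    return sum(min(abs(x - y), d - abs(x - y))
               for x, y, d in zip(a, b, dims))

dims = (4, 4, 4, 8, 2)                                      # hypothetical sizes
print(torus_hops((0, 0, 0, 0, 0), (3, 2, 1, 7, 1), dims))   # 1+2+1+1+1 = 6
```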
babycup (1 karma)
This is great! I happen to be interning at the home of the 15th fastest supercomputer. So here are my questions: what workload management software do you use? And how do you allocate cores to users, by node or by core? Random questions I know, but they relate to some of the projects I've worked on
MichaelPapka (1 karma)
Answer to Q1: We use an Argonne-developed scheduler named Cobalt that schedules the jobs. Jobs are prioritized based on the type of allocation the user has. The users are allocated a total number of core-hours for a period of time, usually one year, and they can use them as they see fit. We allocate time in core-hours.
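To make the bookkeeping concrete, a job's charge is simply the cores it occupies times the hours it runs. A sketch with made-up job sizes (not Cobalt's actual accounting code):

```python
# Core-hour charge for a single job.
def core_hours(nodes: int, cores_per_node: int, walltime_hours: float) -> float:
    return nodes * cores_per_node * walltime_hours

charge = core_hours(nodes=512, cores_per_node=16, walltime_hours=6)
print(f"one run: {charge:,.0f} core-hours")                           # 49,152
print(f"runs per 10M core-hour award: {10_000_000 // charge:,.0f}")   # ~203
```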
Gravy-Leg__ (1 karma)
Mike, what emerging fields do you think will most benefit from supercomputing services like yours?
MichaelPapka (3 karma)
Biologists have incorporated high-performance computing into their scientific pipeline, and it has vastly expanded the field. I also think materials science is making interesting use of computational science to accelerate discoveries. There are also very exciting things happening when several fields converge, such as climate modeling and computational economics.
pepperstuck (1 karma)
I know you guys aren't a weapons lab, so what other kinds of things do you need a supercomputer for in science? What are the fields that benefit most?
MichaelPapka (2 karma)
Many disciplines require HPC to make progress and virtually any process or problem can be advanced with HPC. Ongoing investigations range from the basic to the applied - everything from understanding water to designing better jet engines. Anyone from the science or engineering community whose research requires HPC can apply for time on DOE leadership computing machines.
MichaelPapka (1 karma)
Study hard and stay in school! My undergraduate degree is in physics and my graduate degrees are in computer science. For me, the combination of science and computer science has been helpful.
MichaelPapka (2 karma)
I have a long history with the University of Chicago's Flash Center, so their work on Type Ia supernovae is especially interesting to me. I also think William George's use of our computing resources to conduct large-scale simulations to advance the material and measurement science of concrete is neat. These are two examples that I like. Other important problems being addressed on the machine include the chemical stability of candidate electrolytes for lithium-air batteries, designing new pharmaceuticals, extending the performance and lifetime of corrosion-resistant materials, and investigating next-generation fuel sources. There are, of course, many, many more examples.
Hashiba (1 karma)
Hello Mike,
Have you ever had an answer to an unknown question?
thanks for sharing your insights.
MichaelPapka (1 karma)
I assume you mean: have any of the investigations ever turned up something the researcher didn't expect to find? Yes, but that's just part of the scientific process.
MichaelPapka (1 karma)
Mira is a great example of future machines in that it has a very high core count. Future machines will be highly parallel.
EvilPRGuy (1 karma)
Thanks for the AMA, this is very interesting so far.
This is really kind of a basic question...what does the input/output look like when doing this kind of big data work? Basically, how do you get the datasets in, and (I'm sure this varies by project) what do you get out that a human can examine? What's the workflow? I'm imagining it isn't an animated .Gif...
blkrockin (6 karma)
Thanks for doing this Mike!