My laboratory leadership roles and research interests relate to high-performance computing in support of scientific discovery. Argonne National Laboratory (www.anl.gov) is one of the national laboratories under the U.S. Department of Energy (www.energy.gov). I’ll be answering your questions live, starting at 1 pm CST.

VERIFICATION:

http://www.flickr.com/photos/argonne/9470867437/in/set-72157632753426538

http://www.anl.gov/contributors/michael-papka

UPDATE: It's been a lot of fun. I need to run now. Thank you for the questions and comments! I will log back in periodically to answer more questions. - Mike

Comments: 69 • Responses: 24

DasBIscuits6 karma

Hi Mike, I'll be getting an interview there in October. Any chance I could meet you for a high five? Thanks for taking your time to come to reddit.

MichaelPapka6 karma

Sure. Stop by. :)

blkrockin6 karma

Thanks for doing this Mike!

  • What are some examples of how you use big data in helping perform research?
  • What role does performance and scalability have in the use of tech for your research?
  • What tools or best practices do you use to help monitor performance?
  • Have you ever found yourself in a Snowden-like situation questioning the morality of your research or the results of that research?
  • What do you think about /DarknetPlan (or other meshnet projects)?

MichaelPapka9 karma

Answer to Q3: We use a bunch of tools, mostly developed by ALCF, to monitor the state of the machines. The developers themselves use both homegrown and commercial tools to develop their codes and optimize the performance of their applications.

MichaelPapka8 karma

Answer to Q2: I oversee the Leadership Computing Facility here, so we want to squeeze as much performance out of our systems as possible to enable our scientific user community to make breakthroughs. What we are seeing in today's large systems is a very large number of compute cores delivering the computational power needed to solve the most pressing problems. The challenge is for the scientific codes to use all of those cores efficiently.
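To make that challenge concrete, here is a minimal sketch of the pattern most scientific codes follow: split the work across ranks, compute locally, and combine the results. It assumes the mpi4py package and is only an illustration, not ALCF code.

```python
# Minimal MPI work-splitting sketch (assumes mpi4py; an illustration,
# not ALCF code). Each rank integrates its share of
# f(x) = 4 / (1 + x^2) over [0, 1]; the partial sums add up to pi.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

n = 10_000_000   # total quadrature points, divided among the ranks
h = 1.0 / n
local = sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2)
            for i in range(rank, n, size)) * h

pi = comm.reduce(local, op=MPI.SUM, root=0)   # combine partial sums
if rank == 0:
    print(f"pi ~= {pi:.12f} on {size} ranks")
```

Launched with, say, mpiexec -n 8 python pi.py, this scales trivially; the hard part on a leadership machine is keeping that efficiency when 8 becomes hundreds of thousands.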

MichaelPapka8 karma

Answer to Q1: We don't really use big data to perform research. Rather, we see resources like Mira, combined with scientific codes (programs), as instruments that generate big data. The big data is the output of the simulations, which scientists mine for insight into the questions they are trying to answer.

MichaelPapka6 karma

Answer to Snowden question: No

everythingisopposite4 karma

Are there some computers that are smarter than humans?

MichaelPapka8 karma

Computers are only as smart as the people who program them.

Yer_a_wizard_Harry_4 karma

How does it feel to be in fifth place ?

MichaelPapka10 karma

I only cried for the first week. :)

AaronSwartzWasAnHero3 karma

Can your computer work out what 9999999999999999999999999999 x 999999999999999999999999999 is?

MichaelPapka11 karma

1.0E55

MichaelPapka8 karma

I'm pretty sure it can. :)
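For the record, any language with arbitrary-precision integers can answer it exactly. A quick Python sketch (counting 28 and 27 nines in the question above):

```python
# Exact product of the two numbers in the question (28 nines times
# 27 nines), using Python's arbitrary-precision integers.
a = 10**28 - 1   # 9999999999999999999999999999
b = 10**27 - 1   # 999999999999999999999999999

exact = a * b    # equals 10**55 - 10**28 - 10**27 + 1, exactly
print(exact)
print(f"{exact:.2e}")   # ~1.00e+55, matching the answer above
```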

Briancg2 karma

Why is it so important for the U.S. to reach exascale first?

MichaelPapka7 karma

It isn’t a race to the finish line for bragging rights. Supercomputing is technology that is essential to DOE and Office of Science missions. It’s good for science, it’s good for national security, and it represents a nation’s investment in its intellectual prowess.

heytherealex2 karma

[deleted]

MichaelPapka2 karma

Much of the work on the computational resources is still in progress, but we are already seeing exciting results, such as http://www.alcf.anl.gov/projects/computing-dark-universe and http://www.alcf.anl.gov/projects/high-fidelity-simulation-complex-suspension-flow-practical-rheometry-2

Gravy-Leg__1 karma

Mike, what emerging fields do you think will most benefit from supercomputing services like yours?

MichaelPapka3 karma

Biologists have incorporated high-performance computing into their scientific pipelines, and it has vastly expanded the field. I also think materials science is making interesting use of computational science to accelerate discoveries. There are also very exciting things happening when several fields converge, such as climate modeling and computational economics.

pepperstuck1 karma

I know you guys aren't a weapons lab, so what other kinds of things do you need a supercomputer for in science? What are the fields that benefit most?

MichaelPapka2 karma

Many disciplines require HPC to make progress, and virtually any process or problem can be advanced with HPC. Ongoing investigations range from the basic to the applied: everything from understanding water to designing better jet engines. Anyone from the science or engineering community whose research requires HPC can apply for time on DOE leadership computing machines.

LilySapphire1 karma

[deleted]

MichaelPapka1 karma

Study hard and stay in school! My undergraduate degree is in physics and my graduate degrees are in computer science. For me, the combination of science and computer science has been helpful.

Hashiba1 karma

Hello Mike,

Have you ever had an answer to an unknown question?

Thanks for sharing your insights.

MichaelPapka1 karma

I assume you mean: have any of the investigations ever turned up something the researcher didn't expect to find? Yes, but that's just part of the scientific process.

mcet121 karma

How does Mira point to the next steps in computing?

MichaelPapka1 karma

Mira is a great example of future machines in that it has a very high core count. Future machines will be highly parallel.

EvilPRGuy1 karma

Thanks for the AMA, this is very interesting so far.

This is really kind of a basic question: what does the input/output look like when doing this kind of big data work? Basically, how do you get the datasets in, and (I'm sure this varies by project) what do you get out that a human can examine? What's the workflow? I'm imagining it isn't an animated GIF...

CoreBounty1 karma

What specs does your personal computer have?

MichaelPapka3 karma

Apple MacBook Pro.

JeffMurdoch1 karma

[deleted]

MichaelPapka2 karma

I have a long history with the University of Chicago's Flash Center, so their work on Type Ia supernovae is especially interesting to me. I also think William George's use of our computing resources to conduct large-scale simulations to advance the material and measurement science of concrete is neat. These are two examples that I like. Other important problems being addressed on the machine include the chemical stability of candidate electrolytes for lithium-air batteries, designing new pharmaceuticals, extending the performance and lifetime of corrosion-resistant materials, and investigating next-generation fuel sources. There are, of course, many, many more examples.

NorbitGorbit1 karma

any safeguards in place to prevent users doing frivolous things with spare computing cycles? / what's the most frivolous thing you've seen done?

MichaelPapka1 karma

Time on Argonne leadership computing resources is awarded through three competitive programs:

  • the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program: http://www.doeleadershipcomputing.org/
  • the ASCR Leadership Computing Challenge (ALCC) program: http://science.energy.gov/ascr/facilities/alcc/
  • the Director’s Discretionary program: https://www.alcf.anl.gov/directors-discretionary-dd-program

All project proposals go through peer review. Typical awards are hundreds of thousands to millions of core-hours per project. Using this approach, we are confident that the time is used wisely.

Excelero1 karma

Specs?

fireballs6191 karma

Can you give a brief overview of the large scale computing systems that Argonne has in place? I know for a time it was the fastest in the world, but I don't know if this is still true.

How long do you think it will be before the developments and innovations that go into creating such a computing system reach the consumer market place?

Also, how is that coffee shop in TCS? Any good?

MichaelPapka2 karma

Answer to Q1: Argonne's current largest machine is Mira (http://www.youtube.com/watch?v=pAKEFYLQQdk). Its theoretical peak performance is 10 petaflops. It has 786,432 compute cores and 768 terabytes of memory, all housed in 48 water-cooled racks that weigh about two tons each. To give you an idea of Mira’s computational power, it is capable of carrying out 10 quadrillion calculations per second. It debuted as the third-fastest machine in the world in 2012 and currently ranks fifth.
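As a back-of-the-envelope check on those numbers, a short sketch; the per-core figures (1.6 GHz, 8 floating-point operations per cycle) are the published Blue Gene/Q specs, not something stated in the answer above:

```python
# Rough peak-performance arithmetic for Mira, assuming the published
# Blue Gene/Q per-core figures of 1.6 GHz and 8 flops per cycle.
cores = 48 * 1024 * 16   # 48 racks x 1,024 nodes x 16 cores = 786,432
clock_hz = 1.6e9
flops_per_cycle = 8      # quad FPU with fused multiply-add

peak = cores * clock_hz * flops_per_cycle
print(f"{peak:.3e} flops/s")   # ~1.007e+16, i.e. ~10 petaflops
```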

fireballs6191 karma

Is the function of the 768 TB of memory comparable to the function that RAM plays in the every day PC?

In other words, does this computer have 768 TB of RAM?

MichaelPapka3 karma

Yes.

MichaelPapka1 karma

Answer to Q2: A lot of the components in the current generation of systems, like the many-core processor, are already commodity parts. Other parts, like the 5-D torus interconnect, give us the scalability we need but don't necessarily translate to the consumer space. The number of cores per processor will continue to go up. The challenge is for the application programmer to use those cores effectively and efficiently.
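One standard way to see why that is hard is Amdahl's law, a textbook model (nothing Mira-specific): the serial fraction of a code caps its speedup no matter how many cores are added.

```python
# Amdahl's law: with parallel fraction p, the speedup on n cores is
# 1 / ((1 - p) + p / n). Even a 0.1% serial fraction bites at scale.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for n in (16, 1024, 786_432):
    s = speedup(0.999, n)   # a code that is 99.9% parallel
    print(f"{n:>7} cores: speedup {s:8.1f}, efficiency {s / n:6.1%}")
```

At 16 cores the efficiency is close to 99%; at hundreds of thousands of cores it collapses below 1%, which is why so much effort goes into shrinking the serial fraction of scientific codes.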

babycup1 karma

This is great! I happen to be interning at the home of the 15th-fastest supercomputer. So here are my questions: what workload management software do you use? And how do you allocate cores to users, by node or by core? Random questions, I know, but they relate to some of the projects I've worked on.

MichaelPapka1 karma

Answer to Q1: We use an Argonne-developed scheduler named Cobalt. Jobs are prioritized based on the type of allocation the user has. Users are allocated a total number of core-hours for a period of time, usually one year, and they can use them as they see fit. To your second question: we allocate time in core-hours.
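To make the core-hour unit concrete, here is a small sketch with made-up numbers; the allocation and job sizes below are hypothetical, not an actual award:

```python
# Hypothetical core-hour bookkeeping: a job is charged
# nodes x cores-per-node x wall-clock hours against the award.
def core_hours(nodes: int, cores_per_node: int, hours: float) -> float:
    return nodes * cores_per_node * hours

allocation = 10_000_000   # hypothetical one-year award, in core-hours
charge = core_hours(nodes=1024, cores_per_node=16, hours=6.0)

print(f"job charge: {charge:,.0f} core-hours")   # 98,304
print(f"remaining:  {allocation - charge:,.0f} core-hours")
```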