Related News- HPC Wire

Since 1987 - Covering the Fastest Computers in the World and the People Who Run Them

New Technologies, Industry Luminaries, and Outstanding Top500 Results Highlight Intel’s SC17 Presence


Last week, the rhetorical one-two punch of the Intel® HPC Developer Conference and Supercomputing 2017 offered global HPC aficionados new insights into the direction of advanced HPC technologies, and how those tools will empower the future of discovery and innovation. In case you missed it, here is a breakdown of all the action!

The Intel® HPC Developer Conference 2017 kicked off the week with 700+ attendees hearing industry luminaries share best practices and techniques to realize the full potential of the latest HPC tools and approaches. Intel’s Joe Curley, Gadi Singer, and Dr. Al Gara took the main stage and offered a thought-provoking keynote outlining the intertwined futures of HPC and AI. As individuals who are helping architect the future of HPC, the three speakers discussed the adoption of AI into workflows, the technological opportunities that enable it, and the driving forces behind the future range of architectures, systems, and solutions. Attendees also gained hands-on experience with Intel platforms, obtained insights to maximize software efficiency and advance the work of researchers and organizations of all sizes, and networked with peers and industry experts. Watch the Intel HPC Developer Conference website as we publish videos of the keynote and multiple technical sessions over the next few weeks.

Then with the kickoff of SC17, Intel announced outstanding industry acceptance results for Intel® Xeon® Scalable processors and Intel® Omni-Path Architecture (Intel® OPA). Intel also provided additional insights into AI, machine learning and the latest HPC technologies.

Intel detailed how Intel® Xeon® Scalable processors have delivered the fastest adoption rate of any new Intel Xeon processor on the Top500 [1]. The latest processor surpasses the previous generation’s capability with a 63% improvement in performance across 13 common HPC applications, and up to double the number of FLOPS per clock [2]. On the November 2017 Top500 list, Intel-powered supercomputers accounted for six of the top 10 systems and a record high of 471 out of 500 systems. Also, Intel powered all 137 new systems added to the November list.

To date, 18 HPC systems utilizing the new processors appear on the November 2017 Top500 list of the world’s fastest supercomputers, together delivering total performance surpassing 25 petaFLOPS. Other organizations using the new Intel Xeon Scalable processors at the heart of their HPC systems report substantial boosts in system speed, resulting in 110 world performance records [1].

In addition to the processors, Intel OPA momentum continued with systems using Intel OPA delivering a combined 80 petaFLOPS, surpassing the June 2017 Top500 numbers by nearly 20%. Among those organizations using 100Gb fabric for their Top500 HPC systems, Intel OPA now connects almost 60 percent of nodes [3].

The demos in Intel’s booth allowed attendees to see how the power of these technologies enables advancements across the HPC industry. Taking center stage at Intel’s booth was a virtual-reality motorsports demonstration where visitors experienced the power of the advanced technology that will enable the next generation of vehicles.

Attendees seeking a deeper dive into the technologies joined “Nerve Center Sessions” at the Intel pavilion where they gained cutting-edge insights from industry luminaries and joined the presenters for small table discussions afterwards.

With recent AI advancements, are humans the only ones making “intelligent” decisions? Intel Fellow Pradeep Dubey, who is also the director of Intel’s Parallel Computing Lab, presented “Artificial Intelligence and the Virtuous Cycle of Compute.” He took the opportunity to explain how the convergence of big data, AI, and algorithmic advances is transforming the relationship between humans and HPC systems.

In case you missed the conference this year, you can get more detail from Intel’s SC17 page and follow Intel on Twitter @intelHPC for ongoing insights. And to learn more about the latest Intel HPC and AI technologies, check out www.intel.com/hpc.

 

~~~~~~~~~~

1 https://newsroom.intel.com/news/sc17-intel-boasts-record-breaking-top500-position-fastest-ramp-new-xeon-processor-list/

2 Up to 1.63x gains based on geomean of Weather Research and Forecasting – Conus 12Km, HOMME, LSTC LS-DYNA Explicit, INTES PERMAS V16, MILC, GROMACS water 1.5M_pme, VASP Si256, NAMD stmv, LAMMPS, Amber GB Nucleosome, Binomial option pricing, Black-Scholes, Monte Carlo European options. Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance/datacenter.

3  Intel estimate based on Top500 data and other public sources

 


SC Bids Farewell to Denver, Heads to Dallas for 30th

Fri, 11/17/2017 - 23:23

After a jam-packed four-day expo and intensive six-day technical program, SC17 has wrapped up another successful event that brought together nearly 13,000 visitors to the Colorado Convention Center in Denver for the largest HPC conference in the world. In keeping with the conference theme of HPC Connects, General Chair Bernd Mohr of the Jülich Supercomputing Centre, the event’s first international chair, highlighted strong global attendance: there were 2,800 international attendees from 71 countries, including 122 international exhibitors.

Ask attendees what they liked most about the 29th annual Supercomputing Conference and you’ll get a lot of folks geeking out over the technology, but ultimately it’s the community that keeps them coming back to connect with old friends and meet new ones. Other fan favorites that had attendees buzzing this year were the electrifying student cluster competition (with first-ever planned power shutoffs!), record-setting SCinet activities, impressive Gordon Bell results and an out-of-this-world keynote, plus a lineup of 12 invited talks, including a presentation by the supercomputing pioneer himself, Gordon Bell.

On Tuesday morning, SC17 Chair Mohr welcomed thousands of SC attendees to the conference center ballroom to share conference details and introduce the SC17 keynote, “Life, the Universe and Computing: The Story of the SKA Telescope.” In front of a stunning widescreen visual display, Professor Philip Diamond, director general of the international Square Kilometre Array (SKA) project, and Dr. Rosie Bolton, SKA Regional Centre project scientist, described the SKA’s vision and strategy to map and study the entire sky in greater detail than ever before. Everyone we spoke with was enthralled by the keynote and several long-time attendees said it was the best one yet. Read more about it in our feature coverage, “HPC Powers SKA Efforts to Peer Deep into the Cosmos.”

In his introduction, Mohr pointed out that the discovery of gold about five miles away from the conference site in 1858 led to the founding of Denver. “[It] is fitting,” said Mohr, “because today high performance computing is at the forefront of a new gold rush, a rush to discovery using an ever-growing flood of information and data. Computing is now essential to science discovery like never before. We are the modern pioneers pushing the bounds of science for the betterment of society.”

One of the marvels of the show each year is SCinet, the fastest, most powerful scientific network in the world for one week. This year SCinet broke multiple records, achieving 3.63 Tbps of bandwidth, as well as the most floor fiber laid and the most circuits ever. SCinet takes three weeks to set up and operates with over $66 million in loaned state-of-the-art equipment and software. In the weeks ahead, we will have more coverage on how this fascinating feat is pulled off, as well as the ground-breaking networking research it enables.

The 50th edition of the Top500, covered in-depth here, was announced on Monday. This edition will go down in history as the one in which China pulled ahead in multiple dimensions, not just with the number one system (which China has claimed for ten consecutive lists), but with the highest number of systems and the largest flops share.

A roundup of benchmark winners from SC17:

Top500: China’s Sunway TaihuLight system (93 petaflops)

Green500: Japan’s RIKEN Shoubu system B, a ZettaScaler system (17 gigaflops/watt)

HPCG: Japan’s RIKEN K computer (0.6027 petaflops)

Graph500: Japan’s RIKEN K computer

For the second consecutive year, a Chinese team won the coveted Gordon Bell Prize, the Nobel prize of supercomputing, presented by the Association for Computing Machinery each year in association with SC. The 12-member Chinese team employed the world’s fastest supercomputer, Sunway TaihuLight, to simulate the 20th century’s most devastating earthquake, which occurred in Tangshan, China in 1976. The research project, “18.9-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of 18-Hz and 8-Meter Scenarios,” achieved greater efficiency than had been previously attained running similar programs on the Titan and TaihuLight supercomputers. You can read about the important practical implications of this work in the ACM writeup.

SC17 Student Cluster Champs: Nanyang Technological University, Singapore (Source: @SCCompSC)

In an awards ceremony Thursday, the team from Nanyang Technological University took the gold in the Student Cluster Competition. With its dual-node Intel Xeon 2699-based cluster accelerated by 16 Nvidia V100 GPUs, the team from Singapore pulled off a triple play, also setting record runs for both Linpack and HPCG. At 51.77 TFlop/s, the team’s SC17 Linpack score beat the previous record by nearly 40 percent, and its HPCG score of 2,055.85 was a nearly 50 percent improvement over the previous record-holder. Among the competitors deserving honorable mention this year is the first all-high-school team, from William Henry Harrison High. Check back next week for more extensive coverage of the contest and a rundown of the winning teams from our roving contest reporter Dan Olds.

Ralph A. McEldowney, SC18 General Chair

The HPCwire editorial team would like to congratulate everyone on their achievements this year. And we applaud our HPCwire Readers’ and Editors’ choice award winners, a diverse and exceptional group of organizations and people who are on the cutting-edge of scientific and technical progress. (The photo gallery of award presentations can be viewed on Twitter.)

We look forward to seeing many of you in June at the International Supercomputing Conference in Frankfurt, Germany, and then in Dallas for SC18, November 11-16, when SC will be celebrating its 30th anniversary. The SC18 website is already live and the golden key has been handed to next year’s General Chair Ralph A. McEldowney of the US Department of Defense HPC Modernization Program. If you’re partial to the Mile High City, you’re in luck because SC will be returning to Denver in 2019 under the leadership of University of Delaware’s Michela Taufer, general chair of the SC19 conference.

Stay tuned in the coming weeks as we release our SC17 Video Interview series.


How Cities Use HPC at the Edge to Get Smarter

Fri, 11/17/2017 - 20:30

Cities are sensoring up, collecting vast troves of data that they’re running through predictive models and using the insights to solve problems that, in some cases, city managers didn’t even know existed.

Speaking at SC17 in Denver this week, a panel of smart city practitioners shared the strategies, techniques and technologies they use to understand their cities better and to improve the lives of their residents. With data coming in from all over the urban landscape and worked over by machine learning algorithms, said Debra Lam, managing director for smart cities & inclusive innovation at Georgia Tech, who works on strategies for Atlanta and the surrounding area, “we’ve embedded research and development into city operations; we’ve formed a matchmaking exercise between the needs of the city and the most advanced research techniques.”

Panel moderator Charlie Catlett, director of the Urban Center for Computation and Data at Argonne National Laboratory, who works on smart city strategies for Chicago, said that the scale of data involved in complex, long-term modeling will require nothing less than the most powerful supercomputers, including the next generation of exascale systems under development within the Department of Energy. The vision for exascale, he said, is to build “a framework for different computation models to be coupled together in multiple scales to look at long-range forecasting for cities.”

“Let’s say the city is thinking about taking 100 acres and spending a few hundred million dollars to build some new things and rezone and maybe augment public transit,” Catlett said. “How do you know that plan is actually going to do what you think it will? You won’t until 10-20 years later. But if you forecast using computation models you can at least eliminate some of the approaches that would be strictly bad.”

With both Amazon and Microsoft in its metropolitan area, it’s not surprising that Seattle is doing impressive smart city work. Michael Mattmiller, CTO of Seattle, said good planning is necessary for a city expected to grow by 32 percent. Mattmiller said 75 percent of the new residents moving to Seattle are coming for jobs in the technology sector, and they will tend to have high expectations for how their city uses technology.

Some of Seattle’s smart city tactics are relatively straightforward, if invaluable, methods for city government to open the lines of communication with residents and to respond to problems faster. For example, the city developed an app called “Find It, Fix It” in which residents who encounter broken or malfunctioning city equipment (broken street light, potholes, etc.) are encouraged to take a cell phone picture and send a message to the city with a description of the problem and its location.

Of a more strategic nature is Seattle’s goal of becoming carbon neutral by 2050. The key challenges are brought on by the 100,000 people who come to the downtown areas each day for their jobs. The city’s Office of Sustainability collects data on energy consumption from sensors placed on HVAC and lighting systems in office buildings and retail outlets and has developed benchmarks for comparing energy consumption on a per-building basis, notifying building owners if they are above or below their peer group.
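To make the benchmarking idea concrete, here is a minimal sketch of the kind of peer-group comparison described above. It is purely illustrative: the field names, the peer-grouping rule, and the flagging step are assumptions for the example, not details of Seattle's actual program.

```python
from statistics import median

# Hypothetical sensor readings: annual kWh and floor area per building.
buildings = [
    {"id": "A", "type": "office", "kwh": 1_200_000, "sqft": 150_000},
    {"id": "B", "type": "office", "kwh": 2_100_000, "sqft": 160_000},
    {"id": "C", "type": "retail", "kwh": 300_000,  "sqft": 40_000},
    {"id": "D", "type": "retail", "kwh": 520_000,  "sqft": 45_000},
]

# Energy use intensity (kWh per square foot) is a common normalization.
for b in buildings:
    b["eui"] = b["kwh"] / b["sqft"]

# Group buildings by type, then benchmark each one against its peer-group median.
peer_groups = {}
for b in buildings:
    peer_groups.setdefault(b["type"], []).append(b["eui"])

for b in buildings:
    benchmark = median(peer_groups[b["type"]])
    status = "above" if b["eui"] > benchmark else "at or below"
    print(f"Building {b['id']}: {b['eui']:.1f} kWh/sqft, "
          f"{status} its {b['type']} peer benchmark of {benchmark:.1f}")
```

The real system would notify building owners based on such a comparison; the snippet only prints the result.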

Mattmiller said Amazon and Microsoft helped build analytics algorithms that run on Microsoft Azure public cloud. The program is delivering results; Mattmiller said energy consumption is down, with a reduction of 27 million tons of carbon.

Seattle also analyzed weather data and rainfall amounts, discovering that the city has distinct microclimates, with some sections of the city getting as much as eight more inches of rain (the total annual amount of rain in Phoenix) per year than others. This has led to the city issuing weather alerts to areas more likely to have rain events and to send repair and maintenance trucks to higher risk areas.

Transportation, of course, is a major source of pollution, carbon and frustration (30 percent of urban driving is spent looking for parking spaces). Seattle polled residents for ideas and held a hackathon that produced 14 prototype solutions, including one from a team of Microsoft employees who bike to work: they developed a machine learning program that predicts the availability of space on the bike racks attached to city buses, “an incredibly clever solution,” Mattmiller said.

In Chicago, Pete Beckman, co-director of the Northwestern-Argonne Institute of Science and Engineering at Argonne National Laboratory, helped develop the sensors placed throughout the city in its Array of Things project. While most sensors used by cities are big, expensive and sparsely deployed, Beckman said the project managers wanted to “blanket the city with sensors,” which would collect a broad variety of data and also have significant computational power – a “programmable sensor” that doesn’t just report data but one for which you can write programs to run on the device. They also wanted it to be attractive, so students at the Art Institute of Chicago were recruited to help design the enclosure.

“This becomes a high performance computing problem,” Beckman said. “Why do you need to run programs at the edge? Why run parallel computing out there? Because the amount of data we want to analyze would swamp any network. The ability to have 4K cameras, to have hyperspectral imaging, to have audio – all that (data) can’t be sent back to the data center for processing; it has to be processed right there in a small, parallel supercomputer. Whether it’s OpenCV (the Open Source Computer Vision Library), Caffe or another deep learning framework like TensorFlow, we have to run the computation out at the edge.”
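As a rough illustration of why that matters, the sketch below shows the shape of an edge pipeline in Python with OpenCV: frames are analyzed on the device and only a tiny summary ever leaves it. The model files, the detection threshold, and the "pedestrian count" output are hypothetical placeholders, not part of the Array of Things software.

```python
import cv2  # OpenCV, one of the libraries Beckman mentions

# Hypothetical pre-trained Caffe detector; any OpenCV-dnn-compatible model would do.
net = cv2.dnn.readNetFromCaffe("detector.prototxt", "detector.caffemodel")
cap = cv2.VideoCapture(0)          # camera attached to the edge node

pedestrian_count = 0
for _ in range(300):               # analyze a short burst of video locally
    ok, frame = cap.read()
    if not ok:
        break
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0, (300, 300))
    net.setInput(blob)
    detections = net.forward()     # inference happens on the device itself
    # Count confident detections; the raw pixels are never transmitted.
    pedestrian_count += int((detections[0, 0, :, 2] > 0.5).sum())

cap.release()
# Only this tiny summary, not the 4K video stream, goes back to the data center.
print({"pedestrians_seen": pedestrian_count})
```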

One scenario outlined was of a sensor detecting an out-of-control vehicle approaching a busy intersection; the sensor picks up on the impending danger and delays the pedestrian “WALK” sign and turns all the traffic lights in the intersection red. These are calculations that require HPC-class computing at the street corner.

Chicago is using its Array of Things sensors in other critical roles as well, such as real-time flood monitoring, tracking pedestrian, bicycle, car and truck traffic, and predictively modeling accidents.

“The questions for us in the parallel computing world,” Beckman said, “are how do we take that structure on our supercomputers and scale it in a way so we have a virtuous loop to do training of large-scale data on the supercomputer and create models that are inference-based, that are quick and fast, that can be pushed out to parallel hardware accelerated out on the edge? The Array of Things project is working on that now.”


SC17 Keynote – HPC Powers SKA Efforts to Peer Deep into the Cosmos

Fri, 11/17/2017 - 19:06

This week’s SC17 keynote – Life, the Universe and Computing: The Story of the SKA Telescope – was a powerful pitch for the potential of Big Science projects that also showcased the foundational role of high performance computing in modern science. It was also visually stunning, as images of stars and galaxies and tiny telescopes and giant telescopes streamed across the high definition screen that extended the length of the Colorado Convention Center ballroom’s stage. One was reminded of astronomer Carl Sagan narrating the Cosmos TV series.

SKA, you may know, is the Square Kilometre Array project, run by an international consortium and intended to build the largest radio telescope in the world; it will be 50 times more powerful than any other radio telescope today. The largest today is ALMA (the Atacama Large Millimeter/submillimeter Array), located in Chile, which has 66 dishes.

SKA will be sited in two locations: South Africa and Australia. The two keynoters, Philip Diamond, Director General of SKA, and Rosie Bolton, SKA Regional Centre Project Scientist and Project Scientist for the international engineering consortium designing the high performance computers, took turns outlining radio astronomy history and SKA’s ambition to build on that. Theirs was a swiftly moving talk, both entertaining and informative, with the flashing visuals adding to the impact.

Their core message: This massive new telescope will open a new window on astrophysical phenomena and create a mountain of data for scientists to work on for years. SKA, say Diamond and Bolton, will help clarify the early evolution of the universe, be able to detect gravitational waves by their effect on pulsars, shed light on dark matter, produce insight around cosmic magnetism, create detailed, accurate 3D maps of galaxies, and much more. It could even play a SETI-like role in the search for extraterrestrial intelligence.

“When fully deployed, SKA will be able to detect TV signals, if they exist, from the nearest tens, maybe 100, stars and will be able to detect the airport radars across the entire galaxy,” said Diamond, in response to a question. SKA is creating a new intergovernmental organization to run the observatory, “something like CERN or the European Space Agency, and [we] are now very close to having this process finalized,” said Diamond.

Indeed this is exciting stuff. It is also incredibly computationally intensive. Think about an army of dish arrays and antennas capturing signals 24×7, moving them over high speed networks to one of two digital “signal processing facilities,” one for each location, and then on to two “science data processor” centers (think big computers). And let’s not forget the data must be made available to scientists around the world.

Consider just a few of the data points that were flashed across the stage during the keynote presentation; the context will become clearer later.

It’s a grand vision and there’s still a long way to go. SKA, like all Big Science projects, won’t happen overnight. SKA was first conceived in the 1990s at the International Union of Radio Science (URSI), which established the Large Telescope Working Group to begin a worldwide effort to develop the scientific goals and technical specifications for a next generation radio observatory. The idea arose to create a “hydrogen array” able to detect the hydrogen (H) radiofrequency emission (~1420 MHz). A square kilometer of collecting area was required to see back into the early universe. In 2011 those efforts consolidated in a not-for-profit company that now has ten member countries (link to brief history of SKA). The U.S., which did participate in early SKA efforts, chose not to join the consortium at the time.

Although first conceived as a hydrogen array, Diamond emphasized, “With a telescope of that size you can study many things. Even in its early stages SKA will be able to map galaxies early in the universe’s evolution. When fully deployed it will conduct the fullest galaxy mapping in 3D, encompassing up to one million individual galaxies and covering 12.5 billion years of cosmic history.”

A two-phase deployment is planned. “We’re heading full steam towards critical design reviews next year,” said Diamond. Full construction of the first phase is expected to begin in 2019. So far €200 million has been committed for design, along with “a large fraction” of the €640 million required for first-phase construction. Clearly there are technology and funding hurdles ahead. Diamond quipped that if the U.S. were to join SKA and pony up, say, $2 billion, they would ‘fix’ the spelling of kilometre to kilometer.

There will actually be two telescopes, one in South Africa about 600 km north of Cape Town and another one roughly 800 km north of Perth in western Australia. They are being located in remote regions to reduce radiofrequency interference from human activities.

“In South Africa we are going to be building close to 200 dishes, 15 meters in diameter, and the dishes will be spread over 150 km. They [will operate] over a frequency range of 350 MHz to 14 GHz. In Australia we will build 512 clusters, each of 256 antennas. That means a total of over 130,000 2-meter-tall antennas, spread over 65 km. These low frequency antennas will be log-periodic dipoles and will cover the frequency range 50 to 350 MHz. It is this array that will be the time machine that observes hydrogen all the way back to the dawn of the universe.”

Pretty cool stuff. Converting those signals into data is a mammoth task. SKA plans two different types of processing center for each location. “The radio waves induce voltages in the receivers that capture them, and modern technology allows us to digitize them to higher precision than ever before. From there optical fibers transmit the digital data from the telescopes to what we call central processing facilities, or CPFs. There’s one for each telescope,” said Bolton.

Using a variety of technologies, including “some exciting FPGA, CPU-GPU, and hybrid” designs, the CPFs are where the signals are combined. Great care must be taken to first synchronize the data so it enters the processing chain exactly when it should, to account for the fact that the radio waves from space reach one antenna before reaching another. “We need to correct that phase offset down to the nanosecond,” said Bolton.

Once that’s done, a Fourier transform is applied to the data. “It essentially decomposes a function of time into the frequencies that make it up; it moves us into the frequency domain. We do this with such precision that the SKA will be able to process 65,000 different radio frequencies simultaneously,” said Diamond.
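For readers who want to see what that step looks like numerically, here is a tiny NumPy sketch of channelizing a sampled voltage stream with a Fourier transform. The signal, sample rate, and tone frequencies are made up for illustration; SKA's real channelizers are far more sophisticated (polyphase filter banks running on dedicated hardware).

```python
import numpy as np

fs = 1024            # illustrative sample rate in Hz (real digitizers run far faster)
t = np.arange(fs) / fs

# A toy "antenna voltage": two tones plus noise, as a stand-in for the sky signal.
voltage = (np.sin(2 * np.pi * 50 * t)
           + 0.5 * np.sin(2 * np.pi * 200 * t)
           + 0.1 * np.random.randn(fs))

# The FFT decomposes the time series into its frequency components.
spectrum = np.fft.rfft(voltage)
freqs = np.fft.rfftfreq(len(voltage), d=1 / fs)

# The strongest channels recover the tones we injected (about 50 Hz and 200 Hz).
strongest = freqs[np.argsort(np.abs(spectrum))[-2:]]
print(sorted(strongest))
```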

Once the signals have been separated into frequencies, they are processed in one of two ways. “We can either stack together the signals from the various antennas in what we call time domain data. Each stacking operation corresponds to a different direction in the sky. We’ll be able to look at 2,000 such directions simultaneously. This time domain processing detects repeating objects such as pulsars or one-off events like gamma ray explosions. If we do find an event, we are planning to store the raw voltage signals at the antennas for a few minutes so we can go back in time and investigate them to see what happened,” said Bolton.
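The "stacking" Bolton describes is essentially delay-and-sum beamforming: delay each antenna's stream by the amount appropriate for a chosen sky direction, then add the streams so signals from that direction reinforce one another. The toy sketch below, with made-up delays and only three antennas, shows the idea; it is not SKA code.

```python
import numpy as np

fs = 1000                      # samples per second (illustrative)
t = np.arange(2 * fs) / fs
pulse = np.exp(-((t - 1.0) ** 2) / 1e-4)   # a narrow pulse "arriving from the sky"

# The same pulse reaches three antennas with different geometric delays (in samples).
true_delays = [0, 15, 30]
streams = [np.roll(pulse, d) + 0.1 * np.random.randn(len(t)) for d in true_delays]

def beamform(streams, trial_delays):
    """Undo the assumed per-antenna delays and sum: one 'direction' per delay set."""
    aligned = [np.roll(s, -d) for s, d in zip(streams, trial_delays)]
    return np.sum(aligned, axis=0)

# Pointing at the right direction (correct delays) gives a much stronger peak than
# pointing somewhere else; each such stacking operation is one direction on the sky.
on_source = beamform(streams, [0, 15, 30]).max()
off_source = beamform(streams, [0, 0, 0]).max()
print(f"on-source peak {on_source:.2f} vs off-source peak {off_source:.2f}")
```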

This time domain data can be used by researchers to accurately measure pulsar signal arrival times – pulsars are a bit like cosmic lighthouses – and to detect the drift, if there is one, as a gravitational wave passes through.

“We can also use these radio signals to make images of the sky. To do that we take the signals from each pair of antennas, each baseline, and effectively multiply them together, generating data objects we call visibilities. Imagine this being done for 200 dishes and 512 groups of antennas – that’s 150,000 baselines and 65,000 different frequencies. That makes up to 10 billion different data streams. Doing this is a data intensive process that requires around 50 petaflops of dedicated digital signal processing.”
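The arithmetic behind those numbers is worth spelling out; a quick back-of-the-envelope check, sketched below, reproduces the figures quoted in the keynote.

```python
# Each pair of stations forms a baseline: n * (n - 1) / 2 pairs for n stations.
def baselines(n):
    return n * (n - 1) // 2

dish_baselines = baselines(200)      # the dishes in South Africa
station_baselines = baselines(512)   # the antenna clusters in Australia

total_baselines = dish_baselines + station_baselines
channels = 65_000                    # frequency channels quoted in the keynote

print(total_baselines)               # ~150,716 -- the "150,000 baselines"
print(total_baselines * channels)    # ~9.8 billion -- the "up to 10 billion data streams"
```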

Signals are processed inside these central processing facilities in a way that depends on the science to be done with them. Once processed, the data are sent via more fiber optic cables to the Science Data Processors, or SDPs. Two of these “great supercomputers” are planned, one in Cape Town for the dish array and one in Perth for the low frequency antennas.

“We have two flavors of data within the science processor. In the time domain we’ll do panning for astrophysical gold, searching over 1.5 million candidate objects every ten minutes, sniffing out real astrophysical phenomena such as pulsar signals or flashes of radio light,” said Diamond. The expectation is a roughly 10,000-to-1 ratio of negative to positive events. Machine learning will play a key role in finding the “gold.”
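As a hedged illustration of that "panning" step, the snippet below trains a toy classifier to sift a stream of candidates in which true events are vastly outnumbered by noise. The two features, the model choice, and the roughly 10,000-to-1 class balance are stand-ins built from the article's numbers, not SKA's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy candidate features (e.g., pulse width, peak signal-to-noise): mostly noise,
# with a handful of real events, echoing the keynote's 10,000:1 expectation.
n_noise, n_real = 100_000, 10
noise = rng.normal(0.0, 1.0, size=(n_noise, 2))
real = rng.normal(4.0, 1.0, size=(n_real, 2))       # real events look different
X = np.vstack([noise, real])
y = np.array([0] * n_noise + [1] * n_real)

clf = RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=0)
clf.fit(X, y)

# Score a fresh batch of candidates and keep only the promising handful.
batch = rng.normal(0.0, 1.0, size=(1_000, 2))
batch[:3] += 4.0                                     # hide three "real" events in it
scores = clf.predict_proba(batch)[:, 1]
print(np.argsort(scores)[-5:])                       # indices of the top candidates
```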

Making sense of the 10 billion incoming visibility data streams poses the greatest computational burden, emphasized Bolton: “This is really hard because inside the visibilities (data objects), the sky and antenna responses are all jumbled together. We need to do another massive Fourier transform to get from the visibility space, which depends on the antenna separations, to sky planes. Ultimately we need to develop self-consistent models not only of the sky that generated the signals but also of how each antenna was behaving and even how the atmosphere was changing during the data gathering.

“We can’t do that in one fell swoop. Instead we’ll have several iterations trying to find the calibration parameters and source positions and brightnesses.” With each iteration, bit by bit, fainter and fainter signals emerge from the noise. “Every time we do another iteration we apply different calibration techniques and we improve on them, but we can’t be sure when this process is going to converge, so it is going to be difficult,” said Bolton.

A typical SKA map, she said, will probably contain hundreds of thousands of radio sources. The incoming images are about 10 petabytes in size. Output 3D images are 5,000 pixels on each axis and 1 petabyte in size.

Distributing this data to scientists for analysis is another huge challenge. The plan is to distribute data via fiber to SKA regional centers. “This is another real game changer that the SKA, CERN, and a few other facilities are bringing about. Scientists will use the computing power of the SKA regional centers to analyze these data products,” said Diamond.

The keynote was a wowing multimedia presentation, warmly received by attendees. It bears repeating that many issues remain and schedules have slipped slightly, but it is still a stellar example of Big Science, requiring massively coordinated international efforts and underpinned by enormous computing resources. Such collaboration is well aligned with SC17’s theme – HPC Connects.

Link to video recording of the presentation: https://www.youtube.com/watch?time_continue=2522&v=VceKNiRxDBc


Argonne to Install Comanche System to Explore ARM Technology for HPC

Fri, 11/17/2017 - 18:05

Nov. 17, 2017 — The U.S. Department of Energy’s (DOE) Argonne National Laboratory is collaborating with Hewlett Packard Enterprise (HPE) to provide system software expertise and a development ecosystem for a future high-performance computing (HPC) system based on 64-bit ARM processors.

ARM is a RISC-based processor architecture that has dominated the mobile computing space for years. That dominance is due to how tightly ARM CPUs can be integrated with other hardware, such as sensors and graphics coprocessors, and also because of the architecture’s power efficiency. ARM’s capacity for HPC workloads, however, has been an elusive target within the industry for years.

“Inducing competition is a critical part of our mission and our ability to meet our users’ needs.” – Rick Stevens, associate laboratory director for Argonne’s Computing, Environment and Life Sciences Directorate.

Several efforts are now underway to develop a robust HPC software stack to make ARM processors capable of supporting the multithreaded floating-point workloads that are typically required by high-end scientific computing applications.

HPE, a California-based technology company and seller of high-level IT services and hardware, is leading a collaboration to accelerate ARM chip adoption for high-performance computing applications. Argonne is working with HPE to evaluate early versions of chipmaker Cavium’s ThunderX2 64-bit ARM processors for the ARM ecosystem. Argonne is interested in evaluating the ARM ecosystem as a cost-effective and power-effective alternative to x86 architectures based on Intel CPUs, which currently dominate the high-performance computing market.

To support this work, Argonne will install a 32-node Comanche Wave prototype ARM64 server platform in its testing and evaluation environment, the Joint Laboratory for System Evaluation, in early 2018. Argonne researchers from various computing divisions will run applications on the ecosystem and provide performance feedback to HPE and partnering vendors.

Argonne’s advanced computing ecosystem, chiefly its Argonne Leadership Computing Facility, a DOE Office of Science User Facility, supports a research community whose work requires cutting-edge computational resources — some of the most powerful in the world. For more than a decade, Argonne has been partnering with industry vendor IBM, and more recently, Intel and Cray, to produce custom architectures optimized for scientific and engineering research. These architectures not only feature custom processor systems, but novel interconnects, software stacks and solutions for power and cooling, among other things.

“We have to build the pipeline for future systems, too,” said Rick Stevens, associate laboratory director for Argonne’s Computing, Environment and Life Sciences Directorate. “Industry partnerships are critical to our ability to do our job — which is to provide extreme-scale computing capabilities for solving some of the biggest challenges facing the world today. Inducing competition is a critical part of our mission and our ability to meet our users’ needs.”

“By initiating the Comanche collaboration, HPE brought together industry partners and leadership sites like Argonne National Laboratory to work in a joint development effort,” said HPE’s Chief Strategist for HPC and Technical Lead for the Advanced Development Team Nic Dubé. “This program represents one of the largest customer-driven prototyping efforts focused on the enablement of the HPC software stack for ARM. We look forward to further collaboration on the path to an open hardware and software ecosystem.”

Argonne researchers may eventually contribute to development of the ARM system’s compilers, which are the programs that translate application code into instructions interpreted by the processor. In the past, the difficulty and expense of compiler development have impeded the adoption of alternative processor architectures by high-performance computing applications. Such obstacles are now mitigated by robust open source compiler projects, such as LLVM, which Argonne contributes to actively.

The Comanche collaboration will be presenting at different venues and showcasing a full rack of next generation ARM servers at the 2017 International Conference for High Performance Computing, Networking, Storage and Analysis (SC17) this week (booth #494).

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts leading-edge basic and applied scientific research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state and municipal agencies to help them solve their specific problems, advance America’s scientific leadership and prepare the nation for a better future. With employees from more than 60 nations, Argonne is managed by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.

The U.S. Department of Energy’s Office of Science is the single largest supporter of basic research in the physical sciences in the United States and is working to address some of the most pressing challenges of our time. For more information, visit the Office of Science website.

Source: Argonne National Laboratory


Julia Computing Wins RiskTech100 2018 Rising Star Award

Fri, 11/17/2017 - 14:09

NEW YORK, Nov. 17, 2017 — Julia Computing was selected by Chartis Research as a RiskTech Rising Star for 2018.

The RiskTech100 Rankings are acknowledged globally as the most comprehensive and independent study of the world’s major players in risk and compliance technology. Based on nine months of detailed analysis by Chartis Research, the RiskTech100 Rankings assess the market effectiveness and performance of firms in this rapidly evolving space.

Rob Stubbs, Chartis Research Head of Research, explains, “We interviewed thousands of risk technology buyers, vendors, consultants and systems integrators to identify the leading RiskTech firms for 2018. We know that risk analysis, risk management and regulatory requirements are increasingly complex and require solutions that demand speed, performance and ease of use. Julia Computing has been developing next-generation solutions to meet many of these requirements.”

For example, Aviva, Britain’s second-largest insurer, selected Julia to achieve compliance with the European Union’s new Solvency II requirements.  According to Tim Thornham, Aviva’s Director of Financial Modeling Solutions, “Solvency II compliant models in Julia are 1,000x faster than our legacy system, use 93% fewer lines of code and took 1/10 the time to implement.” Furthermore, the server cluster size required to run Aviva’s risk model simulations fell 95% from 100 servers to 5 servers, and simpler code not only saves programming, testing and execution time and reduces mistakes, but also increases code transparency and readability for regulators, updates, maintenance, analysis and error checking.

About Julia and Julia Computing

Julia is a high performance open source computing language for data, analytics, algorithmic trading, machine learning, artificial intelligence, and many other domains. Julia solves the two language problem by combining the ease of use of Python and R with the speed of C++. Julia provides parallel computing capabilities out of the box and unlimited scalability with minimal effort. For example, Julia has run at petascale on 650,000 cores with 1.3 million threads to analyze over 56 terabytes of data using Cori, the world’s sixth-largest supercomputer. With more than 1.2 million downloads and +161% annual growth, Julia is one of the top programming languages developed on GitHub. Julia adoption is growing rapidly in finance, insurance, machine learning, energy, robotics, genomics, aerospace, medicine and many other fields.

Julia Computing was founded in 2015 by all the creators of Julia to develop products and provide professional services to businesses and researchers using Julia. Julia Computing offers the following products:

  • JuliaPro for data science professionals and researchers to install and run Julia with more than one hundred carefully curated popular Julia packages on a laptop or desktop computer

  • JuliaRun for deploying Julia at scale on dozens, hundreds or thousands of nodes in the public or private cloud, including AWS and Microsoft Azure

  • JuliaFin for financial modeling, algorithmic trading and risk analysis including Bloomberg and Excel integration, Miletus for designing and executing trading strategies and advanced time-series analytics

  • JuliaDB for in-database in-memory analytics and advanced time-series analysis

  • JuliaBox for students or new Julia users to experience Julia in a Jupyter notebook right from a Web browser with no download or installation required

To learn more about how Julia users deploy these products to solve problems using Julia, please visit the Case Studies section on the Julia Computing Website.

Julia users, partners and employers hiring Julia programmers in 2017 include Amazon, Apple, BlackRock, Capital One, Comcast, Disney, Facebook, Ford, Google, IBM, Intel, KPMG, Microsoft, NASA, Oracle, PwC, Uber, and many more.

About Chartis Research

Chartis Research is a leading provider of research and analysis on the global market for risk technology. It is part of Infopro Digital, which owns market-leading brands such as Risk and WatersTechnology. Chartis’ goal is to support enterprises as they drive business performance through improved risk management, corporate governance and compliance, and to help clients make informed technology and business decisions by providing in-depth analysis and actionable advice on virtually all aspects of risk technology.

Source: Julia


MAX and Ciena Join Forces to Expand Opportunities for Collaborative Research in the Science and Higher Education Communities

Fri, 11/17/2017 - 14:01

COLLEGE PARK, Md., Nov. 17, 2017 — Mid-Atlantic Crossroads (MAX), a center at the University of Maryland (UMD) that operates a regional advanced cyberinfrastructure platform, and Ciena, a network strategy and technology company, have announced a strategic partnership that will leverage the resources of both organizations to enable and expand sophisticated research activities in the science and higher education communities.

A newly created 200 Gbps network connection will join together MAX and Ciena’s robust research infrastructures to facilitate technology development and testing in the areas of multi-domain, multi-layer software-defined networking (SDN), along with distributed systems integration. As a result, this interconnection will allow both organizations to expand the reach of their testbed facilities as well as their research activities, thus opening up new opportunities for scientific collaboration and innovation.

“MAX is delighted to partner with Ciena for such a unique technological opportunity,” said Tripti Sinha, Assistant Vice President and Chief Technology Officer of UMD/MAX. “By joining forces, two organizations with such notable research infrastructures will be able to create a very powerful resource that will better the scientific community and advance critical discoveries.”

The connection, which was made in Baltimore, MD, expands access to Ciena’s SDN research testbed which unites all of the key packet, optical, and software building blocks required to demonstrate the benefits of software-defined, multi-layer wide area networks (WANs). This testbed provides a high-scale, programmable infrastructure that can be controlled and adapted by network-level applications, and it provides open interfaces to coordinate computing, storage, and network resources in a unified, virtualized environment. In collaboration with Ciena, community partners CANARIE, ESnet, Internet2, and StarLight, were also instrumental in the development of this unique resource.

“Ciena is delighted with the addition of MAX to our research network environment. In 2013 when Ciena unveiled the OPn Research on Demand Testbed, it was clear scientific applications and large-scale data sharing were rapidly consuming more network bandwidth and needed much greater network agility to maximize use of these valuable resources. The University of Maryland’s research network, MAX, joins our collaborative environment of network innovation, with CANARIE, Internet2, ESnet, STARlight and other collaborators,” stated Rod Wilson, Ciena’s Chief Technologist for Research Networks. “This new agreement will enable sustained long-term investigation and experimentation in high-performance networking. MAX brings new perspectives and capabilities that will contribute to leap-frog advancements in networking innovations.”

Through this strategic partnership, the national and global research communities will also benefit from greater access to MAX’s dynamic research infrastructure, which includes SDN as a mechanism for the integration of network services with compute, storage, instruments, and application systems and workflows. The MAX research infrastructure also includes facilities for dynamic network control, as well as connections to flexible programmable edge systems, high-performance computational facilities, data repositories, and cloud infrastructures.

“University of Maryland researchers are always at the forefront of new ideas in networking and data science, and as a result, the technology to enable those innovations must always stay one step ahead,” said Jeff Hollingsworth, Interim Chief Information Officer and Professor of Computer Science at UMD. “As one of the nation’s and one of the world’s premier research institutions, the University of Maryland will benefit greatly from access to this new resource created from the MAX-Ciena partnership.”

MAX and Ciena’s new partnership will soon be on full display at the 2017 Supercomputing conference in Denver, CO. MAX staff and connector organizations will utilize the new infrastructure to show a variety of demonstrations in the areas of multi-domain SDN, multi-100 Gbps end-system to end-system performance across the wide area, and high-performance distributed file system access.

About Mid-Atlantic Crossroads (MAX)

Mid-Atlantic Crossroads (MAX) is a center at the University of Maryland that operates a multi-state advanced cyberinfrastructure platform. MAX’s all-optical, Layer 1 core network is the foundation for a high-performance infrastructure providing state-of-the-art 100-Gbps network technology and services. MAX participants include universities, federal research labs, and other research-focused organizations in the Washington and Baltimore metropolitan areas. MAX serves as a connector and traffic aggregator to the Internet2 national backbone and peers with other major networks. Its mission is to provide cutting-edge network connectivity for its participants, tailored and generic data-transport solutions, and advanced services to accommodate and optimize large data flows and to facilitate network and application research. For more information about MAX and MAX services, please visit www.maxgigapop.net.

About Ciena

Ciena (NYSE: CIEN) is a network strategy and technology company. We translate best-in-class technology into value through a high-touch, consultative business model – with a relentless drive to create exceptional experiences measured by outcomes. For updates on Ciena, follow us on Twitter @Ciena, on LinkedIn, on the Ciena Insights blog, or visit www.ciena.com.

Source: MAX


Huawei Unveils Xilinx FPGA-Powered Cloud Server to North America at SC17

Fri, 11/17/2017 - 08:54

DENVER, Nov. 16, 2017 — Xilinx, Inc. (XLNX) and Huawei Technologies Co., Ltd. today jointly announced the North American debut of the Huawei FPGA Accelerated Cloud Server (FACS) platform at SC17. Powered by Xilinx high performance Virtex UltraScale+ FPGAs, the FACS platform is differentiated in the marketplace today.

Launched at the Huawei Connect 2017 event, the Huawei Cloud provides FACS FP1 instances as part of its Elastic Compute Service. These instances enable users to develop, deploy, and publish new FPGA-based services and applications through easy-to-use development kits and cloud-based EDA verification services. Both expert hardware developers and high-level language users benefit from FP1 tailored instances suited to each development flow.

The interactive demonstration at SC17 illustrates the large performance advantage of the Huawei FACS platform compared to an HPC class CPU-based platform through a video encoding and a compression scenario. Highlighted demos from NGCodec and DeePhi feature video encoding and deep learning capabilities, respectively. The FP1 demonstrations feature Xilinx technology which provides a 10-100x speed-up for compute intensive cloud applications such as data analytics, genomics, video processing, and machine learning. Huawei FP1 instances are equipped with up to eight Virtex UltraScale+ VU9P FPGAs and can be configured in a 300G mesh topology optimized for performance at scale. Both the cloud service and the platform technology for on premise solutions are highlighted in the demonstration.

The FP1 FPGA accelerated cloud service is available on the Huawei Public Cloud today. To register for the public beta, please visit http://www.hwclouds.com/product/fcs.html.

About Huawei 

Huawei is a leading global information and communications technology (ICT) solutions provider. Our aim is to enrich life and improve efficiency through a better connected world, acting as a responsible corporate citizen, innovative enabler for the information society, and collaborative contributor to the industry. Driven by customer-centric innovation and open partnerships, Huawei has established an end-to-end ICT solutions portfolio that gives customers competitive advantages in telecom and enterprise networks, devices and cloud computing. Huawei’s 180,000 employees worldwide are committed to creating maximum value for telecom operators, enterprises and consumers. Our innovative ICT solutions, products and services are used in more than 170 countries and regions, serving over one-third of the world’s population. Founded in 1987, Huawei is a private company fully owned by its employees.

About Xilinx

Xilinx is the leading provider of All Programmable FPGAs, SoCs, MPSoCs, RFSoCs, and 3D ICs. Xilinx uniquely enables applications that are both software defined and hardware optimized – powering industry advancements in Cloud Computing, Video/Vision, Industrial IoT, and 5G Wireless. For more information, visit www.xilinx.com.

Source: Xilinx


Registration of ASC18 Student Supercomputer Challenge Now Open

Thu, 11/16/2017 - 22:35

(Nov. 16, Denver) The ASC Student Supercomputer Challenge 2018 (ASC18) officially kicked off today at the Supercomputing Conference 2017 (SC17). Registration is now open to university students from around the world.

The ASC Challenge is the world’s largest student supercomputing competition, drawing thousands of university students since it was inaugurated six years ago. ASC17 saw 230 university teams join the challenge, with 20 teams moving on to the finals, which were held at the National Supercomputing Center in Wuxi, China, home to the world’s fastest supercomputer, Sunway TaihuLight.

Mr. Vangel Bojaxhi, representative of the Asia Supercomputer Community, introduced the competition schedule. ASC18 will be divided into three stages: registration and team setup, from November to December 2017; the preliminary contest, from January to March 2018; and the finals, in May 2018. Bojaxhi noted that the ASC committee planned to continue its cooperation with a large-scale supercomputer center and use that system as the platform for ASC18, and that artificial intelligence would be one of the themes of the competition.

Dan Olds, a senior journalist who has covered college student supercomputer competitions for many years, called the Challenge a “life-changing event” for many of its participants. He noted that in addition to expanding their computing and research knowledge, the competition can also help participants develop a better understanding of their career path.

The schools sending the teams also realize great value from the contests. Several universities have used the SCC as a springboard to build more robust computer science and HPC curricula. The contests also give the schools an opportunity to highlight student achievement, regardless of whether or not they win.

“Participating in the Challenge with the Sunway TaihuLight supercomputer was a very valuable experience,” said Li Yuxuan, a member of the Tsinghua University team, which emerged as ASC17 champion and winner of the challenge’s e Prize.

Team coaches Alexander Ditter and Jan Laukemann, who led the University of Erlangen-Nuremberg in its first entry to the Challenge this year, echoed Li’s sentiments. “ASC provided us an opportunity to communicate with other universities from around the world. It is a good platform for exchange on the latest trends in supercomputing applications and systems,” noted Jan Laukemann.

ASC is sponsored and organized by China and is generously supported by Asian, European and American experts and institutions. The main objectives of ASC are to promote the exchange and cultivation of young supercomputing talent across different countries, improve supercomputing application skills and R&D capacity, strengthen the driving role of supercomputing, and promote technical and industrial innovation.

 


ASC18 enrollment website: http://www.asc-events.org/ASC18/


2017 ACM Gordon Bell Prize Awarded to Chinese Team for 18.9 Petaflops Earthquake Simulation

Thu, 11/16/2017 - 18:14

DENVER, Nov. 16, 2017 – ACM, the Association for Computing Machinery (www.acm.org), has named a 12-member Chinese team the recipients of the 2017 ACM Gordon Bell Prize for their research project, “18.9-Pflops Nonlinear Earthquake Simulation on Sunway TaihuLight: Enabling Depiction of 18-Hz and 8-Meter Scenarios.” Using the Sunway TaihuLight, which is ranked as the world’s fastest supercomputer, the team developed software that ran efficiently at 18.9 Pflops (18.9 quadrillion calculations per second) and created 3D visualizations relating to a devastating earthquake that occurred in Tangshan, China in 1976. The team’s software included innovations that achieved greater efficiency than had been previously attained running similar programs on the Titan and TaihuLight supercomputers.

The ACM Gordon Bell Prize (awards.acm.org/bell) tracks the progress of parallel computing and rewards innovation in applying high performance computing to challenges in science, engineering, and large-scale data analytics. The award was presented today by ACM President Vicki Hanson and Subhash Saini, Chair of the 2017 Gordon Bell Prize Award Committee, during the International Conference for High Performance Computing, Networking, Storage and Analysis (SC17) (sc17.supercomputing.org/) in Denver, Colorado.

Although earthquake prediction and simulation is an inexact and emerging area of research, scientists hope that the use of supercomputers, which can process vast sets of data to address the myriad variables at play in geologic events, may lead to better prediction and preparedness. For example, the Chinese team’s 3D simulations may inform engineering standards for buildings being developed in zones known to have seismic activity. In this vein, many have advocated for a significant increase in the number of sensors to regularly monitor seismic activity. The Tangshan earthquake, which occurred on July 28, 1976 in Tangshan, Hebei, China, is regarded as the most devastating earthquake of the 20th century, and resulted in approximately 242,000-700,000 deaths. In developing their simulations for the Tangshan earthquake, the winning team included input data from the entire spatial area of the quake, a surface area of 320 km by 312 km, as well as 40 km deep below the earth’s surface. The input data also included a frequency range of the earthquake of up to 18 Hz (hertz). In the study of earthquakes, a hertz is a unit of measurement corresponding to the number of times an event happens per second – for example, the number of times the ground shakes back and forth each second during an earthquake. Previous simulations of violent earthquakes have employed frequencies lower than 18 Hz, since enormous memory and time consumption are needed for high frequency simulations.

This year’s winning team is not the first to develop algorithms for supercomputers in an effort to simulate earthquake activity. In the abstract of their presentation, the 2017 Gordon Bell recipients write: “Our innovations include: (1) a customized parallelization scheme that employs the 10 million cores efficiently at both the process and thread levels; (2) an elaborate memory scheme that integrates on-chip halo exchange through register communication, optimized blocking configuration guided by an analytic model, and coalesced DMA access with array fusion; (3) on-the-fly compression that doubles the maximum problem size and further improves the performance by 24%.”

Of its new innovations, the Chinese team adds that its on-the-fly compression scheme may be effectively applied to other challenges in exascale computing. In their paper, the authors state: “The even more exciting innovation is the on-the-fly compression scheme, which, at the cost of an acceptable level of accuracy lost, scales our simulation performance and capabilities even beyond the machine’s physical constraints. While the current compression scheme is largely customized for our specific application and the Sunway architecture, we believe the idea has great potential to be applied to other applications and other architectures.”

Winning team members include Haohuan Fu, Tsinghua University and National Supercomputing Center, Wuxi, China; Conghui He, Tsinghua University and National Supercomputing Center, Wuxi, China; Bingwei Chen, Tsinghua University and National Supercomputing Center, Wuxi, China; Zekun Yin, Shandong University; Zhenguo Zhang, Southern University of Science and Technology, China; Wenqiang Zhang, University of Science and Technology of China; Tingjian Zhang, Shandong University; Wei Xue, Tsinghua University and National Supercomputing Center, Wuxi, China; Weiguo Liu, Shandong University; Wanwang Yin, National Research Center of Parallel Computer Engineering and Technology, China; Guangwen Yang, Tsinghua University and National Supercomputing Center, Wuxi, China; and Xiaofei Chen, Southern University of Science and Technology, China.

Innovations from advanced scientific computing have a far-reaching impact in many areas of science and society—from understanding the evolution of the universe and other challenges in astronomy, to complex geological phenomena, to nuclear energy research, to economic forecasting, to developing new pharmaceuticals. The annual SC conference brings together scientists, engineers and researchers from around the world for an outstanding week of technical papers, timely research posters, and tutorials.

The Sunway TaihuLight is a Chinese supercomputer with over 10.5 million heterogeneous cores and is ranked as the fastest supercomputer in the world. Located at the National Supercomputer Center in Wuxi, Jiangsu, China, it is nearly three times as fast as the Tianhe-2, the supercomputer that previously held the world record for speed.

About ACM

ACM, the Association for Computing Machinery (www.acm.org) is the world’s largest educational and scientific computing society, uniting computing educators, researchers and professionals to inspire dialogue, share resources and address the field’s challenges. ACM strengthens the computing profession’s collective voice through strong leadership, promotion of the highest standards, and recognition of technical excellence. ACM supports the professional growth of its members by providing opportunities for life-long learning, career development, and professional networking.

About the ACM Gordon Bell Prize

The ACM Gordon Bell Prize (awards.acm.org/bell) is awarded each year to recognize outstanding achievement in high-performance computing. The purpose of this recognition is to track the progress over time of parallel computing, with particular emphasis on rewarding innovation in applying high-performance computing to applications in science. The prize is awarded for peak performance as well as special achievements in scalability and time-to-solution on important science and engineering problems and low price/performance. Financial support for the $10,000 awards is provided by Gordon Bell, a pioneer in high-performance and parallel computing.

Source: ACM

The post 2017 ACM Gordon Bell Prize Awarded to Chinese Team for 18.9 Petaflops Earthquake Simulation appeared first on HPCwire.

Oracle Announces Oracle Cloud Infrastructure Options for Enterprise, AI and HPC Applications

Thu, 11/16/2017 - 08:55

REDWOOD SHORES, Calif., Nov. 16, 2017 — Oracle today announced the general availability of a range of new Oracle Cloud Infrastructure compute options, providing customers with unparalleled compute performance based on Oracle’s recently announced X7 hardware. Newly enhanced virtual machine (VM) and bare metal compute, and new bare metal graphics processing unit (GPU) instances enable customers to run even the most infrastructure-heavy workloads such as high-performance computing (HPC), big data, and artificial intelligence (AI) faster and more cost-effectively.

Unlike competitive offerings, Oracle Cloud Infrastructure is built to meet the unique requirements of enterprises, offering predictable performance for enterprise applications while bringing cost efficiency to HPC use cases. Oracle delivers 1,214 percent better storage performance at 88 percent lower cost per input/output operation (IO).

New Innovations Drive Unrivaled Performance at Scale

All of Oracle Cloud Infrastructure’s new compute instances leverage Intel’s latest Xeon processors based on the Skylake architecture. Oracle’s accelerated bare metal shapes are also powered by NVIDIA Tesla P100 GPUs, based on the Pascal architecture. Providing 28 cores, dual 25Gb network interfaces for high-bandwidth requirements and over 18 TFLOPS of single-precision performance per instance, these GPU instances accelerate computation-heavy use cases such as reservoir modeling, AI, and Deep Learning.
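
As a rough sanity check on the quoted single-precision figure, assume two Tesla P100s per accelerated shape (an assumption; the GPU count per instance is not stated here) and NVIDIA’s commonly cited ~9.3 TFLOPS FP32 peak for the PCIe P100:

# Back-of-envelope check of the "over 18 TFLOPS of single-precision performance per instance" claim.
# Assumptions (not from the article): 2 x Tesla P100 per accelerated shape,
# ~9.3 TFLOPS FP32 peak per PCIe P100.
p100_fp32_tflops = 9.3
gpus_per_instance = 2
print(f"estimated FP32 peak: {p100_fp32_tflops * gpus_per_instance:.1f} TFLOPS")  # ~18.6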

Oracle also plans to soon release NVIDIA Volta architecture-powered instances with 8 NVIDIA Tesla V100 GPUs interconnected via NVIDIA NVLINK to generate over 125 TFLOPS of single-precision performance. Unlike the competition, Oracle will offer these GPUs as both virtual machines and bare metal instances.  Oracle will also provide pre-configured images for fast deployment of use cases such as AI. Customers can also leverage TensorFlow or Caffe toolkits to accelerate HPC and Deep Learning use cases.
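
For customers bringing their own frameworks rather than the pre-configured images, a minimal sanity check along the following lines (TensorFlow 1.x-era API, current at the time of this announcement; a generic sketch, not something shipped by Oracle) confirms that work is actually being placed on the instance’s GPUs:

import tensorflow as tf  # TensorFlow 1.x API

# Pin a small matrix multiply to the first GPU and log device placement,
# a quick way to verify the instance's GPUs are visible to the framework.
with tf.device('/gpu:0'):
    a = tf.random_normal([4096, 4096])
    b = tf.random_normal([4096, 4096])
    c = tf.matmul(a, b)

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    sess.run(c)  # the log should show MatMul assigned to /device:GPU:0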

“Only Oracle Cloud Infrastructure provides the compute, storage, networking, and edge services necessary to deliver the end-to-end performance required of today’s modern enterprise,” said Kash Iftikhar, vice president of product management, Oracle. “With these latest enhancements, customers can avoid additional hardware investments on-premises and gain the agility of the cloud. Oracle Cloud Infrastructure offers them tremendous horsepower on-demand to drive competitive advantage.”

In addition, Oracle’s new VM standard shape is now available in 1, 2, 4, 8, 16, and 24 cores, while the bare metal standard shape offers 52 cores, the highest Intel Skylake-based CPU count per instance of any cloud vendor. Combined with its high-scale storage capacity, supporting up to 512 terabytes (TB) of non-volatile memory express (NVMe) solid state drive (SSD) remote block volumes, these instances are ideal for traditional enterprise applications that require predictable storage performance.

The Dense I/O shapes are also available in both VM and bare metal instances and are optimal for HPC, database applications, and big data workloads. The bare metal Dense I/O shape is capable of over 3.9 million input/output operations per second (IOPS) for write operations. It also includes 51 TB of local NVMe SSD storage, offering 237 percent more capacity than competing solutions1.

Furthermore, Oracle Cloud Infrastructure has simplified management of virtual machines by offering a Terraform provider for single-click deployment of single or multiple compute instances for clustering. In addition, a Terraform-based Kubernetes installer is available for deployment of highly available, containerized applications.

By delivering compute solutions that leverage NVIDIA’s latest technologies, Oracle can dramatically accelerate its customers’ HPC, analytics and AI workloads. “HPC, AI and advanced analytic workloads are defined by an almost insatiable hunger for compute,” said Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA. “To run these compute-intensive workloads, customers require enterprise-class accelerated computing, a need Oracle is addressing by putting NVIDIA Tesla V100 GPU accelerators in the Oracle Cloud Infrastructure.”

“The integration of TidalScale’s inverse hypervisor technology with Oracle Cloud Infrastructure enables organizations, for the first time, to run their largest workloads across dozens of Oracle Cloud bare metal systems as a single Software-Defined Server in a public cloud environment,” said Gary Smerdon, chief executive officer, TidalScale, Inc. “Oracle Cloud customers now have the flexibility to configure, deploy and right-size servers to fit their compute needs while paying only for what they use.”

“Cutting-edge hardware can make all the difference for companies we work with like Airbus, ARUP and Rolls Royce,” said Jamil Appa, co-founder and director of Zenotech. “We’ve seen significant improvements in performance with the X7 architecture. Oracle Cloud Infrastructure is a no-brainer for compute-intensive HPC workloads.”

About Oracle

The Oracle Cloud offers complete SaaS application suites for ERP, HCM and CX, plus best-in-class database Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) from data centers throughout the Americas, Europe and Asia. For more information about Oracle (NYSE: ORCL), please visit us at oracle.com.

Source: Oracle

The post Oracle Announces Oracle Cloud Infrastructure Options for Enterprise, AI and HPC Applications appeared first on HPCwire.

Inspur wins contract for NVLink V100 based Petascale AI Supercomputer from CCNU

Thu, 11/16/2017 - 08:50

Inspur announced at SC17 that it has been awarded a contract to design and build a petascale AI supercomputer based on “NVLink + Volta” for Central China Normal University (CCNU), supporting the university’s ongoing research in frontier physics and autonomous-driving AI.

The supercomputer will comprise 18 Inspur AGX-2 servers as compute nodes, equipped with 144 of the latest NVIDIA Volta-architecture V100 GPUs supporting NVLink 2.0 and the latest Intel Xeon Scalable (Skylake) processors. It will run Inspur ClusterEngine, AIStation and other cluster management suites, with high-speed interconnect via Mellanox EDR InfiniBand. The peak performance of the system will reach 1 petaflops. With NVLink 2.0 and Tesla V100 GPUs, the system will be able to support both HPC and AI computing simultaneously.
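
The 1-petaflops peak figure is consistent with simple arithmetic on the GPU count alone, assuming roughly 7 TFLOPS of double-precision peak per V100 (the exact value depends on the V100 variant and clocks):

# Rough check of the 1 PFLOPS peak claim for the CCNU system.
# Assumption (not from the article): ~7 TFLOPS FP64 peak per V100.
v100_fp64_tflops = 7.0
num_gpus = 144
print(f"GPU FP64 peak: {v100_fp64_tflops * num_gpus / 1000:.2f} PFLOPS")  # ~1.0, before the CPU contribution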

The Inspur AGX-2 is billed as the world’s highest-density AI server, supporting eight NVIDIA Tesla V100 GPUs with NVLink 2.0 enabled in a 2U form factor. NVLink 2.0 provides faster links between the GPUs, with a bisection bandwidth of 300 GB/s. The AGX-2 also offers strong I/O expansion capabilities, supporting 8x NVMe/SAS/SATA hot-swap drives and up to four EDR InfiniBand HCAs. The AGX-2 supports both air cooling and on-chip liquid cooling to optimize power efficiency and performance.

The AGX-2 can significantly improve HPC computing efficiency, delivering 60 teraflops of double-precision performance per server. For VASP, a code used extensively in physics and materials science, the AGX-2’s performance with one P100 GPU equals that of an 8-node cluster of mainstream 2-socket CPU servers. The NVLink interconnect in the AGX-2 also delivers excellent multi-GPU parallel efficiency, with four P100 GPUs in parallel reaching the performance of nearly 20 nodes of mainstream 2-socket CPU servers.

For AI computing, the Tesla V100 used in the AGX-2 is equipped with Tensor Cores for deep learning, delivering up to 120 TFLOPS to greatly improve the training performance of deep learning frameworks with NVLink 2.0 enabled. Trained on the ImageNet dataset, the AGX-2 shows excellent scalability: configured with 8x V100 and training the GoogLeNet model with TensorFlow, the AGX-2 delivers 1,898 images/s, which is 7 times faster than a single card and 1.87 times faster than a similarly configured P100 system.
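
Working backwards from those quoted numbers gives the implied scaling efficiency; the single-card rate below is derived from the stated “7 times faster” figure rather than measured independently.

# Implied scaling from the quoted GoogLeNet/TensorFlow numbers.
images_per_sec_8gpu = 1898.0
speedup_over_1gpu = 7.0            # stated in the article
single_gpu_rate = images_per_sec_8gpu / speedup_over_1gpu
efficiency = speedup_over_1gpu / 8.0
print(f"implied single-V100 rate: {single_gpu_rate:.0f} images/s")  # ~271
print(f"8-GPU scaling efficiency: {efficiency:.0%}")                # ~88%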

Central China Normal University plans to further upgrade the AI supercomputer into a multi-petaflops system.

The post Inspur wins contract for NVLink V100 based Petascale AI Supercomputer from CCNU appeared first on HPCwire.

SC17 Student Cluster Competition Configurations: Fewer Nodes, Way More Accelerators

Thu, 11/16/2017 - 08:13

The final configurations for each team in the SC17 “Donnybrook in Denver” Student Cluster Competition have been released. Fortunately, every team received its equipment shipment on time and undamaged, so the teams are running the best versions of their clusters.

What’s really notable this year is the wide variety of configurations. Student Cluster Competition neophytes tend to think that, with the 3,000-watt power cap, every team would be using approximately the same configuration. But as you can see on the chart, the configs vary wildly. On the ultra-small side is Team Peking’s single-node GPU server. On the high side, we have multi-time champion teams Tsinghua and Texas with eight-node monsters.

Back in the day, an eight node student cluster was about average – with a 10 or 12 node cluster being the high-end of node counts. We saw 4-6 node systems, but they were considered to be too small to win the Overall Championship. How times change, right?

We’ve seen significant changes since just a year ago.

Today, the median node count is three, which is 25% smaller than the median at last year’s SC16 competition. While cores per CPU have risen significantly (20%), overall average CPU core count per cluster has dropped by close to 25% and the median CPU core count has dropped nearly in half.

Accelerator counts are up, not surprisingly, and many of the teams are using the latest NVIDIA V100 Volta GPUs. I’m expecting to see big LINPACK numbers from the teams using large numbers of GPUs.

This sets up an interesting story line:  Tsinghua has won the last two major competitions (ASC and ISC) with configurations that have more nodes than the other teams. However, the clear majority of teams believe that their “Small is Beautiful” approach is the correct path – who is correct? We’ll find out Thursday afternoon. Stay tuned…..

The post SC17 Student Cluster Competition Configurations: Fewer Nodes, Way More Accelerators appeared first on HPCwire.

Student Clusterers Demolish HPCG Record! Nanyang Sweeps Benchmarks

Thu, 11/16/2017 - 08:06

Nanyang pulled off the always-difficult double play at this year’s SC Student Cluster Competition. The plucky team from Singapore not only posted a world-record LINPACK, taking the Highest LINPACK Award, but also managed to notch the highest HPCG score. This is quite an achievement.

LINPACK and HPCG are the bookends of HPC system benchmarks. LINPACK is a dense linear algebra benchmark, a compute-bound solve of a huge system of equations that mimics the HPC workloads of yesteryear. Your LINPACK score represents the very highest numerical performance your system can achieve.

HPCG is a much newer benchmark that uses sparse matrix computations and irregular memory access to more closely mirror the HPC workloads of today. You can find out more about HPCG on the benchmark’s website. This is an incredibly stressful benchmark (for the system, not for the operator) that rigorously tests your server. However, unlike LINPACK, HPCG only takes 30 minutes to run, which means that any supercomputer, regardless of size, can easily find the time to power through an HPCG run.
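
To see why HPCG stresses a system so differently from LINPACK, consider a tiny conjugate-gradient solve on a sparse matrix: nearly all of the time goes into irregular, memory-bound sparse matrix-vector products rather than dense, cache-friendly math. The snippet below (Python/SciPy) only illustrates the kind of kernel HPCG exercises; it is not the benchmark itself.

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg

# A small sparse, symmetric positive-definite system (a shifted 1D Laplacian;
# the shift just keeps this toy example well conditioned so CG converges quickly).
n = 100_000
A = diags([-1.0, 2.5, -1.0], offsets=[-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

# Conjugate gradients: each iteration is dominated by a sparse matrix-vector
# product with irregular memory access -- the behaviour HPCG is designed to measure.
x, info = cg(A, b)
print("converged" if info == 0 else f"stopped early (info={info})")
print("residual norm:", np.linalg.norm(A @ x - b))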

Nanyang used their dual-node Intel Xeon E5-2699-based cluster along with 16 NVIDIA V100 GPUs to blast their way into cluster competition history with their record-breaking score of 2,055.85. This easily topped the 1,394.32 mark set by the Purdue/NEU team at ISC17. In fact, the top three finishers in the HPCG SC17 competition posted scores above the ISC17 record.

Student competitors have blown away benchmark records set only five months ago at the ISC17 event. The major difference maker seems to be new Intel Skylake CPUs and NVIDIA V100 accelerators. However, this is following the usual pattern. Each new cluster competition breaks the records set by the previous competition. It’s the circle of life in the tech business, right?

The post Student Clusterers Demolish HPCG Record! Nanyang Sweeps Benchmarks appeared first on HPCwire.

Atos and ParTec Selected as Industrial Partners for the Successor of the Jülich Supercomputer

Thu, 11/16/2017 - 07:53

JÜLICH, Germany, Nov. 16, 2017 — A figurehead of Forschungszentrum Jülich retires in the spring of 2018: JUQUEEN, for many years Europe’s fastest supercomputer, will make way for its successor. At the SC17 supercomputing conference currently taking place in Denver, USA, Forschungszentrum Jülich and the international IT company Atos agreed to install the first module of the system; the third partner is the Munich-based software company ParTec. The new system is to be operated as a national high-performance computer within the framework of the Gauss Centre for Supercomputing (GCS), to which the computing centres of Forschungszentrum Jülich (JSC), the Bavarian Academy of Sciences (LRZ) and the University of Stuttgart (HLRS) belong. GCS and its supercomputers are jointly supported and financed by the federal government and the three states in which the GCS centres are located. In addition, an expansion of the central Jülich storage system was agreed with the manufacturer Lenovo.

[Image: Supercomputer JUQUEEN. Copyright: Forschungszentrum Jülich / R.-U. Limbach]

“The Jülich Supercomputing Centre (JSC) is breaking new ground with its modular concept,” explains Prof. Thomas Lippert, Director of the JSC. The first module of the successor to JUQUEEN, which is now planned, is tailored to a wide range of very complex applications in simulation and data analysis. With a nominal peak performance of 12 petaflops, equivalent to 12 quadrillion computing operations per second, this first expansion stage will already deliver twice the computing power of its aging predecessor. Within two years, the system is to be completed by a second module within a common network; this second module is specially designed for applications that require the highest computing power.

The decision to choose Atos as the hardware supplier was made in a two-stage competitive procurement process run by the research centre. The module is to be installed in the first half of 2018 and is based on Atos’ Sequana architecture. “A high integration density and efficient hot-water cooling allow significant savings in operating costs,” explains Dr. Michael Stephan, the JSC’s technical expert for the system. “Unlike its predecessor, the new system cools the racks with water that can be much warmer than the normal ambient temperature. This means the water can itself be cooled directly with outside air, without spending additional energy on cooling.”

The supercomputer is developed in a co-design approach by the partners Atos, Forschungszentrum Jülich and ParTec. “As a leading manufacturer of supercomputer systems in Europe, we are particularly pleased to be taking this highly innovative path towards modular supercomputing with our partners,” says Dr. Martin Matzke, SVP Big Data and Security of Atos Germany. “Our main contribution to this co-design is our new Sequana architecture, specifically designed to meet the unique technological requirements of exascale computing.”

A new era of supercomputing

Modular supercomputing, an idea conceived by Dr. Lippert almost 20 years ago, was realised by JSC and ParTec in the EU-funded research projects DEEP and DEEP-ER, together with many partners from research and industry. “Since 2010, our experts have been developing the software that will in future unite several modules into a single system,” says Bernhard Frohwitter, CEO of ParTec. “Our goal is to provide the leading software for exascale.”

The JSC has recently shown that modular supercomputing actually works: With the expansion of JURECA (see press release dated November 13, 2017), a modular supercomputer with an innovative cluster booster architecture went into operation at the Jülich Supercomputing Centre (JSC) for the first time worldwide. Now the next modular system is in preparation.

Universal tool for science

Supercomputers have become a universal tool for science. Simulations on supercomputers are indispensable for testing scientific models in fields as different as quantum physics, climate research and neuroscience. At the same time, they allow insights into the structure and behaviour of important building blocks of life, fundamental material properties, or chemical processes under extreme conditions, insights that would not otherwise be possible for physical, technical, financial or ethical reasons.

The Jülich Supercomputing Centre procures and operates the new system as a member of the Gauss Centre for Supercomputing (GCS), the merger of the three national high-performance computing centres in Germany. The computing time is allocated to national and European projects via established peer review procedures. The GCS and Forschungszentrum Jülich are supported by the Federal Ministry of Education and Research and the Ministry of Culture and Science of North Rhine-Westphalia, as well as by the ministries in Baden-Württemberg and Bavaria.

Expansion of the central storage system

Furthermore, an expansion of the central Jülich storage system is planned. Together with the manufacturer Lenovo, Forschungszentrum Jülich has agreed on an extension and partial renewal of the system from 20.3 to a total of 81.6 petabytes. The globally accessible JUST (Jülich Storage) storage platform provides high-performance storage for the supercomputers at the JSC. For instance, the system stores data from major Jülich large-scale projects such as the European Human Brain Project and the Jülich Brain Atlas, as well as the Alpha Magnetic Spectrometer (AMS) – an antimatter detector installed on the International Space Station ISS.

The new JUST system will be based on Lenovo’s “Distributed Storage Solution for IBM Spectrum Scale” (DSS-G). The bandwidth will more than double as a result of the upgrade, allowing access speeds of up to 500 GB/s.

Source: Jülich Supercomputing Centre 

The post Atos and ParTec Selected as Industrial Partners for the Successor of the Jülich Supercomputer appeared first on HPCwire.

Student Cluster LINPACK Record Shattered! More LINs packed than ever before!

Thu, 11/16/2017 - 07:11

Nanyang Technological University, the pride of Singapore, utterly destroyed the Student Cluster Competition LINPACK record by posting a score of 51.77 TFlop/s at SC17 in Denver. The previous record, established by Germany’s Friedrich-Alexander-Universitat (FAU), was 37.05 TFlop/s.

Nanyang used a two-node Intel Xeon E5-2699-based cluster with a whopping 16 NVIDIA Volta V100 GPUs to take the trophy for Highest LINPACK (although there isn’t an actual trophy). Astonishingly enough, their high score, achieved using only 3,000 watts, is enough to top the most energy-efficient systems on the Green500.
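
That Green500 comparison is easy to check from the numbers given, assuming the team drew the full 3,000-watt budget during the run:

# Efficiency implied by the record run, assuming the full 3,000 W power cap was drawn.
linpack_tflops = 51.77
power_watts = 3000.0
gflops_per_watt = linpack_tflops * 1000.0 / power_watts
print(f"{gflops_per_watt:.2f} GFLOPS/W")                  # ~17.26, in the territory of the Green500 leaders
print(f"{37.05 * 1000.0 / power_watts:.2f} GFLOPS/W")     # ~12.35 for the previous student record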

However, the way that Nanyang achieved their score was a bit risky: they powered down their fans for the duration of the (very short) LINPACK run. Their bet paid off in a new world record, although I wouldn’t recommend that anyone try this on their home clusters.

The next top finisher was NTHU from Taiwan, another team with a low node count (only three) and a lot of GPUs. Their cluster sported eight V100 and four P100 accelerators for a total of twelve. Peking only has one actual HPC node, augmented with 10 V100s, while Utah brought four nodes and nine V100s.

Tsinghua had a larger cluster with six nodes, but still a heaping helping of eight V100 Tesla accelerators. Keen observers will note that the top three LINPACK finishers all broke the record established at ISC17. Another thing they all have in common is that they’re using loads of NVIDIA V100s to drive their compute power.

I also want to point out that the all-high-school team, Henry Harrison High, managed to post a score in the top 10. That’s pretty impressive for a team of students so young and so new to the HPC field.

The average LINPACK score rose considerably, from 15.62 at ISC17 to 26.83 at SC17. The median LINPACK rose even more, moving to 29.68 from the ISC17 median of 12.35.

So Nanyang takes home the Highest LINPACK Award, which will probably be in the form of a certificate. I would like to be there when all of the teams try to explain to their parents and grandparents exactly what a “LINPACK” is and how you “pack the most LINs”.

The post Student Cluster LINPACK Record Shattered! More LINs packed than ever before! appeared first on HPCwire.

Nyriad to Demonstrate New Operating System for the SKA Precursor MWA Telescope at SC17

Wed, 11/15/2017 - 22:35

DENVER, Nov. 14, 2017 — Exascale computing company Nyriad Limited, the first commercial spin-out from the SKA, has been partnering with the International Centre for Radio Astronomy Research (ICRAR) to design, develop and deploy a ‘Science Data Processing’ (SDP) operating system for the SKA-Low precursor telescope, the Murchison Widefield Array (MWA).

Nyriad will be demonstrating the first commercial product from the collaboration, called Nsulate, a GPU-accelerated storage solution that increases storage resilience while reducing storage power requirements by over 50 percent.

ICRAR is engaged in several aspects of the pre-construction phase of the SKA program, including research, software engineering and data-intensive astronomy, and is collaborating with Nyriad on developing a GPU-accelerated OS architecture for next-generation supercomputers. This new OS will eliminate the need for dedicated storage infrastructure while dramatically reducing power consumption and infrastructure costs. It will increase performance by using the GPUs traditionally dedicated to data processing alone to also perform the supercomputer’s storage-processing functions, thereby keeping the storage very close to the compute nodes. Nyriad is further collaborating with ICRAR on GPU-accelerating ICRAR’s Daliuge graph processing framework to analyse vast streams of radio antenna data in real time.
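
Neither Nyriad nor ICRAR detail Nsulate’s internals here, but the general idea of letting compute GPUs also handle storage processing can be illustrated with a generic, RAID-style parity calculation. The sketch below (NumPy, purely illustrative and not Nyriad’s algorithm) shows the kind of bulk erasure-coding arithmetic such a design would offload to the accelerators alongside the science workload.

import numpy as np

# Generic illustration of storage processing as bulk arithmetic: XOR parity across
# data blocks (RAID-5 style). Real GPU-accelerated erasure coding is more elaborate
# (e.g. Reed-Solomon over finite fields), but the shape of the work is the same:
# large, regular array operations that map naturally onto GPU-style hardware.
rng = np.random.default_rng(1)
data_blocks = rng.integers(0, 256, size=(8, 1 << 20), dtype=np.uint8)  # 8 blocks of 1 MiB

parity = np.bitwise_xor.reduce(data_blocks, axis=0)  # one parity block

# Recover a "lost" block from the surviving blocks plus parity.
lost = 3
survivors = np.delete(data_blocks, lost, axis=0)
recovered = np.bitwise_xor.reduce(np.vstack([survivors, parity[None, :]]), axis=0)
print("recovered intact:", np.array_equal(recovered, data_blocks[lost]))  # True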

Director of ICRAR’s Data Intensive Astronomy Program, Professor Andreas Wicenec, stated, “Nyriad was founded following discussions and consulting work around the SKA data challenges. The Nyriad founders, Matthew A. Simmons (CEO) and Alex St. John, identified a need for innovative approaches merging storage and processing which could benefit the SKA, but it became obvious that many of the ‘Big Data’ projects arising in other sciences, industries and governments would benefit as well.”

“The Universe is a big place,” said Nyriad CTO Alex St. John, “so we’ve been forced to rethink the entire software and hardware stack to come up with new computer designs that can handle the data processing volumes necessary to map the cosmos. Nsulate is the first of a suite of solutions that address these problems.” St. John is best known for his early pioneering work at Microsoft on creating the Direct3D API and DirectX media OS that gave rise to modern consumer and HPC GPUs.

The new OS is being demonstrated at SC17 at the booth of Nyriad partner TYAN, by HPC Systems at the SuperMicro booth, and in the Nyriad suite by appointment.

About Nyriad

Nyriad is a New Zealand-based exascale computing company specialising in advanced data storage solutions for big data and high performance computing. Born out of its consulting work on the Square Kilometre Array project, the company was forced to rethink the relationship between storage, processing and bandwidth to achieve a breakthrough in system stability and performance, capable of processing and storing over 160 Tb/s of radio antenna data in real time within a power budget that is out of reach for conventional IT solutions.

About ICRAR

The International Centre for Radio Astronomy Research (ICRAR) was founded in August 2009 with the specific purpose of supporting Australia’s bid to host the world’s largest radio telescope and one of the largest scientific endeavors in history, the Square Kilometre Array (SKA). ICRAR is a joint venture between Curtin University and The University of Western Australia (UWA), with funding support from the State Government of Western Australia. ICRAR has research nodes at both universities and is now host to over 150 staff and postgraduate students.

Source: Nyriad, ICRAR

The post Nyriad to Demonstrate New Operating System for the SKA Precursor MWA Telescope at SC17 appeared first on HPCwire.

NEC Supplies Top500 Supercomputer to Johannes Gutenberg University Mainz in Germany

Wed, 11/15/2017 - 19:21

DÜSSELDORF and TOKYO, Nov. 15 – NEC Corporation today (Nov. 16 in Tokyo) announced that NEC Deutschland GmbH has delivered an LX-series supercomputer to Johannes Gutenberg University Mainz (JGU), one of Germany’s leading research universities and part of the German Gauss Alliance consortium of excellence in high-performance computing (HPC). The new HPC cluster ranks 65th in the current TOP500 list of the fastest supercomputers in the world from November 2017 and 51st in the Green500 list of the most energy-efficient supercomputers.

This cluster extends the existing MOGON-II cluster, thereby providing a total computational capacity of approximately 1.9 Petaflop/sec. It offers high performance computing services for researchers at JGU and the Helmholtz Institute Mainz (HIM), a research institute specializing in high-energy physics and antimatter research. JGU is a member of the “Alliance for High-Performance Computing Rhineland-Palatinate” (AHRP) and offers access to MOGON-II to all universities in Rhineland-Palatinate.

The new MOGON-II HPC cluster upgrade consists of 1,040 dual-socket compute nodes, each equipped with two Intel Xeon Gold 6130 CPUs, with a total memory of 122 TB across the cluster.

The nodes are connected through a high-speed Intel Omni-Path network with a topology that allows continuous expansion of the system, which meets the ongoing growth of HPC demand from researchers from JGU and HIM.

The MOGON-II cluster is connected to a 5 PetaByte NEC LxFS-z parallel file-system capable of 80 GigaByte/s bandwidth. This highly innovative ZFS-based Lustre solution provides advanced data integrity features paired with a high density and high reliability design.

“We have been working together with NEC for many years now, and we are happy to confirm that this collaboration has always been very fruitful to our research members and to the excellence in research at Mainz University. The high sustained performance and stability of NEC’s HPC solution, as well as the dedication and skill of their team continuously deliver exceptional results,” emphasizes Professor André Brinkmann, Head of the Zentrum für Datenverarbeitung and of the Efficient Computing and Storage Group at JGU.

“We are honored to see Johannes Gutenberg University Mainz and Helmholtz Institute Mainz, two highly respected members of the research community, adopt NEC’s latest HPC solution as part of extending the capabilities of the MOGON-II cluster,” said Yuichi Kojima, Vice President HPC EMEA at NEC Deutschland.

About Johannes Gutenberg University Mainz

With around 32,000 students and more than 4,400 academics and researchers from over 120 nations, Johannes Gutenberg University Mainz (JGU) is one of the largest research universities in Germany. Its main core research areas are in the fields of particle and hadron physics, the materials sciences, and translational medicine. The university campus is also home to four partner institutes involved in top-level non-university research: the Helmholtz Institute Mainz (HIM), the Max Planck Institute for Chemistry (MPI-C), the Max Planck Institute for Polymer Research (MPI-P), and the Institute of Molecular Biology (IMB).

About NEC Corporation 

NEC Corporation is a leader in the integration of IT and network technologies that benefit businesses and people around the world. By providing a combination of products and solutions that cross utilize the company’s experience and global resources, NEC’s advanced technologies meet the complex and ever-changing needs of its customers. NEC brings more than 100 years of expertise in technological innovation to empower people, businesses and society. For more information, visit NEC at http://www.nec.com.

Source: NEC Corp.

The post NEC Supplies Top500 Supercomputer to Johannes Gutenberg University Mainz in Germany appeared first on HPCwire.

Hyperion HPC Market Update: ‘Decent’ Growth Led by HPE; AI Transparency a Risk Issue

Wed, 11/15/2017 - 19:06

The HPC market update from Hyperion Research (formerly IDC) at the annual SC conference is a business and social “must,” and this year’s presentation at SC17 played to a standing-room-only crowd at a downtown Denver hotel. This writer has attended several of these breakfasts, and looking back at the evolving list of major trends identified by Hyperion reveals how incredibly fast advanced scale computing is changing.

For example, two years ago at SC15 in Austin, our coverage shows there was not a single mention of AI, machine learning or deep learning. This year, AI is the air we breathe, with the market straining to blast past HPDA into machine learning and deep learning to the point where Hyperion devoted significant time this morning to sounding warning calls about the power of AI and how lack of transparency threatens its future.

But before going into that, let’s first look at Hyperion’s industry update.

Earl Joseph, Hyperion CEO, portrayed 2016 as showing “some decent growth” for HPC server sales, which grew to $11.2 billion worldwide, with the strongest growth in the sector comprising systems priced at $250,000 and up. While the first half of 2017 shows an overall market decline of 3.5 percent, to $5.1 billion, Joseph said the first two quarters are typically softer than the second half of the year.

In the vendor race, HPE has a significant lead with a 36.8 percent share, followed by Dell EMC with 20.5 percent, Lenovo at 7.8 percent and IBM at 4.9 percent. Nearly half of global HPC server sales are in North America, and X86 processors are used in nearly 75 percent of HPC servers sold worldwide.

Hyperion has also studied the proliferation of HPC and its economic impact. Bob Sorensen, VP of research and technology, reported that there are nearly 800 HPC sites in the U.S. He also said a recent Hyperion study shows that HPC-reliant US economic sectors contribute almost 55 percent of US GDP, encompassing $9.8 trillion and accounting for over 15.2 million jobs.

Joseph said that while the high end of the HPC market has shown the strongest health with a relatively weak lower end, looking ahead, it will be the mid-market that will drive the industry to $14.8 billion by 2021, which Joseph called “fairly healthy growth.”

He also noted the spiky nature of the supercomputing industry, pointing out that three or four exascale-class machines are scheduled to be delivered in 2021 followed by only one or two in subsequent years, resulting in the industry showing “unbelievable growth followed by decline.”

Hyperion also analyses the industry by vertical sectors. The national labs, academia and the Department of Defense comprise nearly half of the total market, with CAE and bio-sciences representing the largest commercial sectors.

From a revenue perspective, AI (broadly defined) offers strong market potential. Senior research VP Steven Conway said Hyperion forecasts a rapid ramp-up to $1.2 billion by 2021. But AI – in particular its black-box nature – is a double-edged sword.

Specifically, a lack of transparency in learning machines poses a threat to the future of the nascent AI market.

He explained the “gigantic” difference between deep learning and machine learning. “With machine learning, you train the computer to do some things, and then it does them. With deep learning, you train the computer and then it goes beyond the training and learns on its own.”

He quoted recent warnings by Stephen Hawking that the development of AI “is one of the pivotal moments in human history,” and he cited Dr. Eng Lim Goh of HPE to the effect that “we humans can’t teach computers how to think like we do because we don’t know how humans think. So they (computers) think the way they do. That’s a big difference. They can think on their own. And they’re really capable of culture, of learning from each other – that’s been demonstrated.”

While the opacity of deep learning systems raises the dystopic specter of machines moving beyond human control, the more immediate problem of transparency is trust. Citing the famous case earlier this year of Google admitting it didn’t understand how its AlphaGo system defeated the world champion Go player, Conway said it’s one thing to not know how a machine won a board game and another to entrust your life, literally, to a computer – i.e., an autonomous car or AI-driven medical treatment.

“The problem with deep learning today is that in many instances it’s a black box phenomenon, and that needs to change,” Conway said.

Change is on the way.

In Germany, he said, the first national law has been passed that imposes restrictions and ethical standards on autonomous vehicles. Based on the work of a 14-person ethics commission, the law requires that self-driving cars allow humans to take over control of the car. Further, if it is proven that the car was in control when an accident occurred, then the automaker is liable; if a person was in control, then the driver is liable.

In addition, autonomous cars in Germany will not be programmed to make “demographic” ethical decisions along the lines of letting an elderly person die rather than a baby.

In the U.S., Conway said, limited steps have been taken to hold autonomous cars accountable, with the National Highway Traffic Safety Administration issuing a document requiring a “black box” in self-driving vehicles, similar to those on airplanes, that enables post facto reconstruction of accidents.

“When, as inevitably will happen, we start seeing some self-driving vehicles where they can’t avoid an accident and there’s injury or death involved, particularly in the early years, then insurance companies, automakers and families will want to know how it happened, why it happened,” Conway said.

AI accountability is also a major issue in the medical profession: if treatments based on deep learning system recommendations fail, questions will be asked. Conway said another key concern with precision medicine systems, such as IBM Watson, is that the data selected to train the system can be flawed due to human bias.

Hyperion also announced a mapping project that tracks the nearly 800 HPC sites around the country, along with an initiative to study the emergence of quantum computing. The company also announced the winners of its HPC user and vendor innovation awards.

The post Hyperion HPC Market Update: ‘Decent’ Growth Led by HPE; AI Transparency a Risk Issue appeared first on HPCwire.

Enyx Premieres TCP and UDP Offload Engines for Intel Stratix 10 FPGA On REFLEX CES XpressGXS10-FH200G Board

Wed, 11/15/2017 - 17:35

DENVER, Nov. 15, 2017 — Enyx, a world-class pioneer in ultra-low latency FPGA-based technology and solutions, is pleased to announce that its enterprise-class TCP/IP, UDP/IP and MAC network connectivity Intellectual Property (IP) cores for FPGAs and SoCs now support the high-performance REFLEX CES XpressGXS10-FH200G PCIe board, which features a Stratix 10 GX FPGA from Intel’s new top-of-the-line 14nm Stratix 10 family.

Enyx network connectivity IP cores address the growing throughput and hardware-acceleration needs of the datacenter industry, performing network protocol offloading for applications such as network-security-enabled NICs, smart NICs, high-performance data distribution, custom packet filtering and high-bandwidth bridges. Enyx also provides custom project implementation through Enyx Design Services as part of a complete and customized smart NIC or smart switch solution.

“We are pleased to collaborate with our valued partner REFLEX CES to offer the industry-first TCP and UDP full hardware stacks on Intel’s new, cutting-edge Stratix 10 FPGAs,” says Eric Rovira Ricard, VP Business Development North America at Enyx. “Intel is making FPGA technology ready for data centers, opening new areas for hardware offloading applications in high performance computing, and Enyx is proud to provide the most mature and feature rich network protocol stacks for seamless, FPGA-enabled network connectivity on the latest devices.”

“We are delighted to work with Enyx to offer the best-in-class UDP & TCP IP low latency reference design on our Stratix 10 FPGA board first to market for the Finance and Networking applications, and therefore providing a fast and trusted solution,” said Eric Penain, Chief Business Officer at REFLEX CES.

Enyx nxTCP and nxUDP IP cores feature full RTL Layer 2, 3 and 4 implementations with integrated 40G/25G/10G/1G MAC, compliant with the IEEE 802.3 standards and supporting the ARP, IPv4, ICMP, IGMP and TCP/UDP protocols. nxTCP and nxUDP are designed to work seamlessly in Intel (formerly Altera) and Xilinx FPGA and SoC designs. The Enyx TCP implementation on Intel Stratix 10 GX devices features latencies of less than 60 ns in transmission and 110 ns in reception, and can manage up to 32,768 TCP sessions in parallel.

The REFLEX CES XpressGXS10-FH200G is the first commercially available PCIe board supporting the 14nm Intel Stratix 10 FPGA family. The board carries the largest Stratix 10 density, 2,800 KLE, for processing-intensive and varied data algorithms, with a mix of DDR4 and QDR2+ memory. It has an optical interface capability of 200 Gbit via two QSFP28 cages and uses PCIe Gen3 x16. An additional 200 Gbit board-to-board interface is provided using a FireFly connection. The footprint is compatible with SoC FPGAs, enabling HPS access via the Ethernet interface on the PCIe bracket side. REFLEX CES is a certified board partner of Enyx.

Starting in 2018, the Intel Stratix 10 version downloadable package will be available and will include a reference design for the REFLEX CES XpressGXS10-FH200G PCIe board.

Enyx made this announcement today at the SC17 conference in Denver where it is currently presenting its technology product line and services.

About Enyx

Enyx is a leading developer and provider of FPGA-based ultra-low latency technologies and solutions. Enyx Technology & Design Services division provides design services and connectivity IP cores for FPGA and SoC, for tailored Smart NICs and Smart Switches. Enyx Technology & Design Services division has engaged with over 50 customers world-wide, including hedge funds, exchanges, top-tier investment banks, telecom operators, research labs, universities, and technology manufacturers for the defense, military, aeronautics, aerospace and high-performance computing industries.

For more information, visit www.enyx.com

About REFLEX CES

Recognized for its expertise in high-speed applications, analog and hardened systems, REFLEX CES has become a leading partner to major industrial companies. REFLEX CES simplifies the adoption of FPGA technology with its leading-edge FPGA-based custom embedded and complex systems. REFLEX CES FPGA network platforms offer greater flexibility and ease of programming, delivering faster and more powerful boards while reducing customers’ technology risk and time to market. The company provides FPGA COTS boards for several markets, including the finance market, where ultra-low latency is a key requirement, and other markets such as networking.

Source: Enyx

The post Enyx Premieres TCP and UDP Offload Engines for Intel Stratix 10 FPGA On REFLEX CES XpressGXS10-FH200G Board appeared first on HPCwire.
