Distributed Systems Laboratory

The Distributed Systems Laboratory hosts a series of internationally recognized activities and projects, carried out in collaboration between the California Institute of Technology (Caltech) in the U.S., the European Organization for Nuclear Research (CERN) in Switzerland, and University Politehnica of Bucharest (UPB). The activities primarily target research in large-scale distributed systems, with notable results in areas such as monitoring of distributed systems (the MonALISA project) and evaluation of distributed systems through modeling and simulation (the MONARC 2 project), among others.

The research team is a member of the Romanian Grid consortium, roGRID, which organizes the yearly Grid Initiative Summer School and other Grid-related events in Romania. The team includes PhD students writing their theses in the field of distributed computing, and team members have been awarded Oracle and IBM scholarships for their PhD theses. In addition, the Master's program “Parallel and Distributed Computer Systems”, for Romanian students, is co-organized by the Free University in Amsterdam and UPB (http://www.vu.nl/en/programmes/international-masters/types-of-masters/short-track-masters/index.asp).

The team was involved in the development of MonALISA (http://monalisa.cern.ch), a distributed monitoring framework built in a collaboration between UPB, Caltech and CERN. The system uses agent technology to provide a scalable, fault-tolerant, high-performance platform that performs a wide range of information gathering and processing tasks and provides this information in a dynamic, customized, self-describing way to other services or clients. The monitoring framework is currently used in production by several important communities: Open Science Grid, CMS, the ALICE experiment at CERN, and the USLHCNet high-speed transatlantic network. More than 350 MonALISA services are running throughout the world, monitoring more than 20,000 compute servers and thousands of concurrent jobs. More than 1.5 million parameters are currently monitored in near real time, with an aggregate update rate of approximately 15,000 parameters per second. The team received the CENIC (Corporation for Education Network Initiatives in California) 2006 Innovation Award for High-Performance Applications for the MonALISA project. Within the same collaboration, the team is also involved in the development of Fast Data Transfer (FDT, http://monalisa.cern.ch/FDT/), an application that supports efficient large-scale data transfers and helps with active monitoring of the available bandwidth between sites. FDT was successfully demonstrated at the SuperComputing conferences from 2007 to 2009, sustaining transatlantic data transfer rates of 110+ Gbps.

MONARC 2 (http://monarc.cacr.caltech.edu), also developed by a joint Caltech, CERN, and UPB team, is a simulation framework for analyzing the dynamic behavior of Grids. It allows building simulation models that capture specific characteristics of resources and activities. The simulator has been extended to study resource management policies and activity scheduling algorithms. Data collections provided by MonALISA helped in designing realistic and accurate simulation experiments. A different approach was taken in the development of VNSim, a VANET simulator (built in collaboration with the DS Lab at Rutgers, USA) that combines a complex model of vehicle mobility, a wireless network simulator, and an interface for integrating the emulation of vehicular applications.

The research in resource management and activity scheduling followed two directions: the architecture of the management system and optimization techniques. A decentralized, fault-tolerant, flexible, and scalable service-oriented architecture has been implemented (see DIOGENES, http://diogenes.grid.pub.ro/), which makes it easy to search for and use services for metascheduling, monitoring, prediction, security, etc. This research was conducted within the GridMOSI (http://www.gridmosi.ro/GridMOSI) national project. Several optimization techniques based on minimal response time, minimal system time, maximum productivity, service level agreements, economic models, genetic algorithms, and semantic-based approaches have been analyzed, improved, and validated in real distributed applications within the MedioGRID (http://mediogrid.utcluj.ro/app) national project. The team is a key partner in the PEGAF national project, which aims to develop a workflow management platform for distributed systems, targeted at scientific applications. This research focuses on building a new workflow engine with dedicated modules for fault tolerance, data management, dynamic binding, and workflow scheduling. The team is also actively involved in dependable systems research, leading the DepSys national project. That research focuses on original approaches to the development of models, methods, and techniques for increasing the reliability, availability, safety, and security of large-scale distributed systems, particularly Grids and Web-based distributed systems.

Within the NCIT Research Center, the Laboratory was awarded an FP6 project, EU-NCIT (NCIT leading to EU IST excellency), whose strategic objective was to reinforce the scientific and technical know-how and experience in elaborating and coordinating proposals and projects related to several IST research areas of FP7. The team is actively involved in European research projects in the field of distributed systems:
- Enabling Grids for E-sciencE III (EGEE-III) (http://www.eu-egee.org/)
- SEE-GRID-SCI FP7 project (http://www.see-grid.eu/)
- P2P-NEXT project (http://www.p2p-next.org/)

Coordinators: Prof. Nicolae Tapus, Prof. Valentin Cristea

Team: Corina Stratan, Florin Pop, Alex Costan, Ciprian Dobre, Eliana Tîrşa, Cătălin Leordeanu, Alex Herişanu, Mugurel Andreica, Mihai Capota