Performance analysis is the task of monitoring the behaviour of a program during execution. The main goal is to identify the adjustments that can be made in order to improve the performance of the computer system in use. To achieve that improvement, it is necessary to find the different causes of, and contributors to, overhead. We are now in the multicore era, but there is a gap between the levels of development of the two main divisions of multicore technology: hardware and software. This project focuses on the issues of performance analysis, the tuning of applications running specifically on a shared-memory system, and the development of an application that automatically extracts system characteristics and configurations. The application is designed using an object-oriented design methodology (OODM), implemented in the C# programming language, and can be used on any Windows operating system. The application developed in this project critically analyses a multicore system, determines various causes of overhead in a multicore environment, extracts system parameters, and presents various optimization strategies.
With computers playing an increasingly critical role in our day-to-day lives, it is important to know their components, how each works, and what impact each has on the performance of the computer system.
According to Arnold (1994), computer performance is characterised by the amount of useful work accomplished by a computer system relative to the time and resources used; depending on the context, good computer performance therefore depends on the available system resources. Most computer users do not know their system's specification and lack knowledge of the conventional ways of extracting system parameters, but with the computerised system developed in this thesis (otherwise known as Autospec), every computer user will be able to determine the system configuration simply by installing and running the software.
System development can be likened to building a house: it demands adequate planning and preparation in order to meet the objectives of the proposed design.
The parameters, or resources, of interest in our analysis include the following:
- Operating system
- Hard drives
- Optical drives
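A few of the parameters listed above can be read directly from a running system. The following is a minimal sketch in Python using only the standard library; it is an illustration of the idea, not the thesis's actual Autospec tool, which is written in C# and queries more parameters (optical drives, for instance, are not exposed by the Python standard library):

```python
import os
import platform
import shutil

def collect_system_parameters(path=os.sep):
    """Gather a minimal subset of system parameters.

    A hedged sketch: the real Autospec application (C#) extracts far
    more configuration detail than the standard library exposes.
    """
    usage = shutil.disk_usage(path)  # total/used/free bytes of one volume
    return {
        "operating_system": f"{platform.system()} {platform.release()}",
        "logical_cores": os.cpu_count(),
        "disk_total_gb": round(usage.total / 1024**3, 1),
        "disk_free_gb": round(usage.free / 1024**3, 1),
    }

params = collect_system_parameters()
```

Running the sketch returns a dictionary describing the host it runs on, which is the same install-and-run model the thesis proposes for non-expert users.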
Performance analysis is the task of investigating the behaviour of program execution (Mario, 2009). The main aim is to identify the adjustments that can be made in order to enhance the performance of the computer system. Moreover, the hardware architecture and software platform (operating system) on which a program is executed have an impact on its performance. Workload characterization involves studying the user and machine environment, observing key characteristics, and developing a workload model that can be used repeatedly. Once a workload model is available, the effect of changes in the workload and system can easily be evaluated by changing the parameters of the model. The parallel behaviour of the workload itself can be expressed using compiler directives such as those of OpenMP for multithreaded applications. In addition, workload characterization can help you to determine what is normal, prepare a baseline for historical comparison, comply with management reporting, and identify candidates for optimization.
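As a concrete illustration of evaluating "what if" changes by varying a workload model's parameters, the sketch below uses a classical single-server (M/M/1) queueing model. The arrival rate and service time are hypothetical figures chosen for the example, not measurements from this thesis:

```python
def predicted_response_time(arrival_rate, service_time):
    """M/M/1 estimate of mean response time for a workload model.

    arrival_rate: jobs arriving per second (hypothetical parameter)
    service_time: mean seconds of service per job (hypothetical parameter)
    """
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("system is saturated; the model does not apply")
    # Mean response time for M/M/1: S / (1 - U)
    return service_time / (1.0 - utilization)

# At 8 jobs/s and 0.1 s of service each, utilization is 0.8 and the
# predicted response time is 0.1 / (1 - 0.8) = 0.5 s.
baseline = predicted_response_time(8, 0.1)
```

Re-running the model with a different arrival rate or service time answers the "effect of changes" question without touching the real system, which is exactly the repeatable-model benefit described above.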
Presently, multicore processor chips are being introduced in almost all areas where a computer is needed; for example, many laptop computers have a dual-core processor inside. High Performance Computing (HPC) addresses different issues, one of which is the exploitation of the capacities of multicore architectures (Mario, 2009).
Performance analysis and optimization is the field of HPC responsible for analysing the behaviour of applications that perform large amounts of computation. Such applications often require analysis and tuning; therefore, in order to achieve better performance, it is necessary to find the different causes of overhead.
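Finding the causes of overhead usually begins with measuring where execution time is actually spent. A minimal sketch of that first step using Python's built-in profiler follows; the function names are illustrative placeholders, and a real HPC analysis would additionally use lower-level tools such as hardware performance counters:

```python
import cProfile
import io
import pstats

def slow_kernel():
    """A deliberately heavy loop standing in for a compute kernel."""
    return sum(i * i for i in range(200_000))

def profile_report(func):
    """Run func under cProfile and return a text report of hot spots."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
    return buf.getvalue()

report = profile_report(slow_kernel)
```

The report ranks functions by cumulative time, pointing the analyst at the candidates for tuning before any optimization is attempted.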
There are a considerable number of studies related to the performance analysis and tuning of applications for supercomputing, but relatively few studies address applications running in a multicore environment.
A multicore system is composed of two or more independent cores (or CPUs). The cores are typically integrated onto a single circuit die (known as a chip multiprocessor or CMP), or they may be integrated onto multiple dies in a single chip package.
This thesis examines the issues involved in the performance analysis and tuning of applications running specifically on a shared-memory system, together with the development of a computerized system for retrieving system specifications for possible changes. Multicore hardware is relatively more mature than multicore software, and from that reality arises the necessity of this research. We would like to emphasize that this is an active area of research; there are only early results in the academic and industrial worlds in terms of established standards and technology, but much more will evolve in the years to come.
For several years, computer technology has been going through a phase of rapid development. In line with Moore's law, the speed of processors has been increasing very quickly: every new generation of microprocessor comes with a clock rate usually twice as fast as, or even much faster than, the previous one. That increase in clock frequency drove increases in processor performance, but at the same time the difference between processor speed and memory speed was growing. The gap was temporarily bridged by instruction-level parallelism (ILP) (Faxen et al., 2008); exploiting ILP means executing in parallel instructions that occur close to each other in the stream of instructions passing through the processor. However, it soon became apparent that more and more cycles were being spent not in the processor core but in the memory subsystem, which includes the multilevel caching structure, and the so-called Memory Wall problem grew quite significant because the increase in memory speed did not match that of the processor cores.
Very soon a new direction for increasing the overall performance of computer systems was proposed, namely changing the structure of the processor subsystem to place several processor cores on a single chip. These new computer architectures received the name of Chip Multi-Processors (CMP) and provided increased performance for a new generation of systems while keeping the clock rate of individual processor cores at a reasonable level. The result of this architectural change is that it became possible to provide further improvements in performance while keeping the power consumption of the processor subsystem almost constant, a trend that is essential not only to power-sensitive market segments such as embedded systems, but also to computing server farms, which suffer power consumption and dissipation problems as well.