Performance tests and benchmarks in Java

Performance

According to the book “Optimizing Java: Practical Techniques for Improved Performance Tuning” [1], performance is defined by the following metrics:

Throughput

Throughput is a metric that represents the rate of work a system or subsystem can perform. This is usually expressed as the number of units of work completed in some time period.
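
For example, a web service that completes 12,000 transactions in 60 seconds has a throughput of 200 transactions per second.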

Latency

Explained with a plumbing metaphor, latency is the time taken to process a single transaction and see a result at the other end of the pipe.

Capacity

The capacity is the amount of work parallelism a system possesses—that is, the number of units of work (e.g., transactions) that can be simultaneously ongoing in the system.

Utilization

Utilization is the level of use of the different resources (CPU, RAM, etc.) under a given workload.

Efficiency

Dividing the throughput of a system by the utilized resources gives a measure of the overall efficiency of the system.
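
For example, a system that sustains 1,000 ops/s at 50% CPU utilization is twice as efficient as one that needs 100% of the same CPU to reach the same throughput.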

Scalability

The throughput or capacity of a system depends upon the resources available for processing. The change in throughput as resources are added is one measure of the scalability of a system or application.
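
For example, if doubling the number of servers from two to four raises throughput from 1,000 ops/s to roughly 2,000 ops/s, the system scales close to linearly for that workload.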

Degradation

If we increase the load on a system, either by increasing the number of requests (or clients) or by increasing the speed requests arrive at, then we may see a change in the observed latency and/or throughput.

If the throughput stops increasing, or latency starts to increase, we are observing the degradation of the system under the additional load.

Performance tests vs microbenchmarks

Using the “Top-Down Performance” approach, we can think of performance tests as large-scale benchmarking, and of microbenchmarks as measurements of the specific parts of the system that contribute to that larger metric.

One of the aspects of Java performance that many engineers miss at first encounter is that large-scale benchmarking of Java applications is usually easier than trying to get accurate numbers for small sections of code. [1].

There are different types of performance tests, depending on the questions they are intended to answer: for example, latency tests, throughput tests, load tests, stress tests, etc. These kinds of tests are centered on the user experience and are usually conducted using one or more use cases that represent a business transaction.

To categorize benchmarks in a more fine-grained manner, people invented categories such as “micro”, “macro”, or even “meso”, which separate performance tests by scope. We could say the separation is made by the affected layers or by the complexity level of the code under test [2].

Profilers

TO-DO

Microbenchmarking

Microbenchmarking is the measurement of small pieces of Java code. Microbenchmarks, also called performance unit tests [3], are small-scale and highly granular performance tests at the code level.

Most of the articles, books, and videos that I’ve seen say that it’s really difficult to do it correctly [1][4].

When to Microbenchmark

As the book “Optimizing Java: Practical Techniques for Improved Performance Tuning” [1] mentions, there are three main use cases for low-level analysis or microbenchmarking:

  1. You’re developing general-purpose library code with broad use cases.
  2. You’re a developer on OpenJDK or another Java platform implementation.
  3. You’re developing extremely latency-sensitive code (e.g., for low-latency trading).

In our case, we have reason number one: OpenTracing and Jaeger have limited knowledge about the contexts in which they will be used, and we need to start measuring to understand the performance overhead.

Disclaimer – experimental science

The dynamic nature of the Java platform, and features like garbage collection and aggressive JIT optimization, lead to performance that is hard to reason about directly. Worse still, performance numbers are frequently dependent on the exact runtime circumstances in play when the application is being measured. [1].

“The scary thing about microbenchmarks is that they always produce a number, even if that number is meaningless. They measure something; we’re just not sure what.”
Brian Goetz

Tools

There are some tools we can use for microbenchmarking; I’ll list them below:

Java Microbenchmark Harness – JMH

JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM. It was developed as part of the OpenJDK project.

It is the de facto standard for microbenchmarking.

Caliper

Caliper is a tool created by Google for measuring Java code performance, primarily focused on microbenchmarks.

From what I’ve read, it is less used in the community than JMH [5].

Playing with JMH

First test

I started writing my first benchmark tests using JMH, following the official JMH instructions.

First, I created a new project using the JMH archetype:

 
mvn archetype:generate \
-DinteractiveMode=false \
-DarchetypeGroupId=org.openjdk.jmh \
-DarchetypeArtifactId=jmh-java-benchmark-archetype \
-DgroupId=org.sample \
-DartifactId=jmh-examples \
-Dversion=1.0

When this finishes, we can import the new Maven project into our IDE.

Using this archetype, the project already declares the required dependencies, as you can see in the pom.xml:

<dependencies>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-core</artifactId>
        <version>${jmh.version}</version>
    </dependency>
    <dependency>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

I wrote one test to compare String concatenation using StringBuffer, StringBuilder, and the plus operator.


import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

public class StringConcatenation {

    /*
    Sometimes you may want to initialize some variables that your benchmark code needs,
    but which you do not want to be part of the code your benchmark measures.
    Such variables are called "state" variables.
    */

    @State(Scope.Thread)
    public static class StateVariables {
        String a = "Hello ";
        String b = "world";
    }

    // Each method returns its result, so the JIT cannot discard
    // the concatenation as dead code.
    @Benchmark
    public String testPlusConcatenation(StateVariables state) {
        return state.a + state.b;
    }

    @Benchmark
    public String testStringConcatenationStringBuilder(StateVariables state) {
        return new StringBuilder().append(state.a).append(state.b).toString();
    }

    @Benchmark
    public String testStringConcatenationStringBuffer(StateVariables state) {
        return new StringBuffer().append(state.a).append(state.b).toString();
    }
}
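
A classic pitfall here is dead-code elimination: if a benchmark method computes a value that is never used, the JIT compiler may remove the computation entirely, and the benchmark measures nothing. Returning the result, as in the code above, is one way to prevent this; another is to consume values explicitly with JMH’s Blackhole. Here is a minimal sketch (the class and method names are just illustrative):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

public class BlackholeExample {

    @State(Scope.Thread)
    public static class StateVariables {
        String a = "Hello ";
        String b = "world";
    }

    // Blackhole.consume() tells JMH to "use" the value, so the JIT
    // cannot eliminate the concatenation as dead code.
    @Benchmark
    public void testPlusConcatenationBlackhole(StateVariables state, Blackhole bh) {
        bh.consume(state.a + state.b);
    }
}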

Then I compiled using the console:

    mvn clean install

This step generates the artifact benchmarks.jar, located in the target folder of your project.

Then, I ran the benchmark via console:

    java -jar target/benchmarks.jar

The tests start, and you can see the progress in the console:

# JMH version: 1.21
# VM version: JDK 1.8.0_151, Java HotSpot(TM) 64-Bit Server VM, 25.151-b12
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: org.sample.StringConcatenation.testStringConcatenationStringBuilder

# Run progress: 66.67% complete, ETA 00:08:22
# Fork: 1 of 5
# Warmup Iteration   1: 48198317.966 ops/s
# Warmup Iteration   2: 57531090.669 ops/s
# Warmup Iteration   3: 58399042.978 ops/s
# Warmup Iteration   4: 58603168.507 ops/s
# Warmup Iteration   5: 59852000.695 ops/s
Iteration   1: 54818945.388 ops/s
Iteration   2: 59627298.159 ops/s
Iteration   3: 59569734.263 ops/s
...

At the end of the test, we have a warning:

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial experiments, perform baseline and negative tests that provide experimental control, make sure the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts. Do not assume the numbers tell you what you want them to tell.

And finally, we can see the test results:

Benchmark                                                  Mode  Cnt         Score         Error  Units
StringConcatenation.testPlusConcatenation                 thrpt   25  59032237.610 ± 2125379.789  ops/s
StringConcatenation.testStringConcatenationStringBuffer   thrpt   25  58221290.035 ± 1970418.629  ops/s
StringConcatenation.testStringConcatenationStringBuilder  thrpt   25  59049317.104 ± 1765752.180  ops/s

With these results, we can preliminarily conclude that StringBuilder gives the highest throughput for String concatenation, although the differences between the three variants fall within the reported error margins.
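
By default, JMH runs each benchmark with 5 forks and 5 warmup plus 5 measurement iterations of 10 s each, as shown in the console output above. These defaults can be overridden per benchmark class with annotations. Here is a minimal sketch (the class name and the chosen values are only illustrative):

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Thread)
@BenchmarkMode(Mode.Throughput)        // report operations per time unit
@OutputTimeUnit(TimeUnit.SECONDS)      // ops/s, matching the results table above
@Warmup(iterations = 3, time = 5)      // 3 warmup iterations of 5 s each
@Measurement(iterations = 5, time = 5) // 5 measured iterations of 5 s each
@Fork(2)                               // run in 2 fresh JVM forks
public class ConfiguredBenchmark {

    String a = "Hello ";
    String b = "world";

    @Benchmark
    public String concat() {
        return a + b;
    }
}

The same settings can also be passed on the command line, for example: java -jar target/benchmarks.jar -wi 3 -i 5 -f 2.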

Output files

We can write the results to an output file by specifying some parameters when we run the tests:

java -jar target/benchmarks.jar -rf csv -rff results.csv

In this example, we are exporting the results in CSV format (results.csv).
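
JMH supports other result formats as well; for example, we can export JSON instead (the file name is just an example):

java -jar target/benchmarks.jar -rf json -rff results.json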

Source code

The source code of this example and others is located here.

References

[1] https://books.google.ca/books/about/Optimizing_Java.html?id=fl1NswEACAAJ&redir_esc=y

[2] https://blog.codecentric.de/en/2017/10/performance-measurement-with-jmh-java-microbenchmark-harness/

[3] https://www.ifi.uzh.ch/dam/jcr:ccf1399a-2d57-4ff9-a3b0-59d69616d5d3/msr18-author-version.pdf

[4]

[5]

Other useful resources


How I became an Outreachy intern

What is Outreachy?

Outreachy provides three-month paid internships to work in Free and Open Source Software (FOSS) for people who are under-represented in tech.


Today I am starting my journey as an Outreachy intern for CNCF, and I want to share my experience, from the moment I heard about this program to how I got selected for the internship.

The beginning

One of my best friends, an open source enthusiast, told me about Outreachy a few years ago. I remember he told me: “You absolutely should apply to Outreachy; you can get an internship to contribute to FOSS”.

I remember the first “NO”s in my mind:

  • I am probably not as young as required.
  • I have never contributed to any FOSS project before.
  • I would have to learn a lot to be able to make the first contribution required for the application.
  • I’m not as proficient in English as I would like to be.

Back in those days, I was in a full-time job and didn’t have enough time, and the time passed… I started following Outreachy on Twitter, and occasionally I saw folks’ tweets sharing their experience of being an Outreachy intern.

Meeting a past Outreachy intern / Outreachy mentor

I had the opportunity this year to meet Vicky. She is an amazing human being and a really kick-ass engineer. She was an Outreachy intern and later a mentor for OpenStack. I asked her everything, from the most basic questions to guidance about which project to choose. She dispelled all my doubts and encouraged me to apply.

Application process

The stars were aligned for me: I was finishing a project and looking for new opportunities when I saw an Outreachy tweet announcing a new round for this year.

I followed the dates carefully, and at that moment I started researching the possible organizations/projects, which ones worked with a specific technology, and so on.

I knew that I wanted to work with something related to cloud computing and microservices, but I didn’t know exactly what.

Subscribing to the Outreachy mailing list helped me keep up to date with the important dates.

Filling out the application

You have to fill out an initial application. First, you have to verify that you meet the eligibility criteria, and then answer some specific questions about your experience in FOSS and share some personal experiences about being an under-represented person. This step was challenging for me because I was not used to writing in English and I didn’t want to make a bad first impression. I think the most important thing here is to be honest, take the time to write and review it as many times as you need to feel comfortable with it, and ask others for help reviewing it.

Choosing the project

This was the hard part for me; I found so many projects interesting. I spent a whole weekend researching each project and taking notes.

As Outreachy recommends, I chose two projects based on my criteria:

  • Is it related to my interests?
  • Does it have a supportive community?
  • Do I have the required skills, or do I want to learn the missing ones?

As a tip, if you don’t find a project that meets your interests, you can wait a little bit, because there will probably be other projects posted during the application period, and these usually have an extended deadline.

Making the contribution

Once you have chosen your project(s), you have to get in touch with the mentors and ask for advice on where to start. That depends on each community, but usually you will be assigned an issue/task tagged as “Outreachy” or “First issue” in the project’s bug tracker.

If you don’t get a quick response from a mentor, don’t panic. Remember that there is a time zone difference, and the mentor probably has many requests.

Then you have to read the project’s contribution instructions. Usually, the first step is reading the “Contributor guide” or “Developer guide” located in the repo or the project wiki.

These are my lessons learned in this phase:

  • Read the documentation carefully. It happened to me that I had a problem in my environment because I had missed a step in the instructions.
  • Ask for help. There are really nice people in each community who can help you when you are stuck on some problem. In my experience, if you need a quick response you should try the chat: IRC, Gitter, Slack.
  • Help others. Yes, you are in a competition, but you are in a community too, and you can make good friends if you are nice and help others.
  • Use your weekends. If you still have a full-time job like I had, take advantage of your weekends to do your best in your contributions.
  • Be constant. Maybe you do not have enough time in one day to finish a big task, but you can instead read the documentation, take a deep dive into the codebase, write an e-mail, or ask a question in the chat. Each of these steps can feel small, but by doing them every day you can achieve your goal.

Final conclusions

There are no superpowers required to apply for an Outreachy internship; the only requirements are to be eager to learn, follow the instructions, and be constant.

If you are interested but have doubts about whether or not to apply, ask past interns or mentors; they are really accessible people, just one chat message or one e-mail away.

If you don’t know which project to apply to: research, get involved in each community, and ask the community members.

And finally, have confidence. The first step to achieving a goal is to believe in it and to believe you are capable of doing it.


I want to thank Dmitry and Ilya for all the help and support. 

I’m also really grateful to my mentor, Juraci Paixão Kröhling (Red Hat), for all the guidance and patience that allowed me to get here.

Collaborative work – the teaching-learning process

Education 2.0

Collaborative work is defined as the intentional processes of a group to achieve specific objectives, together with the tools that support this productive work. An example of tools that help in this regard are the so-called information and communication technologies (ICT).

One of the challenges of Education 2.0 is to achieve meaningful learning using the ICT tools that are within students’ reach. These tools can range from everyday ones, such as office suites, browsers, and e-mail accounts, to more complex or more specialized tools such as Moodle, Blackboard, and others.

In the end, the choice of tools depends on which ones are available, which ones the teacher knows, and which ones are easiest for the students. What ultimately matters is to facilitate student-teacher and student-student interaction in the teaching and learning process, so that the tools serve as learning aids.

Below are some little-known collaborative work tools specific to the field of computer science:

  1. EclipseGavab: a customized version of the Eclipse development environment [11] especially designed for teaching. It supports the Project-Based Learning (ABP) methodology through collaborative features such as shared code editing, instant messaging, and version control. It supports the Pascal, C, and Java programming languages, which are widely used for teaching programming in the first years of computer science degrees [7]. It is a cross-platform tool, making it independent of the operating system, and it ships as a single installer, which makes it easy for novice students to use. It is used as the development environment for the programming assignments of the various courses taught by the Gavab group at Universidad Rey Juan Carlos, so students do not have to learn different applications for their programming practice. More information can be found at http://code.sidelab.es/projects/eclipsegavab/
  2. Virtual Programming Lab (VPL): a manager of programming assignments for Moodle. It brings the software development environment into the virtual classroom of the courses where it is used, enabling the submission, editing, and execution of programming assignments, as well as their continuous and automatic assessment [8]. A version for Moodle 2.0 is currently available, which includes, among others, the following features:
    • It can be used with Moodle Groups.
    • It allows the evaluation of submitted work.
    • The VPL activity index shows the number of submissions made by each working group and which of them have been evaluated.
    • It can be used with Pascal, Java, and C++, among other programming languages.
    • The tool lets teachers follow the students’ learning process and provide situated assistance, which is very important for novice students. It can also be proposed as a solution to the space and resource problems commonly found in the practical classes of programming courses. More information about VPL can be found at http://vpl.dis.ulpgc.es/.
  3. SABATO Instructional Environment: a customized software tool that integrates two teaching paradigms: Problem-Based Learning (PBL) and Computer-Supported Collaborative Learning (CSCL). It provides a collaborative learning environment for virtual teaching situations, through tools offering synchronous and asynchronous services that are very useful in computer-assisted teaching and learning [5]. Its most representative features are:
    • It allows remote access for all participants (students, teachers), from any location and independently of the platform.
    • It supports different user levels, providing restricted and selective access.
    • It supports multimedia content in the presentation of the problem situations.
    • It has an integrated chat for the members of a course.
    • It supports different roles: administrator, professor, teaching assistant, and student.
  4. HabiPro (Programming Habits): a pedagogical and collaborative tool developed as part of a doctoral thesis at the Universidad de Castilla – La Mancha, intended to stimulate in the students who use it the skills of observation, reflection, and relation, all of which are essential for training good programmers [9]. The application’s interface has two windows: one is the chat, which allows communication between the students, and the other is a shared window (the Work Window) where the students can collaborate to solve a problem [10]. The work window presents four different types of exercises:
    • Finding errors: this is an activity novice programmers face frequently, because the program they are developing does not work. It is useful for novice students to get into the habit of predicting which error or errors would prevent the program from working correctly.
    • Arranging a program in the correct order: this type of exercise helps the student learn the structure of a program. Viewing the ordered program allows the student to understand its functionality.
    • Predicting results: these exercises aim to show the importance of creating programs that are easy for other people to understand (with comments and meaningful variable names).
    • Completing programs: in these exercises the students must write a statement that was omitted. The intention is to show that different solutions exist; the student must be able to find the best one.

HabiPro has a client-server architecture, where the server maintains a Group Model that, as the tool is used, stores two kinds of information: one concerning pedagogical aspects and the other concerning social aspects. This information makes it possible to represent and characterize the working group. As the Group Model gathers more information about the users and the way they interact, the tool can present exercises that fit the group better, thereby adapting the system to the working group. [10]

Source
http://www.ead.unlp.edu.ar/blog/?p=257 (21/09/2014)

References

  • Jiménez Builes, J.A., Pavony Meneses, M., Álvarez Serna, A.F. (2008). Entorno de integración de PBL y CSCL para la enseñanza de algoritmos y programación en ingeniería. Avances en Sistemas e Informática, ISSN 1909-0056, Vol. 5, No. 3, pp. 189–194. Available at: http://www.revista.unal.edu.co/index.php/avances/article/view/10112 (accessed June 2012)
  • Pérez Pérez, J.R., Paule Ruiz, M.P., Cueva Lovelle, J.M. (2006). Capítulo 3. Sistemas orientados a la mejora de la calidad del software. IV International Conference on Multimedia and Information & Communication Technologies in Education (m-ICTE2006).
  • Gallego, M., Gortázar, F. (2009). EclipseGavab, un entorno de desarrollo para la docencia online de la programación. JENUI 2009, Barcelona, July 8–10, 2009. ISBN: 978-84-692-2758-9. Available at: http://jenui2009.fib.upc.edu/ (accessed June 2012)
  • Rodríguez del Pino, J.C., Rubio Royo, E., Hernández Figueroa, Z. (2010). VPL: Laboratorio virtual de programación para Moodle. Actas de las XVI Jornadas de Enseñanza Universitaria de Informática (Jenui 2010), pp. 429–435, Santiago de Compostela, July 2010.
  • Vizcaíno, A., Contreras, J., Favela, J., & Prieto, M. (2000). An adaptive, collaborative environment to develop good habits in programming. Proceedings of the 5th International Conference on Intelligent Tutoring Systems (pp. 262–271). Montreal, Canada.
  • Vizcaíno, A. (2001). Enhancing Collaborative Learning Using a Simulated Student Agent. Doctoral thesis, Universidad de Castilla-La Mancha. Available at: http://dialnet.unirioja.es/servlet/tesis?codigo=10644