Why I Withdrew From a Speaking Commitment

I have recently withdrawn from an upcoming speaking commitment. I wouldn’t have done this unless I felt it was absolutely necessary. I want to use this blog post to explain my reasons for doing so and, hopefully, to allow others to learn from my mistake.

I don’t consider myself to be the most principled of people. I have participated in a partial ‘pay to speak’ conference this year, and have submitted to another one for next year. But there were some things about this upcoming event that made me feel uncomfortable. I didn’t promote my participation in the event on social media, as I was not proud of being a part of it.

I want to start off by saying that I’m not particularly proud of having pulled out of a commitment, especially with only a few weeks until the event. It’s not something I make a habit of. I realise the mistake was mine in the first place: I committed to speaking at an event that, it turned out, I disagreed with.

I jumped at the chance to speak at this event when I was asked. As a new speaker, I was flattered to have been approached. From now on I will do more research into any event I submit to, or am approached by, before committing to speak.

My reasons for pulling out are as follows:

Lack of Diversity in the Programme

The main reason for me pulling out was the lack of diversity on the programme.

There were eight speaking slots at this event. Four of these slots were filled by white males. Another two were keynotes given to the event sponsors. That in itself doesn’t sit right with me – does being able to pay hundreds or thousands of pounds mean you will produce a great talk? One slot was filled by a female speaker. The last slot had not yet been announced.

The email promotion for the event contained pictures of the four white males only. This was not something I wanted to be a part of.

When I was asked to speak, I did raise my concerns about the lack of diversity. At that point I think I was about to become the third white male speaker. Unfortunately, I didn’t push my concerns hard enough. I should have used an example from Richard Bradshaw’s Speaker Rider. This is definitely something I will learn from when I look for my next speaking engagement.

Lack of Care for Speakers

I was already having major doubts about the event when I saw the lack of diversity in the programme. It was at this point I decided to do some more research into the event organisers and speak to others who have spoken at their events in the past.

It didn’t take me long to find out that the event organiser had, in the past, sold video content from previous conference sessions for hundreds of pounds. It also appears that speakers themselves were given a “generous” half-price discount on said session videos.

After agreeing to speak at the event, I had no social media interaction with the organisers. There was no invitation to promote my session with a video recording, for example, and they had done nothing to promote the event’s programme. The impression I got was that they didn’t have their speakers’ best interests at heart. No speaker costs at all were covered. They even misspelt my name in the promotional email.

I know I have said that I have spoken at a partial pay-to-speak conference in the past – but that conference at least covered accommodation and interacted with its speakers. If an event is making thousands of pounds from the content that speakers provide (both from tickets and from digital content after the event), then it can (and should) at least cover some travel and accommodation costs.

Elitism

I recently received an email from the organisers asking whether my employer was interested in sponsoring the event. If we paid thousands of pounds, our company would be given a speaking slot. As I mentioned above, surely having the money to sponsor an event does not automatically equal good content? I would suggest it can lead to a sales pitch rather than quality conference content. It’s safe to say that I did not forward this email on to my employer.

The same organisers also run industry awards that cost hundreds of pounds to enter which, for me, takes away any credibility those awards have. You’re essentially buying an award.

Finally…

Thanks to those of you who I spoke to about making this decision. I could have gone ahead with the commitment and voiced my displeasure to the organisers, but I knew I wasn’t going to be happy with myself if I did. The words would have felt empty.

Instead, I made the decision to withdraw my participation and I feel a massive relief in having done so. I know this was the right decision for me.


3Cs Board

Back in May I blogged about how using a Kanban board to visualise your team’s workload can really help with improving productivity.

One of the other visual management tools we used within the team was a Lean 3Cs Board. Our whiteboard was set up to allow us to raise Concerns within the team, identify the root Cause and then implement a possible Countermeasure to this root Cause. This was part of our team’s continuous improvement effort.

What is a 3Cs board?

The 3Cs in this case stand for Concern, (Root) Cause and Countermeasure. The idea is that at any point during a sprint, or at a Retrospective, a Concern can be raised by anyone within the team. The board consisted of the following columns:

  • Date Added – when was this Concern raised?
  • Concern – what is the Concern?
  • Cause – what is the root Cause of this concern?
  • Countermeasure – what possible Countermeasures are there to this root Cause? The team should select one of the suggested Countermeasures and implement it.
  • Action – what has the team decided the next step is to progress this Concern?
  • Action Owner – who is taking ownership of this Concern?
  • Target Date – when should the next Action be completed by?

Here’s a short video from the NHS describing a 3Cs board.

The Date Added column held the date the Concern was raised, while the Target Date was the date the team had agreed the next Action should be completed by.

For example, if I were to raise a Concern that it was too hot in the office then the next Action could be to identify the root Cause of it being too hot. We would update the Target Date to be a week (for example) in advance.

I could be the Action Owner, meaning it is my responsibility to ensure we have a root Cause identified by this Target Date. It doesn’t mean I have the responsibility of identifying the Cause myself, it’s just my role to ensure it keeps moving along.

How did this help our team?

We had always done Retrospectives in our team but we weren’t seeing any end results from the Actions identified in Retrospectives. Setting up a 3Cs board, and a weekly stand-up around this board to discuss the Concerns, ensured that we were always thinking towards a possible Countermeasure/solution.

The physical whiteboard was visible in the team’s area and not hidden away in an electronic document. In the past we had found that Actions weren’t happening because team members forgot about them.

It was also important that any Countermeasures implemented were agreed upon by the whole team. The team felt empowered to improve its own processes. Once we identified a final Countermeasure, the card was removed from the board. If we found that our Countermeasure didn’t actually solve the Concern, we were free to raise it again. It really helped us continuously improve our processes.

Example

We raised a Concern that we were being blocked from doing a specific task as we relied on an external team to do this work for us.

Our Cause was that we didn’t have permissions to carry out this task ourselves. Why? We hadn’t asked to be given these permissions before as it was always seen as the external team’s role. We now agreed that we wanted to be able to do as much work as possible within our own team. We wanted to own our work from beginning to end where possible.

The Countermeasure selected by the team was to speak with the external team and ask to be given permissions to carry out this task ourselves. This meant that if a similar task came up in the future we could handle it completely within our own team, with no reliance on an external one. This would improve our productivity, as similar tasks would no longer be put on hold in the middle of a sprint.


Using Kanban in the Scrum Team

In March I was selected to attend Craneware’s internal Lean training course, led by a Lean consultant. The business has committed to transitioning to Lean and, to facilitate this, several employees from different departments have been selected as ‘Lean Leaders’ to ensure the transition is driven from the bottom up as well as the top down.

Having completed the course, my aim is to apply the Lean theory I learned to my daily work. I’m going to write a series of blog posts detailing the different ideas I’ve applied and the benefits I’ve gained from them.

The first topic I’m going to write about is:

Kanban

One of the key takeaways from the course was the use of Kanban to visualise and manage workflow. You can read more about what Kanban is here. In this post I’ll discuss how I took this information back to my team and the success we’ve had in implementing Kanban so far.

We’ve been using a Kanban board in our team for six weeks. It wasn’t a hard sell to convince the other three team members that this was how we wanted to manage our work. Most of us had used Trello for personal projects, but perhaps hadn’t applied Lean techniques to their full effect whilst doing so.

The major difference was that this time we were going for a physical whiteboard with cards and magnets rather than managing work electronically. This raised a couple of questions – weren’t we duplicating work between TFS and our Kanban whiteboard? Doesn’t somebody need the data that is recorded in TFS?

After agreeing to be accountable myself for ensuring that TFS was maintained, and the required reports were still sent to management at the end of each sprint, the team seemed pretty enthusiastic about giving this a go. We’ve now found that the whiteboard is our source of truth and I only need to update TFS once every couple of days.

After six weeks we have identified the following advantages to using a physical whiteboard:

Visualisation

We started off by drawing the board out: horizontal swim lanes, one for each member of the team, and vertical columns headed ‘To Do’, ‘In Progress’ and ‘Done’. Since we still had to work within the Engineering department’s two-week sprints, we add our committed work to the ‘To Do’ column after each planning session. We know roughly at this stage who will be carrying out which tasks, but moving tasks in the ‘To Do’ column up or down a lane is not uncommon if, for example, a team member is becoming a bottleneck during a sprint.

An early implementation of the board is shown below. We had some early indicators of unplanned and Blocked work, but it looks a bit messy.

[Image: an early version of our Kanban board]

And here is what it looks like today. We have a much clearer key, and it looks a lot tidier with better-sized cards and tape (rather than pen, which rubs off) dividing the columns.

[Image: the Kanban board as it looks today]

We found early in our first sprint that using Kanban allowed us to visualise who the bottlenecks in the team were. Our team consists of me (a test specialist), two Developers and an Architect. Halfway through the first sprint the Architect and I still had the majority of our tasks to complete, but the two Developers had pretty much finished their work for the sprint. This was extremely easy for anyone walking past to identify because there were no planned tasks left in their ‘To Do’ columns.

It also became easier to visualise blocked tasks. By writing a big ‘B’ next to the task or using a specific colour of magnet we could see just how many of our tasks were blocked from moving into the next column. We can begin to ask why these are blocked and if the same reasons keep recurring then we can look to improve that process within the team.

Perhaps my favourite benefit was that we began to properly highlight unplanned work. It’s not uncommon for teams to miss their sprint commitments, and a lot of this is down to external influences handing urgent work to members of the scrum teams. By tracking this unplanned work on our Kanban board and recording the time spent on these tasks, we are able to identify exactly how big a problem it is. We can then raise this with the people who are giving us the work and highlight that it is affecting our productivity. This should encourage them to reconsider whether the work they are giving us is really urgent enough to interrupt our sprint.

Stand-Ups

As a team, we have found our stand-ups more productive. We found that we were asking the important questions, for example, ‘what is stopping you moving this card into the next column? Are you blocked?’. By using the board as a guide the conversations were more focused (no ‘ummm I’m not sure what I’ve done today… some coding, meetings… yeah..’) as all of the information we needed to discuss was in front of us.

By focusing on blockers, discussing urgent tasks which had appeared on our board, discussing cards which hadn’t moved in a while and highlighting process flaws (bottlenecks, lack of proper planning) we’ve found that stand-ups around the Kanban board are really productive.

Managing Work in Progress (WIP)

It’s an obvious one, and is probably the main reason most people begin to utilise Kanban, but our board ensured that we were only tackling one item of work at a time. I think everyone in software development is guilty of spinning too many plates at once, jumping between tasks etc.

We found that knowing there should only be one card in the ‘In Progress’ column at any one time, and holding each other accountable for keeping this the norm, made us less likely to jump between tasks. This brings benefits to the quality of our work: when you’re jumping between tasks you’re introducing waste and risk by constantly having to re-focus your mind on the task at hand.

Retrospectives

Finally, using a Kanban board has provided us with plenty of material for our end-of-sprint retrospectives. I’ll discuss how we have used a 3Cs board to identify concerns and create solutions within the team in another post. I just wanted to use this section to highlight how Kanban can help in identifying these concerns.

The things we’ve been able to highlight as problems in our team so far are:

  • As mentioned above we identified that the Architect and I were bottlenecks in our team due to being the only people who possessed the knowledge (and permissions) required to carry out our tasks.
  • We’ve also been able to highlight that there are often blockers within the team that prevent work from moving from ‘left to right’. Again, due to knowledge and permissions.
  • A lot of priority changes and urgent work are coming from the same external people, disrupting our flow.

By highlighting these problems through the use of our Kanban board we are able to continuously improve the flow and quality of our work.


Challenges I Face With Performance Testing – Part 2

In August last year I wrote a blog post detailing some challenges I had come across whilst attempting to create a performance testing strategy at Craneware. That post can be read here: Challenges I Face With Performance Testing.

Towards the end of that article I asked the following questions:

  • Which toolset should we use to create our performance tests?
  • How can we plug this into our Application Insights APM solution?
  • How do we best integrate this into our CI/build process?
  • Where should the tests be run from?
  • Which tests should be included in a Definition of Done?

I’m now going to use this post to give an update on where I got to with answering these.

Which toolset should we use to create our performance tests?

I quickly figured out that there’s no silver bullet when it comes to answering this question. A colleague and I evaluated a number of tools (Load Testing within Visual Studio, Artillery) but eventually settled on Apache JMeter.

It’s open source, meaning we needed no business sign-off to use it, and it’s one of the most widely adopted performance testing tools, so there is no shortage of support on Stack Overflow etc. It was also easy to get up and running as it has a GUI.

As easy as it was to get tests up and running, there were a lot of areas to explore. I made sure we read and re-read the (very good) documentation. I figured out early on that it’s really easy in performance testing to produce misleading results. I also had some frustrations around how much RAM and CPU JMeter used when running heavy load tests – the GUI is really intended for building and debugging test plans, with heavy runs better suited to non-GUI mode. I can’t emphasise enough the importance of following JMeter’s best practices.

Overall though, I’m happy with our choice. Teams found it easy to get to grips with after we created an initial test template and our performance test script repository is growing quite quickly.

How can we plug this into our Application Insights APM solution?

To be decided. We still have no answer to this. The results we use come from a JMeter graph generator plugin. Teams look into CPU usage etc from our APM solution as tests are running but it’s a very manual process.
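
To give a flavour of that manual process, this is roughly the kind of Analytics query someone might run against Application Insights while a test is executing – a sketch only, assuming the standard performanceCounters table (counter names vary by SDK and host):

performanceCounters
| where timestamp > ago(2h)
// processor time over the window the load test was running in
| where name == "% Processor Time"
| summarize AvgCpu = avg(value) by bin(timestamp, 5m), cloud_RoleInstance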

How do we best integrate this into our CI/build process?

Also still to be decided. My colleague and I are going to suggest we use BlazeMeter to host our JMeter test scripts as I believe the benefits far outweigh the cost. We’ll be making a business proposal in the near future.

The other option is to develop our own command-line tooling to create a build step in our API releases on TFS. This would take a lot of development effort both to create and to maintain.

Where should the tests be run from?

Currently our tests are being run from our local machines but we also now have several VMs hosted on Azure at our disposal for some of our heavier load tests. In our CD process we will be running the tests on VMs.

Which tests should be included in a Definition of Done?

We were successful in getting the following tests into the scrum teams’ Definition of Done. It’s a very small start, but it’s a start. We had no performance testing at all before the end of 2017 so I’m quite proud of what we achieved with the teams adopting these tests:

  • Single user load test – does the API meet our NFR with a single user making requests? This could be used as a smoke test in Production in the future.
  • Expected load test – teams define the number of users making requests before an API is developed. We then test with this number of users and ensure our NFRs are met.
  • Maximum expected load test – As above, but we run with the maximum number of requests we can expect in our system.

We’re obviously missing a few types of performance test here, namely soak, spike and stress tests.

I think our biggest victory is that teams are now talking about performance from the very beginning of development. Teams are re-working solutions based on performance concerns. This wasn’t happening this time last year, so I would declare it a victory, especially considering that neither I nor my colleague (the two of us responsible for delivering this strategy) is a dedicated performance testing expert. We’re testers within scrum teams who research this in our self-learning time.

We’ve now hired a Performance Engineer who will take the lead on further developing our performance testing strategy. I’m looking forward to working closely with her and learning from her experience.

Conference Week – March 2018

In the week beginning 12th March 2018 I was extremely fortunate to attend two conferences – UKSTAR in London and TestBash Brighton. I attended UKSTAR as a speaker, presenting for the first time, and I was lucky enough to win my TestBash ticket through the Testers Island Podcast. These were my first two testing conferences, as I had never previously been in a position to self-fund or secure funding through work.

I want to write about some areas that stood out for me over the course of the week. It’s a bit of a brain dump!

My First Talk

On the Monday at UKSTAR I presented my first talk. Prior to this I had delivered a 15-minute lightning talk at a local testing meetup. Unfortunately I wasn’t on stage until after 4pm (the last talk before the closing keynote) so I had the whole day to build myself up. Sometimes concentrating on other talks was a struggle because I kept thinking about my talk later in the day.

Surprisingly, I didn’t really feel properly nervous until just before my talk. I was lucky in that there was a 25 minute break between the previous talk finishing and my talk starting. This gave me plenty of time to get some water and get everything set up before I began. Alan Richardson (@eviltester) was great at talking to me right up until he introduced my talk, which distracted me from my nerves.


I think overall my talk went well. I got a lot of good feedback in person, on Twitter and in the official attendee feedback. I know I had a pretty long pause in the middle, which I put down to nerves as I completely lost my train of thought. I still found it really hard to shut out my own critical voice as people were complimenting me. I had decided beforehand that, no matter what, I was going to accept the praise, but inside I was thinking ‘yeah, but what about when I…’ or ‘I should’ve done this…’

The overall ‘buzz’ afterwards was great. It made the hours spent writing, re-writing, creating slides, rehearsing, re-writing and rehearsing again all worth it. I definitely want to talk at a conference again, but I don’t think I want to do more than one a year. My advice to my future self would be to include notes within the PowerPoint presentation in case I have a mind blank again, and to prepare the talk a bit further in advance!

Now that the event is over, and I’ve had the positive feedback, I’m ready for some constructive feedback. Feel free to get in touch to offer me some.

As well as giving my first full talk, I also gave a 99 Second Talk at TestBash. It was an idea I had the day before, so on the morning of the conference I wrote it in a notebook and added to it throughout the rest of the day. The final result can be found here.


Testing Community

The thing I was most excited about when heading down to London and Brighton was meeting people I had followed on Twitter for one to two years. I’d been speaking to Danny Dainton (@DannyDainton) a lot beforehand, as we were both speaking at a conference for the first time in the same week (and I’m delighted I got to see his talk in person!), so I was particularly excited about meeting him.

I felt that UKSTAR had a good community vibe during the breaks etc and I met a lot of great people there, but TestBash’s feeling of community was on another level.

Everything at TestBash is designed to enable conversation between attendees: the meetups throughout the week, the UnExpo, the single-track conference and the Open Space on the Saturday. I’m not sure if it’s because Ministry of Testing (MoT) puts events like this in place that the community feel is so strong, or if it’s the community feel that allows MoT to host these types of events throughout the week, but it works really well!

One thing I found difficult was actually introducing myself to people. Richard Paterson (@rocketbootkid – who gave a great talk) covers it well in his blog about UKSTAR. I think I said hi to most people I recognised, but I found it quite a scary thing to do.

The testing community is a great thing though. I really enjoyed the networking I did and I know if I have technical challenges at work I can reach out to some people for advice.

One thing that was mentioned at both events was the value in testers attending non-testing tech meetups. I’m going to attempt to attend a few local ones over the course of the year as I believe it will be valuable to me to understand what developers, architects, DevOps engineers etc are discussing and how testing fits into this.

Communities of Practice

One of the big themes of the week was Communities of Practice. This wasn’t a term I’d heard before the start of the week, but after hearing Christina Ohanian (@ctohanian) deliver a great keynote on the subject at UKSTAR, seeing Emily Webber (@ewebber) open TestBash with a brilliant talk on the same subject, and spotting a couple of TestBash Workshop and Open Space sessions on CoPs, I decided it’s something I need to explore further.

I’m in the process of setting up a community around improving our Continuous Delivery process at work. I initially called it a guild but I think community will be a better word for it.

This blog post by Lee Marshall (@nu_fenix) seems like a good starting point to me, as well as Emily’s book ‘Communities of Practice’.

Conference Formats

One thing at UKSTAR which perhaps didn’t work so well was the one-and-a-half-hour workshops. I attended two of the technical workshops and I don’t feel you could get hands-on enough in that time slot. That’s not to say the sessions weren’t valuable – Dan Billing’s in particular gave me a lot of good ideas.

The Conversation Track at UKSTAR was a cool idea. I enjoyed the session I attended with Simon Prior (@siprior) and Joel Montvelisky (@joelmonte) about introducing testing as a career to universities and offering more value than simply testing. It consisted of two great shorter presentations, and then the floor was opened to attendees and presenters to discuss the topic further. I think it worked well.

I really enjoyed TestBash’s single-track conference format. I think it’s great to have everyone at the conference watching the same talks, and it means there are extremely high-quality talks on a good variety of topics.

Overall

I had a great time attending both conferences and I can’t wait to attend my next one. I feel like my ‘conference week’ has helped, and will continue to help, me grow in my career.

Now that I’m finally back at work I need to put some of the ideas I’ve gathered from the different sessions and interactions into practice. I need to prove to my company that sending people to conferences is worthwhile.


Why Testers Need to be Involved Early

I’ve had a really good week at work. We’re moving our clients’ data from one database technology to another. This may not sound like anybody’s idea of fun – data migrations, new schema structure and learning how to support this new technology are all part of the challenge. But it’s a challenge that I’m enjoying.

The main reason I’m enjoying it is that testing has been considered from the very beginning.

My typical experience of solution design and requirement discussion is that developers meet with architects but do not include testers. Designs are drawn out on whiteboards and a high-level solution is agreed upon. This is then given to a scrum team for further refinement and efforting. This is the first time testers see the proposed solution.

The problem I’ve experienced with testers not being involved in the initial design discussions is that testing is not discussed in any great depth. Any limitations that we may have as testers may not initially be realised.

Prototypes are sometimes created from these design sessions, often using technologies that we as testers haven’t experienced before. This means that testers are playing catch-up in understanding the new technology when the work comes to the scrum team. How can we effectively test when we haven’t had the same exploration time that developers have had?

The project I’m currently working on has been different.

We had a very high-level design session including the whole team. In this session we were able to highlight which tests we would need to carry out as each section of the design was drawn on the whiteboard. These ranged from functional API and UI tests to performance testing and data integrity tests as the data is migrated. I’m going to discuss a few of the immediate benefits I’ve seen from this.

We need to write a tool which migrates documents from the old database solution to the new one – we were there to highlight the importance of this tool checking a percentage of these documents line by line and also checking all of the documents at a high level. It was decided we would pair up with developers when writing this tool and then run the migration ourselves on our test environment to ensure we have sufficient data integrity checks in place.

We highlighted our technical limitations in writing a tool to populate the database with hundreds of millions of documents, and raised our concerns with the team straight away. This wasn’t something we had had to create before, so we gave it a good shot. We couldn’t get the throughput to the level we needed (it was going to take months to populate the data), but we did write the logic for creating realistic test data. We raised this problem with the team and a developer worked alongside us to improve the performance of the tool. The benefits of us being involved early were 1) the team knew we might need support and kept this in mind, and 2) by working alongside developers, everyone knew the type of data we wanted to test.

The last benefit was to do with performance testing. We highlighted straight away that we needed Virtual Machines to achieve the throughput required for our tests to be realistic. We also highlighted that running JMeter in a distributed way wasn’t something we had done before. We worked with our architecture and DevOps teams and, since we’d raised this early, they knew exactly what we needed, in plenty of time. We’ve now got tests set up and ready to go a week or two ahead of when we need to run them. We also got valuable feedback from the team with regard to the tests we were writing and how to gather the metrics we needed for the tests from our production logs.

We’ve explained how we want our performance testing to evolve and they’re now totally on board. So by being involved early we haven’t just got buy-in for performance testing this solution, but for our longer-term performance testing approach.

Hopefully the value of having testers involved early will shine through at the end of this project. We’ve already highlighted risks and test plans at the earliest possible stage.

Going forward, I’m going to make sure that the teams know the value of having testers involved in whiteboard design sessions so that we can discuss a visible test plan early.

Challenges I Face With Performance Testing

Over the past few weeks, in my new role as Automation Engineer, I’ve been tasked with developing a performance testing roadmap. The initial aim of the roadmap is to have each of the scrum teams carrying out their own performance testing. We also want performance testing carried out on a nightly basis, as well as being part of our Continuous Integration pipeline.

As a company, we’re primarily using Microsoft products. Our build and release pipeline is hosted on an on-premise TFS, our tests are written in C# using Visual Studio and our logging comes from Application Insights.

Continuous Integration

Our initial plan was to use our API integration tests, including an assert on response time, and create load tests from these. Another solution was to create web performance tests to performance test user flow through the product. We envisaged having load tests being run after each build is deployed to our test environment and the web performance tests run on a nightly basis as a separate build process. We also wanted to hook our tests into Application Insights so that we were reporting exactly what is happening in the environment for the duration of the tests.

This all sounded great in theory. Unfortunately, with an on-premise version of TFS, we do not get some of the amazing cloud testing functionality you get with TFS online. This was a bit of a setback for us – it meant we would need to reconsider a) where the tests are run and b) how we pull back metrics from Application Insights.

It also raised a few more questions about our tooling – are we right to stick with Visual Studio Load/Web Performance tests?

As for where the tests are run – the obvious choice is on a VM. Do we host these VMs within the company or do we use Azure DevTest Labs? How do we get logging information (such as memory usage, database usage, response time etc.) from Application Insights? Do we make an API call to retrieve this information once the tests are finished?

These are just some of the questions raised by the fact we couldn’t plug straight into Microsoft’s cloud testing solution.

Definition of Done

Another challenge we’re facing is bringing performance testing into the scrum teams’ Definition of Done. From my reading, I believe we would at least want to test the following scenarios:

  • Single user performance test – to ensure our NFRs are being met in terms of API response time.
  • Load testing – with an expected load – ensure NFRs are being met, as above. Can be done on APIs in isolation, as well as simulating user flow.
  • Stress testing – find the breaking point in the system so that we know our limits.

I believe the single user tests and load tests should be run as each new build is created.

Our challenge now is that our toolset remains in question. Demoing the creation of tests to the teams is not possible when we do not know what we will be using.

As a stop-gap, I’m going to demo a tool called Artillery to the teams and explain how to carry out the above tests using it. These scripts have to be created manually, although they do not take much time to set up.

My thinking is that, although this is a temporary solution, any performance testing is surely better than having none at all.

Conclusion

The reason for writing this post is to get all my thoughts written down. I want to know exactly what I feel my challenges are so that I can deal with them one by one. I would also appreciate pointers and feedback from anyone reading this who has been through this journey already.

My questions are:

  • Which toolset should we use to create our performance tests?
  • How can we plug this into our Application Insights APM solution?
  • How do we best integrate this into our CI/build process?
  • Where should the tests be run from?
  • Which tests should be included in a Definition of Done?


Creating a Production Monitoring Dashboard

Ever since reading ‘The DevOps Handbook’ a few months ago (highly recommended for absolutely anyone involved in software development) I’ve developed a bit of an obsession with environment monitoring and making our environments visible to absolutely everyone in the company.

The benefits of this are well documented – but let me cover a couple of them.

As software companies move into more frequent production deployments the need for fast feedback from production increases. The use of Application Performance Management (APM) solutions has increased. These solutions allow us to monitor the performance of our production environments and allow us to quickly identify any slowdown which may be negatively impacting user experience. They can also be useful in identifying the root cause of any errors in the application.

In an ideal world the APM solution can highlight any issues (slow performance, exceptions, services not responding etc.) so that engineering can be proactive in fixing these issues. In my mind, this means problems can be fixed before the end user even has to report a bug.

In order to achieve this ideal scenario, our production environments need to be visible to everyone within the engineering department.

The Challenge

Soon after learning about environment monitoring, I came to the opinion that our production environment was not as visible as it should be. We had a tool, created by an Architect, which pulled down all logging information from our APM solution (Application Insights) for a specified time period and placed it into a spreadsheet.

This tool was really useful, and I was an avid user, but getting people to view the logs on a consistent basis proved to be a challenge.

My First Step to a Solution

I wasn’t convinced that this tool was being used widely enough and therefore decided to try and come up with the beginnings of a more permanent solution. We needed something which would be visible to everyone in the office.

We have multiple TV screens dotted around the office which usually display client bug counts. The idea came to me that we should use some of these screens to display a live representation of our application’s production performance and any errors which may be getting thrown. We could also easily integrate the dashboard into our SharePoint launch page so that it’s the first page everyone sees as they log in for the day.

My thinking was that if we displayed a count of slow API response times in a chart on a dashboard, and people walked past and saw this number was really high, then they would be inclined to dig deeper into the logs.

By making any problems visible at a glance on a TV screen or the SharePoint launch page, we, as engineering, would have instant visibility of our production environment. This would make us more proactive in dealing with any production issues.

Creating the Dashboards

I required support from colleagues when it came to creating the dashboards. I had an idea – but creating a decent looking graph was going to require assistance from architecture (who could help me write Application Insights Analytics queries and pointed me in the direction of the PowerBI tool) and Business Intelligence (BI – to help me present the data in a suitable way).

Application Insights Analytics is a powerful search tool which allows you to return specific logging information without having to manually click through the Azure Portal UI. You can search through the entirety of your logging output (from up to 90 days ago) with a relatively simple query. The queries are written in a SQL-like language (AIQL); an example is below:

requests
| where timestamp > ago(30d)
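// count distinct client IPs per hour, split by HTTP result code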
| summarize ClientCount = dcount(client_IP) by bin(timestamp, 1h), resultCode

I was lucky enough that I didn’t have to create my queries from scratch – I could use the queries already created for our current logging tool and edit them slightly. This turned out to be a lot easier than I first anticipated. I did need a bit of a push to jump into it at first, but once I understood what the existing queries were doing, manipulating them to meet my needs was relatively painless.

I would like to store the queries I use to pull back the production metrics in a central location. This would then allow anyone who wishes to investigate the data displayed on the dashboard to run the query on Application Insights Analytics and dig a bit deeper into the root cause of any problems.

Now that I had the relevant metrics, I began discussing my dashboards with a developer who has BI knowledge. It was decided it would be best to display the last seven days’ worth of data on each dashboard to give the metrics some context. I’d like to illustrate the reason for this with an example:

If there were 50 exceptions on one API after deployment on a Friday, but there were 1000 the day before the release, would you be concerned?

How about if those 50 exceptions appeared on the Friday but there were 0 the day before? That could possibly change how you interpreted the same figure as above.

I initially created three dashboards using Microsoft’s PowerBI tool – one displaying the last 7 days’ worth of performance metrics, another displaying exceptions on WebJobs and APIs over 7 days, and the third displaying a combination of both but over a 24-hour period.

PowerBI plugs in seamlessly to the Analytics queries described above and automatic data refresh times can be specified. Once the data refreshes, the dashboards automatically display the updated data.

The tool was actually pretty simple to get to grips with – the skill lies in being able to present the data in a sensible way. I really wanted to highlight spikes in performance issues and application errors at a glance.

[Image: an example visualisation from PowerBI]

I had my three dashboards reviewed by a BI developer and, after some discussion, we narrowed my suggestions down to displaying a combination of the first two dashboards on a single page. He believed there was an overlap in the data shown in the first two dashboards, and that the third was only really valuable if I could display a live stream of data. That idea has been set aside for now, but it will be the second dashboard I create.

So what did I decide to include on my dashboard? In summary, there are four graphs, all displaying data from our production environment over the past 7 days:

  • A line graph – with a constant line highlighting an NFR – measuring the daily throughput of one of our main WebJobs.
  • A horizontal stacked bar graph – displaying all exceptions from the multiple different applications in our software.
  • Another horizontal stacked bar graph – highlighting all API responses taking longer than three seconds.
  • A vertical stacked bar graph – displaying the 90th percentile of API response times, for values greater than one second (rough sketches of the queries behind these last two graphs follow below).
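
For anyone wondering what sits behind graphs like these, here’s a rough sketch of the kind of Analytics queries involved (not our exact queries – they assume the standard requests table, where duration is in milliseconds and name is the request name):

// API responses slower than three seconds, per API, per day
requests
| where timestamp > ago(7d)
| where duration > 3000
| summarize SlowResponses = count() by bin(timestamp, 1d), name

// 90th percentile response time per API, per day, keeping only values over one second
requests
| where timestamp > ago(7d)
| summarize P90 = percentile(duration, 90) by bin(timestamp, 1d), name
| where P90 > 1000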

I believe that this dashboard is a great proof of concept, using the tools we already had at our disposal. I can already think of multiple possibilities for future dashboards.  Of course, dashboards are only one way of monitoring production – smoke tests and sensible alerts are next on my list to tackle.

At the time of writing, the dashboard is being reviewed at upper-management level, and licensing and the logistics of displaying it on the screens/SharePoint are being discussed.

I’m happy with what I’ve achieved up to this point – I’ve proven that production monitoring doesn’t have to be overly complicated. If I can come up with a basic dashboard, then anyone can. Now I need to build on this so that we can have a constant stream of feedback from our production environment.


30 Days of Performance Testing: Days 25-30

Day 25 – Share three benefits of monitoring your application in production

I’m a huge fan of today’s challenge – production monitoring is something I’ve been working on and thinking about a lot over the past few weeks (even before the challenge began). I’m a massive believer in everyone having visibility of our production environment.

Here are my main three reasons for believing this is important:

Ability to fix issues before customers raise them – This one is my favourite. If we have sensible alerting set up for exceptions, it is possible to investigate and fix issues before our customers even report them (one of the benefits of working in Scotland and having a U.S. client base!).

Generates an inquisitiveness into behaviour – If we see that an API was particularly slow on a certain day at a certain time on a graph, it could trigger us to investigate what was happening on the environment at this time. This could give us a greater understanding of customer workflow – and potentially highlight a use case that may not have been load tested before.

Allows us to see effects our changes have had – After deploying our latest code changes to production it is important to monitor what effect these changes have had on performance. Of course, performance testing should be carried out as early as possible, but it is often possible that some scenarios may have been missed. There’s nothing more satisfying than pushing out a change and watching throughput quadruple!

Day 26 – Explore the differences between your test and production environments, could they impact performance tests?

Our NoSQL database (ArangoDB) is on one VM in our test environment, whereas on Production, it is in a cluster. This would impact our performance test results significantly as our production cluster is much more performant.

This has been raised with the leadership team and is in the process of being changed so we can start performance testing our ArangoDB cluster on our test environment. We have the 30 Days of Performance Testing challenge to thank for that!

Day 27 – How do you share your performance testing results with your team?

Just now, the sharing of my results has no structure: I run tests on an ad-hoc basis and let the team know if there is a problem.

What we’re currently in the process of doing is making our performance tests part of our CI/CD process. We would like the tests run in the cloud on a nightly basis. We will then store the reports from these test runs in a central location. If any of the tests fail, the relevant teams (e.g. API owners) will be notified via email and given a link to the reports.

The performance metrics I mention on Day 25 could also be used to display the results from our performance tests, or the environment’s status (CPU, memory usage) at the same time the tests are running. The plan is to make the queries/graphs ‘live’ for Production and then begin to roll the graphs back into our test environment.

Day 28 – Design, draw and share your ideal performance testing dashboard

[Image: my Paint sketch of the dashboard designs]

First of all, apologies for the ‘design’ from Paint above! I’m hoping to have the real thing to show you in a future blog post.

My idea is to have these three dashboards on TV screens around the office. I believe it would be useful to have two of the screens showing a breakdown of performance metrics and exception/error counts over the past 7 days. This will display multiple APIs/WebJobs per day and give a bit of context to the results.

For example, if a WebJob was throwing 567 exceptions on the Tuesday and then after deployment it was only throwing 55 on the Wednesday, you may react differently than if the WebJob was throwing 0 exceptions on the Tuesday but was now throwing 55 on the Wednesday.

I’ve had some input from our BI team on how to best display these charts and graphs – so hopefully they’ll be useful to the team when they go live.

The 24-hour graph is very much a work in progress – I know we can get hourly results from Azure metrics; I now need to figure out how to display them.
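
As an illustration only, the hourly data could come from an Analytics query along these lines (a sketch, assuming the standard requests table):

// request and failure counts per hour over the last 24 hours
requests
| where timestamp > ago(24h)
| summarize RequestCount = count(), FailedRequests = countif(success == false) by bin(timestamp, 1h)
| order by timestamp asc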

Day 29 – Explore how Service Virtualisation can assist with performance testing

Service virtualisation offers performance testers the opportunity to load test an end-to-end scenario containing many independent parts before one or more of those parts have been completed by development. Essentially, this means we can test before developers have completed their work, or when a specific service is ‘broken’ on our test environment.

We can emulate the behaviour of specific APIs and test using these mocked APIs rather than the actual deployed service. This reduces cost and ideally reduces the time taken to test.

On my Trello board I have a task to evaluate Mountebank for mocking APIs, but if anyone knows of any other tools then I’d be delighted to hear from you!

Day 30 – Share some potential challenges with performance testing in the mobile/IoT space

I’ve not had much experience in testing in the IoT space – but the main challenges I can identify are:

  • testing in realistic environments for the wide range of IoT devices.
  • testing the wide range of combinations – if a piece of software can interact with hundreds of different gadgets, how can we test all the scenarios?
  • there is (possibly) a lot more architecture to understand.
  • users’ performance expectations will be at an all-time high for many scenarios.
  • due to the many different devices/combinations, effective performance monitoring is crucial.

It’s a challenge that should be welcomed by all testers – an exciting opportunity to explore new technologies.

30 Days of Performance Testing: Days 19-24

Day 19 – Use a proxy tool to monitor the traffic from a web application

For today’s challenge I used Fiddler, a tool I’m sure many of us will be familiar with.

We currently use Fiddler to feed an API user flow into our load tests. We’ll begin by capturing the API calls for a specific flow (this can also be done from the developer tools in your favourite browser) and then export them to a Visual Studio Web Performance Test, as described here. We’ll then create load tests from these web tests.

I always recommend using Fiddler (or developer tools) when testing a web application – it allows you to capture failing API calls and investigate what requests are being made as you browse your application.

[Screenshot: Telerik Fiddler Web Debugger]

Using Fiddler to capture HTTP traffic whilst carrying out manual tests was my first step in becoming more back-end focused. It was the start of me questioning what was really going on behind the scenes in our application.

Day 20 – Explain the difference between causation and correlation

Causation – one variable’s behaviour is the direct result of an action of another variable.

For example, a large number of requests on a purchase API causes an increased number of HTTP server errors. As the number of requests on the API drops, the number of errors decreases. Our two variables, the number of requests and the number of errors, demonstrate causation.

Correlation – a correlation between two variables can help point towards causation, but it does not by itself mean that one variable is driving the behaviour of the other.

Using the example above, the number of requests on our login API is also increasing the number of HTTP errors there. So we’re seeing the number of HTTP errors increase on both our login API and our purchase API. These are correlated – but the number of errors on the login API does not cause the number of errors on the purchase endpoint to increase.
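
One way to dig into this sort of thing is to pull the failing request counts per API over time from Application Insights Analytics and compare the trends side by side – a minimal sketch, assuming the standard requests table:

// failed requests per API, per hour, over the last day
requests
| where timestamp > ago(1d)
| where success == false
| summarize FailedRequests = count() by bin(timestamp, 1h), name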

Day 21 – Share your favourite performance testing tool and why

I’m going to cheat a bit here and talk about a monitoring tool rather than a performance testing tool. There are a couple of reasons for this:

1. I haven’t been performance testing long enough to really enjoy a tool. I have experience with Fiddler and with Visual Studio Web Performance and Load tests, but I feel like I’ve already spoken about those enough.

2. I use this tool (console) on a daily basis to measure our environment’s health and, of course, performance stats.

I’m going to talk about the Azure Portal and the advantages it offers me as a tester. I’m sorry for those of you who are not Azure hosted, as this will be of limited use to you. Hopefully it will spark some ideas on how to use similar platforms such as AWS.

There are two ways I use the Azure Portal to measure performance:

Performance Metrics

I can use Azure to measure database usage. Microsoft’s measurement of database use is DTUs (discussed earlier in the 30 days challenge). If I’ve kicked off a load test on a database I’ll always check the DTU usage. If we’ve reached capacity, that means we may need to scale up our service plan (extra cost) or tweak our database design. If we’ve reached 50% usage for a load test of 500 users, then I can feed this information back to the stakeholders and continue to increase the user load until we reach capacity.

I can also track things such as HTTP server errors on APIs, average memory usage on WebJobs and APIs, and average API response times. You may have seen me asking on Twitter whether anyone has used these metrics for testing; the reason is that I’m working on pulling this data out and displaying it in a central location for all our stakeholders to see. Hopefully there will be a blog post to follow on this subject!

[Screenshot: SQL database service tier monitoring in the Azure Portal]

Application Insights/Analytics – Request Times

Application Insights gives me a whole host of information. It allows me to track WebJob events and API requests (which I can sort by performance bucket), as well as giving me some amazing options for sending availability (ping) requests to the APIs in our production environment, to make sure they’re alive, as shown below:

[Screenshot: availability web tests in Application Insights]

Without going into too much detail here (this is going to be my next blog post) I’m currently in the process of using Application Insights Analytics to display production monitoring stats on TV screens dotted around the office, particularly the ones in the break area next to the pool table. I think it’s important that everyone has visibility of our production environment’s health.

Day 22 – Try an online performance testing tool

I wasn’t sure what was meant by an online performance testing tool – I assumed it meant a performance testing tool that can be used through your browser, but I wasn’t aware of any. I’m interested to see what other people try for their own Day 22.

I decided to further explore Visual Studio Web Performance Tests. As I’ve mentioned before, as a company we use Microsoft products for day-to-day development: Visual Studio, TFS, and a web application that is Azure hosted. I’ve had an (albeit brief) look at tools such as JMeter and Gatling for running load tests, but I don’t see the advantage of moving to them when Visual Studio tests plug so nicely into our build/release process and our automated API integration tests are already written in C#. If anyone can suggest any advantages, I’m all ears.

I decided to focus my attention on figuring out how we run Visual Studio Web Tests in the cloud, rather than having them run from someone’s machine. We’re aiming to have performance tests run as part of our CI/CD process, so this step is critical.

There’s a pretty handy documentation guide here from Microsoft themselves. It’s actually relatively simple to do; the more difficult part will be working with our operations teams to get the server set up. There is also a way to run performance tests from the Azure Portal, as outlined here.

On a side-note, whilst I’ve been exploring Azure/Visual Studio in more depth during the 30 Days of Testing challenge, I’ve always found Microsoft’s documentation to be easy to navigate and very intuitive.

Day 23 – Calculate the basic statistics for your response time results

Had this particular challenge been on Day 1, I would’ve provided you (and more worryingly, stakeholders) with the following statistics:

Quickest response, slowest response and the mean response.

There is only one of those figures which I now believe to be even semi-valuable, and that is the mean response time – and even that can be misleading. The quickest response tells you very little: it could have been one of the early requests of a load test, with response times gradually increasing from then on. Likewise, the slowest response could have been a freak slowdown.

The statistic I tend to go for now when looking at API performance is percentiles, particularly the 90th percentile. This tells me the response time that 90% of requests come in under – only 10% of requests are slower – giving me a snapshot of the worse end of the experience we could be providing to customers.

Other metrics I like to track are server errors and the number of responses which take greater than three seconds to respond.

Rather than kicking off a load test just now, I’m going to take percentile results for all of our APIs from our QA environment over the past 3 days using Application Insights Analytics. Here are the results (converted into seconds):

[Screenshot: percentile results for our APIs from Application Insights Analytics]
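
For anyone wanting to pull similar numbers, a query along these lines would do it – a sketch, assuming the standard requests table, where duration is in milliseconds:

requests
| where timestamp > ago(3d)
| summarize P90ms = percentile(duration, 90) by name
// convert from milliseconds to seconds
| extend P90Seconds = P90ms / 1000.0
| project name, P90Seconds
| order by P90Seconds desc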

Day 24 – Do you know what caused the last huge spike in your application’s performance?

We recently discovered that login times were spiking massively on our QA environment. This was an accidental discovery and was caused by a login history table being kept for auditing purposes.

Every time a user logged into our environment, the login history table was loaded into memory and a new entry created. This caused the authorisation database to quickly run out of memory, which in turn caused login attempts to fail.

Our QA environment had been running for over a year and a half without this login history table being cleared. We’re fortunate that our application has only been in Production for a few months, so we were able to optimise our login process before this became an issue for our customers.

I believe that had this login functionality been created now, we would’ve caught the performance flaw due to our newfound awareness of performance testing and monitoring. But it’s always easy to say that with the benefit of hindsight.