A few months ago at the Liberty QA code sprint in Ft. Collins, CO we started work on the OpenStack-Health dashboard. I also recently announced the dashboard on the openstack-dev ML to bring it to the attention of the broader community and to get more users and feedback on how to improve things. I figured it’d be good to write up a more detailed post on the basics of how the dashboard is constructed, its current capabilities and limitations, and where we’d like to see it move in the future, especially given that the number of contributors to the project is still quite small. For the project to really grow and be useful for everyone in the community, we need more people helping out on it.
Continue reading Exploring the OpenStack-Health dashboard
A little over a year ago I started writing subunit2sql, which is a project to collect test results into a SQL DB. The theory behind the project is that you can extract a great deal of information about what’s under test from the results of tests when you look at the trend data from them over a period of time. Over the past year the project has grown and matured quite a bit, so I figured this was a good point to share what you can do with the project today and where I’d like to head with it over the next year.
Continue reading Using subunit2sql with the gate
Last week we had a 3-day code sprint for the QA program in NYC: https://wiki.openstack.org/wiki/QA/CodeSprintKiloNYC. HP hosted the event at their office in Chelsea. Overall it was a very productive week where we accomplished a great deal. The goal of the sprint was to make a good push on a list of priority items which had seemed to be a bit stagnant and to get some code landed to move them forward. It was also a valuable opportunity for a bunch of us to get together and get to know the people we work with on a daily basis a bit better, something which is often hard to do over IRC, gerrit, or the ML.
Continue reading OpenStack QA Code Sprint in NYC
A few months ago I wrote a post about debugging a gate failure. It has been linked around and copied to quite a few places and seems to be a very popular post (definitely the most popular so far on this blog). Since the bug I opened from that was closed as invalid a while ago, I figured I should write an update about the conclusion of the triage efforts for the OOM failures on neutron jobs. It turns out that my suppositions in the earlier post were only partially correct. The cause of the failures was running out of memory, but what was leading to the OOM failures wasn’t limited to neutron. The neutron jobs simply ran more services, which used more memory, which made failures there more common.
Continue reading Gate Bug Triage Conclusion
Recently I was helping someone debug a gate failure they had hit on one of their patches. After going through the logs with them and finding the cause of the failures, I was asked to go through how I debug gate failures, to help people understand how to get from a failure to a fix.
I figured I would go through what I did and why on that particular failure and use it as an example to explain my initial debug process. I do want to preface this by saying that I wasted a great deal of time debugging this failure because I missed certain details at first, but I’m leaving all of those steps in here on purpose.
The log url for this failure is:
Continue reading Triaging and classifying a gate failure
Based on some of the comments that were posted on the recent “Which Program for Rally” ML thread, I feel that there’s been some continued confusion around exactly how all the projects work together in the QA program. So after discussing it with a wise council of my elders, I decided to start a blog so that I had a place to post more details, give a high-level overview, and clarify how everything works. I’m not really sure how much I’ll be using this blog in the future, as having one is something I’ve resisted for quite some time. But I felt that making this post warranted me giving in to peer pressure.
Today’s QA program:
So today in the QA program we have 3 projects. Here is a high-level overview:
- Tempest: The OpenStack integrated test suite. It’s concerned just with having tests and running them.
- devstack: A documented shell script to build complete OpenStack development environments.
- grenade: Upgrade testing using 2 versions of devstack to test an offline upgrade. Tempest can optionally be used after each version is deployed to verify the cloud.
Each of these projects is independent and is useful by itself. Each has a defined scope (which admittedly gets blurred and constantly evolves), and when used together along with external tools they can be combined into different pipelines for certain goals.
Continue reading QA Program: From Juno into the Future