If you’re building a chat system that has to actually talk to someone else’s chat system (and keep doctors happy while doing it), you’ll know: writing a specification is only half the battle. The other half is making sure that everyone follows it, and that everyone follows it in the same way.
That’s where the XMPP Interop Testing Framework comes in.
So, What Do We Do Again?
In short: we make sure XMPP software behaves the way the standards say it should.
We’ve built an open-source test framework that runs a bunch of automated checks against real XMPP servers using a real XMPP client library, testing everything from the core RFCs (6120, 6121, 7622) to the popular protocol extensions for things like:
- message receipts (XEP-0184)
- group chat (XEP-0045)
- file upload (XEP-0363)
- end-to-end encryption
It’s all designed to run in CI, with containers, and produce nice, clear pass/fail results, along with machine-consumable reports and human-readable, actionable information. The kind you can wave around in a meeting and say “See? Interoperable!”
Why NTA 7532 Folks Should Care
NTA 7532 is about making sure healthcare professionals can message each other securely, even when they’re on different systems and members of different organizations. That means encryption, integrity, and actual interoperability between products from different vendors.
You could write those requirements into a 200-page document (and you probably will). But to prove it works, you need tests. Preferably ones that don’t take a week to run by hand, and that aren’t only run just prior to launch and never again.
That’s exactly what we provide.
Our framework already checks for the building blocks that NTA 7532 is likely to depend on: authentication, transport security, message delivery, receipts, and so on. And because the tests are open and automated, every vendor can run the same suite - no secret sauce or proprietary knowledge required.
From “We Think” to “We Know”
Here’s the value add:
- Validation - The framework tells you, with logs and evidence, whether a given implementation matches the spec or standard.
- Transparency - Everyone can see what’s tested and why and how. The same tests for everyone, with the same criteria.
- Continuous improvement - When specs change or new features appear, we add new tests. Easy.
It turns a written standard into a living, testable thing. If you want to know whether two systems will work together before putting them in front of clinicians, this is how you find out.
The Bigger Picture
The fun part is collaboration.
The XSF writes and maintains the XMPP specs. NEN and the folks behind NTA 7532 define the national healthcare chat profile. And we, the Interop Testing Framework team, provide the bit in the middle: the place where specs meet running code.
Together, we can prove that “open standard” isn’t just a phrase, but that it’s something you can test, verify, and rely upon.
What’s Next
We’d love to:
- run pilot tests with any NTA 7532-aligned vendors
- map specific NTA 7532 requirements to existing (or new) XMPP test cases
- publish anonymised results to show real-world interoperability
- feed our findings back to both the NTA 7532 working group and to the XSF
If that sounds like something you’d like to be part of: fantastic!
Come talk to us.
Get Involved
The framework’s open-source, so dive right in:
Whether you’re writing specs, building servers, or just trying to get two chat systems to agree on a message receipt, we’re here for you.
Let’s make interoperability not just a checkbox, but a test you can actually pass ✅
Splash image courtesy of Marcus Urbenz, Unsplash
Impossible Tests Can Fail Runs
Some tests can’t be executed if the server lacks required features. Previously, these “impossible” tests were skipped, which could make a run look fully successful when it wasn’t. Now you can configure the suite to treat impossible tests as failures, ensuring that a green run really means every configured test executed and passed.
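For example, in a GitHub Actions-based setup this boils down to one extra input on the test step, roughly like the sketch below (the input name and version tag here are illustrative; the documentation of your runner has the exact spelling):

- name: Run XMPP Interoperability Tests against CI server.
  uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'shakespeare.lit'
    adminAccountUsername: 'juliet'
    adminAccountPassword: 'O_Romeo_Romeo!'
    failOnImpossibleTests: 'true' # illustrative input name: turn 'impossible' tests into failures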
Flexible Account Provisioning
Our tests act like clients, so they need accounts to log in. You can now choose from three provisioning methods:
- Admin Account using XEP-0133 to create test accounts.
- Explicit Test Accounts you configure up front.
- In-Band Registration via XEP-0077.
Pick the approach that fits your setup best. There is documentation available that covers the finer details, and the sketch below gives a rough idea of what the configuration can look like!
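As a rough sketch for a GitHub Actions workflow (the admin account inputs are the ones you may already be using; the inputs for the explicit test accounts, and the version tag, are illustrative names, so check the documentation for the exact ones):

# Option 1: create test accounts through an admin account (XEP-0133).
- uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'shakespeare.lit'
    adminAccountUsername: 'juliet'
    adminAccountPassword: 'O_Romeo_Romeo!'

# Option 2: use explicit test accounts that you provisioned up front (input names illustrative).
- uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'shakespeare.lit'
    accountOneUsername: 'romeo'
    accountOnePassword: 'romeos-secret'
    accountTwoUsername: 'mercutio'
    accountTwoPassword: 'mercutios-secret'

# Option 3: provide no account configuration at all and use In-Band Registration
# (XEP-0077), provided the server under test allows account registration.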
Together, these features give you more reliable results and more flexibility in how you run tests!
Splash image courtesy of Mohamed Nohassi, Unsplash
Recent additions include Jenkins, Drone, Harness and Woodpecker.
This brings our total number of CI systems in which you can run XMPP interop tests up to a whopping ELEVEN, plus anywhere else you can run containers!
Whether you’re building in GitHub, GitLab, CircleCI, Jenkins, Forgejo, Woodpecker, Drone, Harness or Bamboo, we’ve got you covered. If you build locally, you can run the JAR, and if you build anywhere that has a Docker or OCI image container runtime, you’re sorted.
We’ve done our absolute best to preserve every option in every runner, but not all CI systems are created equal, and there might be some discrepancies. If there’s a feature you’re missing that you need, do let us know.
Similarly, if there’s a CI system that you’re using that you’d like us to support, or if we do support it but you’re struggling, come find us in our MUC, or open an issue on GitHub.
Test all the things!!!1!
Splash image courtesy of Clay Banks, Unsplash
Late last year, we reported that we had secured funding graciously provided by NLnet that allowed us to massively build out this project. This blog has been a bit quiet since then, but work has progressed. Significantly.
We have just released version 1.6.0 of all our test runners. With this release, we (again) more than doubled the total number of XMPP interop tests! By my count, our project now lets loose 933 tests on your XMPP server implementation!
The biggest chunk of work has gone into tests that verify parts of the basic XMPP protocol, notably for testing functionality that involves roster management (as specified in section 2 of RFC 6121) and for server rules for processing XML stanzas (section 8 of RFC 6121).
Additionally, a couple of new specifications are now being tested by our framework! Tests have been added for:
- XEP-0133: Service Administration
- XEP-0410: MUC Self-Ping (Schrödinger’s Chat)
- XEP-0421: Occupant identifiers for semi-anonymous MUCs
- XEP-0433: Extended Channel Search
This table gives a complete comparison of test coverage between versions 1.5.0 and 1.6.0.
| Specification | v1.5.0 | v1.6.0 | Difference |
|---|---|---|---|
| unknown | 13 | 13 | 0 |
| RFC 6120 | 1 | 1 | 0 |
| RFC 6121 | 11 | 402 | 391 |
| XEP-0030 | 19 | 19 | 0 |
| XEP-0045 | 252 | 252 | 0 |
| XEP-0048 | 1 | 1 | 0 |
| XEP-0050 | 4 | 4 | 0 |
| XEP-0054 | 10 | 10 | 0 |
| XEP-0060 | 24 | 24 | 0 |
| XEP-0080 | 2 | 2 | 0 |
| XEP-0085 | 1 | 1 | 0 |
| XEP-0092 | 1 | 1 | 0 |
| XEP-0096 | 2 | 2 | 0 |
| XEP-0107 | 2 | 2 | 0 |
| XEP-0115 | 12 | 12 | 0 |
| XEP-0118 | 2 | 2 | 0 |
| XEP-0133 | 0 | 44 | 44 |
| XEP-0198 | 10 | 10 | 0 |
| XEP-0199 | 2 | 2 | 0 |
| XEP-0215 | 6 | 6 | 0 |
| XEP-0232 | 1 | 1 | 0 |
| XEP-0313 | 2 | 2 | 0 |
| XEP-0347 | 3 | 3 | 0 |
| XEP-0352 | 6 | 6 | 0 |
| XEP-0363 | 12 | 12 | 0 |
| XEP-0374 | 2 | 2 | 0 |
| XEP-0384 | 4 | 4 | 0 |
| XEP-0410 | 0 | 3 | 3 |
| XEP-0421 | 0 | 67 | 67 |
| XEP-0433 | 0 | 19 | 19 |
| XEP-0486 | 4 | 4 | 0 |
To be clear: the work doesn’t end here. There is still significant improvement to be made (and we’ve not yet used up all of the grant either!) - we just wanted to give you all an update. In the works are additional test implementations, and a couple of new test runners. That should not only increase coverage, but also allow our tests to be executed on even more CI/CD platforms!
Please get in touch if you have any ideas for improvement, or other feedback. We’d love to hear from you!
Splash image courtesy of Imagine Buddy, Unsplash
This behavior is generally preferable when testing an XMPP server implementation. A benefit of exclusion-based configuration is that tests that are newly added to the test suite will automatically be picked up, without requiring a configuration change.
However, there are scenarios where it is desirable to execute only a specific set of tests, for example when:
- testing a server-side component that implements only one specification, or
- testing a development branch in which changes are applied to only one feature.
In those scenarios, having to disable all other tests is cumbersome.
We have now made available a mechanism with which specific tests can be included. When you include tests, only the included tests are executed. This configuration is very similar to that of the exclusion of tests. You can find more information in our documentation on Selecting Tests.
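As a rough idea of how that looks in a GitHub Actions workflow (the input names and version tag are illustrative; the Selecting Tests documentation has the authoritative ones):

- uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'shakespeare.lit'
    adminAccountUsername: 'juliet'
    adminAccountPassword: 'O_Romeo_Romeo!'
    # Illustrative: execute only the tests for one specification...
    enabledSpecifications: 'XEP-0045'
    # ...or, with exclusion-based configuration, skip a specific test class instead:
    # disabledTests: 'ExampleIntegrationTest'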
Please let us know if you like the new features. We’d love to hear from you!
Splash image courtesy of Heeren Darji, Unsplash
Much of the XMPP Interop Testing project was made possible as the work was funded through the NGI0 Core Fund. This is a fund established by NLnet with financial support from the European Commission’s Next Generation Internet programme.
It is quite remarkable how far the effects of funding reach: it allowed us to work out our plans to take various, pre-existing bits and bobs, and quickly and efficiently turn a small tool used for internal testing into a proper testing framework that any XMPP server implementation can use. That snowballed into bug fixes for server implementations, and improvements to specifications used by many. A relatively small fund thus improved the quality of open standard-based communication used in one shape or another by countless people, daily!
We are so happy and grateful to NLnet for boosting our project’s grant! With the additional work, we will add the following improvements:
- Have better test coverage by writing more tests;
- Improve feedback when tests fail or do not run at all;
- Add a new test account provisioning option;
- Improve test selection configuration;
- Automate recurring maintenance tasks;
- Add support for other build systems.
All of this will help us improve our framework, help our users improve their products, and allow new projects to more easily deploy our open and free solutions into their CI pipelines!
You can expect a lot of these improvements to become available to you, soon!
Splash image courtesy of Micheile Henderson, Unsplash
For our project, we opted early on to try something a little different. Rather than choose between the most popular engine and toolchain (Docker) and something more fully open-source, we opted to do both. We wanted Docker and OCI images, and we wanted Docker and open-source build tools, to prove it can be built and it can be run by whoever wants to, however they want to.
I don’t know how many folks have been down the rabbit-hole of “what’s the difference between the Docker image format and OCI?” - the differences between the image formats are really fairly minuscule. Right now, at least. Docker donated their tech of the day to form the initial OCI standards, and most tooling out there supports both Docker and OCI building and running. But that might not always be the case.
What did we do?
Firstly, for build instructions, we created a Dockerfile and a Containerfile. Sure, podman will consume a Dockerfile and produce an OCI image, but given the above mention of possible future divergence, it made sense. We also liked the idea that it gave the appropriate credit to the FOSS community - that whilst all Docker images are container images, not all container images are Docker images, and calling them Docker images devalues the gargantuan effort put into open tooling and standards.
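To give a feel for what that entails, here is a simplified sketch (not the literal file from the repository): a small multi-stage build that compiles the test suite on a JDK image and packages it on a JRE image behind an entrypoint script.

# Simplified sketch of the Containerfile; the real file in the repository differs in detail.
FROM docker.io/library/eclipse-temurin:17-jdk-noble AS build
WORKDIR /usr/src
COPY . .
# Assumes the project's Maven wrapper is present; the real build steps may differ.
RUN ./mvnw --batch-mode package

FROM docker.io/library/eclipse-temurin:17-jre-noble
COPY --from=build /usr/src/target/ /opt/sintse/
COPY entrypoint.sh /sbin/entrypoint.sh
ENTRYPOINT ["/sbin/entrypoint.sh"]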
Next, we tested that our work could actually be used to make a working container.
Here’s how it looks with OCI:
~/git/smack-sint-server-extensions> podman build . -t sintse-oci:test
[1/3] STEP 1/3: FROM docker.io/library/eclipse-temurin:17-jdk-noble AS pom
[1/3] STEP 2/3: WORKDIR /usr/src
<snip>
[3/3] STEP 6/6: ENTRYPOINT ["/sbin/entrypoint.sh"]
[3/3] COMMIT sintse-oci:test
--> 56d0a8a75584
Successfully tagged localhost/sintse-oci:test
56d0a8a75584c3c1cc41d535dac50c244a0acf04bb3cf4abca989363d46a1608
~/git/smack-sint-server-extensions> podman inspect sintse-oci:test
<snip>
"Annotations": {
"org.opencontainers.image.base.digest": "sha256:5579258e20135a9b54c16b9038008adcff50bd1d637824ed564d5b1cc881cba8",
"org.opencontainers.image.base.name": "docker.io/library/eclipse-temurin:17-jre-noble"
},
"ManifestType": "application/vnd.oci.image.manifest.v1+json",
<snip>
~/git/smack-sint-server-extensions> podman run --network host sintse-oci:test --adminAccountUsername admin --adminAccountPassword admin
Saving JUnit-compatible XML file with results to /logs/test-results.xml
Saving debug logs in /logs
<snip>
Test run (id: bjvrd) finished! 266 tests were successful (✔), 2 failed (💀), and 29 were impossible to run (✖).
<snip>
And this is Docker:
~/git/smack-sint-server-extensions> docker build . -t sintse-docker:test
[+] Building 96.5s (21/21) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 860B 0.0s
=> [internal] load metadata for docker.io/library/eclipse-temurin:17-jre-noble
<snip>
=> exporting to image 0.1s
=> => exporting layers 0.1s
=> => writing image sha256:922ebdbac4d3c45b00c54ac2c5687b73312bd8697941a25380e7870708a5b25f 0.0s
=> => naming to docker.io/library/sintse-docker:test
~/git/smack-sint-server-extensions> docker inspect sintse-docker:test
<snip>
"Comment": "buildkit.dockerfile.v0",
<snip>
~/git/smack-sint-server-extensions> docker run --network host sintse-docker:test --adminAccountUsername admin --adminAccountPassword admin
Saving JUnit-compatible XML file with results to /logs/test-results.xml
Saving debug logs in /logs
<snip>
Test run (id: 4z7ev) finished! 266 tests were successful (✔), 2 failed (💀), and 29 were impossible to run (✖).
<snip>
We added CI to run a podman and Docker build on every PR to the repo, just to be sure contributors would be warned if they did something to break it, or in case podman itself got an upgrade that caused a breakage. We don’t need to run the tests - we’ve got other checks that cover the validity and capability of the test runs - here we’re relying on the output of the podman build command to prove that we can build a container image.
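The PR check itself is tiny; in GitHub Actions terms it amounts to something like this (trimmed and simplified compared to the real workflow in the repository):

name: container-build-check # simplified sketch of the PR check
on:
  pull_request:
jobs:
  podman-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build an OCI image with podman
        run: podman build -f Containerfile -t sintse-oci:pr-check .
  docker-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build a Docker image
        run: docker build -f Dockerfile -t sintse-docker:pr-check .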
Next, we added CI to the main branch to build and push a “bleeding edge” image to a container registry. We chose GitHub because (a) it’s with our code (b) it has OCI support. The same CI also deals with publishing tagged images when the repo is tagged.
It’s exciting! With containerised servers and containerised tests, there’s far less onus on developers to manage dependencies or be dependent on operating systems or versions. Not only does this reduce the barrier to entry for folks wanting to adopt the tests, it’s also far easier for me!
Splash image courtesy of CHUTTERSNAP, Unsplash
New Test Runners
In my last blogpost, I wrote about the first Test Runner that we created: one that is usable as a GitHub Action.
We are continuing to improve that runner, but I’m happy to report that we now also have runners for:
(Please bear with us while we get the documentation in order).
What we’ve not yet crossed off the to-do list includes runners for:
… and maybe Jenkins? 😬 I’m still on the fence, and I keep forgetting to poke Dan about that one. Hey Dan! Whaddayathink?
Smack improvements
The testing framework that we’re using for most of the tests is provided by IgniteRealtime’s Smack Integration Test Framework. Through continuous usage of that framework, we ran into several bits and bobs that we wanted that framework to have, do, or do differently. With the help of Smack’s lead developer, Florian, a couple of nice additions have been made.
The rationale for many of these improvements is the same: We recognized that it is incredibly difficult to figure out why a particular test fails, especially since we do not expect people that run our tests to be familiar with Smack, the SINT framework, our test implementations, or even the programming language that is used to create them. We have put in effort to improve this.
We’ve set out to make this easier, documenting a comprehensive how-to guide in the process.
Annotations
SINT tests can now be annotated to reference specifications. This allows us to tell which test implementation is testing which specification. That in turn allows us to tell you not only which test fails, but also to which specification/functionality that test relates. We can now even quote the exact section of the specification for which a test is failing.
Debug Logging
Although debug logging (the act of logging XMPP stanzas related to any particular test) was already configurable in SINT, we’ve made that more flexible. This allowed us to write a custom logger that generates debug files per test.
Having XMPP traffic dumps per test makes it much easier to debug the reason for a test failure. When a test fails, you can now easily see the exact XMPP traffic that was exchanged with the server under test.
Additionally, we’ve whipped up a test result processor that generates XML files in the JUnit format of test result reporting. Our thought here is that many build systems have native support for these files, which may lead to quick wins with regard to test result presentation in these systems. The proof of the pudding is in the eating though, so let’s see how this pans out.
Assertion messages
We’ve put in quite some effort to improve the reporting of assertion failures. We have tried to make sure that an assertion failure will report a human-readable message that can be understood without looking at the source code of the test.
This, combined with the other improvements described above, should allow for an XMPP developer to deduce the reason for most test failures.
Improved test coverage!
We’ve also been busy writing new tests! So far, we’ve improved coverage for:
If you’re wondering why these: it’s not completely arbitrary. We’re prioritizing the specifications mentioned in the XMPP Compliance Suites, combined with the available APIs in the libraries that we use. Obviously, we’re planning to add more and more coverage over time.
Next steps
That about sums up what we’ve been busy with. Over the next few weeks, we’ll continue to build new runners and tests.
I’m also contemplating asking one or two server developers who are unfamiliar with this project to have a go and try to integrate one of our runners in their server project’s CI pipeline. I’m hoping that this results in feedback that allows us to improve the usability of the project. I’m not quite sure about the timing of this though. I’ll think it over…
Until next time!
Splash image courtesy of Isaac Smith, Unsplash
It is alive, ALIVE!
I’m so excited to have finished the first prototype of the first test runner that we are going to create!
As Dan & I are part of a team that maintains an XMPP server on GitHub, creating a GitHub Action that can be used to super easily run the integration tests in a GitHub pipeline/flow was the obvious first prototype to tackle.
Things worked out beautifully!
A wrapper for SINT
First, we’ve used a bit of prior art to create a new project that uses IgniteRealtime’s Smack Integration Test Framework, and adds its own test implementations. Smack’s tests are, after all, mostly client-oriented, while we’re mostly interested in having server-oriented tests. This new project was dubbed the smack-sint-server-extensions.
A wrapper for the wrapper
Next, the smack-sint-server-extensions artifact was easily embedded in a new GitHub Action: the xmpp-interop-tests-action.
It is expected that this action is used in a continuous integration flow that creates a new build of the XMPP server that is to be the subject of the tests.
Very generically, the xmpp-interop-tests-action is expected to be part of such a flow in this manner:
- Compile and build server software
- Start server
- Invoke xmpp-interop-tests-action
- Stop server
This could look something like the flow below:
- name: Download Server distribution artifact from build job.
  uses: actions/download-artifact@v4
  with:
    name: my-server-distribution
    path: .
- name: Start CI server from distribution
  id: startCIServer
  uses: ./.github/actions/startserver-action # Should result in a running server.
- name: Run XMPP Interoperability Tests against CI server.
  uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'shakespeare.lit'
    adminAccountUsername: 'juliet'
    adminAccountPassword: 'O_Romeo_Romeo!'
- name: Stop CI server
  if: ${{ always() && steps.startCIServer.conclusion == 'success' }}
  uses: ./.github/actions/stopserver-action
Of course, we’ve immediately modified the continuous integration flow of our own XMPP server to make use of xmpp-interop-tests-action. The proof of the pudding is in the tasting, after all! It worked!
Also, I’m happy to report that our server implementation passes all the tests that we’re running. 😅
There’s obviously still a lot of work to do, but, if you do feel adventurous and have a GitHub-based CI pipeline for an XMPP project… have a go!
Splash image courtesy of Benjamin Davies, Unsplash
When we came up with our idea for XMPP interop tests for server implementations, we knew we weren’t the first folk to ever have the idea. What was important for us wasn’t just getting the tests right, but getting the tests used. With so many programming languages in play, we can’t expect everyone to adopt an additional language into their toolchain. That’s a clear no-go.
Instead, we’re choosing to wrap our tests in the packaging of the common CI systems (a GitHub Action, a CircleCI Orb, etc.) so that whatever your regular toolchain, you can slot the tests into the test phase of your build pipeline.
Previously you might’ve had something like:
------------    ---------    -------------    --------------------    --------------------
| Checkout | => | Build | => | Unit Test | => | Integration Test | => | Publish Artifact |
------------    ---------    -------------    --------------------    --------------------
You’d just slot this into the pipeline, like this:
------------    ---------    -------------    --------------------    --------------------    --------------------
| Checkout | => | Build | => | Unit Test | => | Integration Test | => | Compliance Tests | => | Publish Artifact |
------------    ---------    -------------    --------------------    --------------------    --------------------
Since the compliance tests are arms-length tests, akin to E2E tests, one individual test will likely exercise more code and run slower than the other kinds of test, so you’d naturally want to run the faster tests first.
Further, since it’s an actual XMPP test, the implementing server author will also need to have a mechanism for running the XMPP server in the pipeline.
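In GitHub Actions terms, for example, that can be as small as a step that starts a containerised server before the test step and tears it down afterwards; something along these lines (the server image and the readiness wait are placeholders):

- name: Start the server under test
  run: |
    # Placeholder image; substitute the server you are actually building.
    docker run -d --name server-under-test -p 5222:5222 example/my-xmpp-server:latest
    # Placeholder readiness check; wait until the server accepts client connections.
    sleep 30
- name: Run the compliance tests
  uses: XMPP-Interop-Testing/xmpp-interop-tests-action@v1.0
  with:
    domain: 'example.org'
    adminAccountUsername: 'admin'
    adminAccountPassword: 'admin'
- name: Stop the server under test
  if: always()
  run: docker rm -f server-under-test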
Splash image courtesy of Javier Allegue Barros, Unsplash