Javier Serrano, co-inventor of White Rabbit and leader of the WR Project, puts his memories in writing
Disclaimer: any account of things that happened many years ago will necessarily be distorted and biased. I have therefore decided to write this document in the first person, so that I am the only one to blame for any inaccuracy, and also to open it up to corrections from people who may have better memory than myself. Luckily there is quite an extensive written record at my disposal, so the following is, I believe, quite an accurate account of the history of White Rabbit up to the creation of the White Rabbit Collaboration, i.e. the time span ranging from mid-2007 to the end of 2023. It is impossible to write this short history without giving names, and it would also be unfair to the many great contributors, but unfortunately it is also very likely that some important names will be forgotten, so I apologise in advance for these omissions and I will gladly fix them as appropriate. Finally, if this is to be read by anyone, it’d better be kept short, and that will necessarily limit the number of people I can do justice to. OK, here we go!
Setting the scene
My first written record for the accelerator timing renovation project at CERN, which would later evolve into the WR project, dates from the 20th of July 2007. It is the minutes of a meeting attended by Julian Lewis, Jean-Claude Bau, Ioan Kozsar and myself, where automatic delay compensation is already mentioned as a feature we would really like to have in the new system. Pablo Álvarez joined the discussion in August, and that’s also when we started considering Ethernet as a possible physical layer. At the time though, we were concentrating our discussions on 100Mb/s copper links. There were lively exchanges on things like the delay asymmetry in twisted pair cables and we were still looking at off-the-shelf ways of having good synchronisation using standard Ethernet-based fieldbuses such as Ethernet Powerlink.
Ben Todd and Bruno Puccio joined the effort in November. This is the first time IEEE 1588 (aka PTP, the Precision Time Protocol) is mentioned in our meeting notes. It is also the time when we took the decision to break away from any legacy technologies in the existing timing system and base the new one on industry standards. Our interest seems to have switched to GbE (still on copper) then too, but November 2007 was quite intense, because by the end of the month we were already discussing chromatic dispersion in fibres!
The fieldbus discussion (at the time we were also looking for a good Ethernet-based alternative to WorldFIP) put us in contact with the people in the Zurich University of Applied Sciences (ZHAW) in Winterthur. There we met Hans Weibel, who told us about Synchronous Ethernet. The notion of extracting a clock signal from a data stream, and then using that as your system clock, was very natural to us. This is what all accelerator timing systems out there were doing. We just did not know it was also done with Ethernet, and had a “name”, i.e. it was a standard. I don’t have a written trace or a precise recollection of when the idea to combine SyncE with PTP came about, but it must have been around December 2007, because we then invited Hans to give a talk about PTP and SyncE in the Timing Workshop held at CERN in February 2008.
The first meeting
The Indico event for the first meeting (renamed “first WR workshop” after the fact) shows the list of participants and contains a summary of the discussions. We gathered three types of actors in this meeting: those who needed a new timing system (CERN, GSI, IN2P3, ITER, Elettra), those who could tell us relevant things about available technologies and building blocks (InES Winterthur, Austrian Academy of Sciences) and companies we could team up with (Cosylab, Oregano Systems, Micro Research Finland). This was early 2008, and our vision for commercial open-source hardware was slowly emerging. We wanted to have the best of both worlds: avoiding vendor lock-in by having fully open-source (software, firmware, gateware, hardware) designs but still benefiting from the nice documentation, support and general quality one expects from mature commercial products.
For me, the highlight of the meeting was the presentation of SyncE+PTP by Hans Weibel. I should probably clarify here that the way of combining these two technologies Hans suggested is not what WR ultimately ended up implementing. He was following up on a concept Silvana Rodrigues (then at Zarlink Semiconductor) had presented at an IEEE 1588 conference at NIST in 2006, and published in a paper in 2007. The idea was to combine SyncE at the core of a network with PTP at the periphery of that network. WR, by contrast, merges PTP and SyncE concepts in the same link between a switch and a node or another switch. But this presentation by Hans put the idea in our heads that great things could be done by combining SyncE with PTP.
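For readers who have not met PTP before, the core of the protocol is a two-way exchange of timestamps. The sketch below is a minimal illustration in Python, not WR code: the `PtpTimestamps` class and the function names are hypothetical, timestamps are in nanoseconds, and the link is assumed to be perfectly symmetric, which is exactly the assumption that asymmetry in twisted pair cables or chromatic dispersion in fibres breaks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PtpTimestamps:
    """Four timestamps of one PTP two-way exchange (hypothetical container)."""
    t1: float  # master sends Sync, read on the master clock
    t2: float  # slave receives Sync, read on the slave clock
    t3: float  # slave sends Delay_Req, read on the slave clock
    t4: float  # master receives Delay_Req, read on the master clock


def one_way_delay(ts: PtpTimestamps) -> float:
    """Link delay, assuming both directions take the same time."""
    round_trip = (ts.t4 - ts.t1) - (ts.t3 - ts.t2)
    return round_trip / 2.0


def slave_offset(ts: PtpTimestamps) -> float:
    """How far the slave clock is ahead of the master clock."""
    return ((ts.t2 - ts.t1) - (ts.t4 - ts.t3)) / 2.0


if __name__ == "__main__":
    # Made-up numbers: 500 ns of fibre delay, slave clock 30 ns ahead.
    ts = PtpTimestamps(t1=1000, t2=1530, t3=2000, t4=2470)
    print(one_way_delay(ts), slave_offset(ts))
```

Any difference between the two directions of the link shows up directly as an error in the computed offset, which is why so much of the early discussion revolved around asymmetry.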
New kid in town
I have lost all my emails from 2008, but I believe the idea of using PTP on a link where both actors are syntonized through layer 1 emerged sometime in the spring of that year, during one of the brainstorming coffee discussions Pablo and I used to have in building 864 at CERN. With our colleagues from the other institutes and companies, it became clear that we needed to design “something” and we started looking for a name. I proposed to call our yet-unborn child “White Rabbit” in a forum discussion on the 20th of May. It was around that time that I decided to use the CERN Technical Student Programme to get somebody to work on prototypes and validate concepts. Little did I suspect that this decision would forever change the project. Rumour has it that I got the thick folder of candidates from the secretariat and ended up hesitating between two students. They each had strong points in their favour, and I could not make up my mind. Suddenly I noticed that one of them had the same birthday as me, and I interpreted it as a sign of some kind. Se non è vero è ben trovato! Tomasz Włostowski arrived in our team on the 1st of July, as a Technical Student from the Warsaw University of Technology. He was a force of nature already back then. I remember leaving for holidays in August, having bought a couple of Virtex 5 dev kits for him to play with in the lab. Upon my return he had a working demonstrator of a fibre link with sub-nanosecond delay compensation. He presented it at the second timing workshop, held in October, along with a complete proposal for the White Rabbit protocol.
Tom’s breadth and depth, along with his dazzling development speed and his “I can do this” attitude, completely changed my level of ambition in the project. This is a sample conversation I remember fondly:
Tom: We need to control every single piece of delay in the data paths, all through the network.
Me: But Tom, this is Ethernet. That means we’d need to design a full switch from scratch!
Tom: Yes, so?
Me: Ah, OK.
His MSc thesis, published in May 2011, is in itself a big milestone in WR, and remained a reference document in the community for many years. It describes in detail all the key concepts in WR, many of them introduced and/or developed by Tom. One example is his brilliant implementation of the DDMTD phase detection circuit invented (to the best of my knowledge) by Pablo, which has since then become a cornerstone of WR and other timing technologies. The theoretical part is followed by a practical one which is essentially the full design of the first working WR switch. Not bad for an MSc thesis!
A night in Madrid
The year 2009 was also very intense. I remember in particular a sprint over the summer with the GSI team at CERN. Mathias Kreider and César Prados spent many nights with Tom in our lab in building 864, working against the clock (no pun intended) on a first demo of the switch. We also spent a lot of time and energy towards the end of the year preparing a project proposal for Framework Programme 7 (FP7) of the European Commission. Our friends Patrick Loschmidt and Georg Gaderer of the Austrian Academy of Sciences came to CERN to fine-tune the proposal with Erik Van Der Bij, and I must say it was a great proposal. Unfortunately, for reasons which I never fully understood, it was not funded.
Luckily, the Spanish government was looking for ways to give Spanish industry a competitive edge in the science sector, and we got an “Industry for Science” grant in 2010, so two Spanish companies (Integrasys and Seven Solutions) got funds to work on the software and hardware respectively for version 3 of the WR switch. By that time, Alessandro Rubini had become a key actor for all-things-software in WR, and was orchestrating the embedded Linux effort in the WR switch.
The December 2010 meeting page shows the first participation of some key members of the WR family, like Maciej Lipiński, Greg Daniluk (then at Elproma) and our friends Peter Jansweijer and Henk Peek of Nikhef. More on them later! It was also the start of discussions on what a WR node should look like, introduced by Tibor Fleck. The GSI team concentrated on the node, also working on Etherbone, a way to perform Wishbone reads/writes over Ethernet so the nodes in your network could look just like a huge memory map. This was presented by Wesley Terpstra and collaborators in the 2011 meeting at GSI.
Jumping a bit ahead to close our Spanish government chapter, we organised a meeting in 2012 in Madrid to celebrate the end of the grant and showcase the results. It was a really nice event. For the first time, we had a commercial-off-the-shelf WR switch we could just buy. It featured the beautiful WR logo Julian’s daughter had drawn for us back in October 2008, still in use today. Walking towards dinner in the centre one night, some of us found ourselves in the middle of a demonstration to defend the Spanish public health system. The crowd was so dense that it was impossible to get through it. So we just joined the protest: ¡Sanidad! ¡Pública! ¡Sanidad! ¡Pública!
Superluminal neutrinos
April 2011 saw the first WR meeting outside CERN, at GSI. The highlight of that workshop was the demonstration of PTP working on a node. Independently of WR development, we at CERN had been maintaining our legacy timing system, which was being used, among other things, to send neutrinos to Gran Sasso (Italy) through the crust of the Earth, in a 731 km straight line.
Maciej, Alessandro, myself and Tom in the WR lab (2011).
The initial specification for the accuracy of the inter-lab synchronisation was around 1 μs. This was enough to statistically discriminate between the neutrinos coming from CERN and the ones coming from the Sun in the Gran Sasso National Laboratory. It soon became apparent that we could do much better, and with an accuracy of a few ns within reach, the OPERA collaboration decided to carry out a time-of-flight experiment. The result was very unexpected and made headlines around the world: neutrinos seemed to be moving faster than the speed of light in vacuum! This was in direct contradiction with special relativity, and because the claim was so radical, we started verifying our (CERN) side of the synchronisation chain very intensely. Finally the OPERA people found a mistake on their (Gran Sasso) side which could explain the anomaly. This was early 2012. Version 3 of the WR switch was not fully ready yet, but there was a lot of pressure to deploy WR in parallel with the OPERA timing system in Gran Sasso, to do a new independent measurement. The months leading up to the new time-of-flight measurements in the spring of 2012 were incredibly stressful for many of us.
All was not sweat and tears during this tough year though. I remember at one point CERN’s director of research, Sergio Bertolucci, visited our lab. We gave him a quick intro to WR, and Tom presented our usual “hot-air gun” demo. It featured a WR switch and a node connected through a fibre spool of a few km. Tom switched delay compensation off and directed the hot air at the spool. We could then see the Pulse-Per-Second signals drifting in the oscilloscope as expected. Then he turned feedback on and the PPS pulses snapped together, thanks to the “WR magic”. As he kept describing the contents of one of the slides, looking at the screen on the wall, he forgot to keep moving his hand to spread the heat evenly across the spool. After a few seconds we started smelling something funny and then saw smoke! Sergio had a good laugh and thanked us for the hard work. He himself had done timing for detectors many years before, and his support was super welcome during those hard days.
Finally all four experiments in Gran Sasso were equipped with WR and made measurements consistent with the speed of light in vacuum. This was the first operational deployment of WR!
Dutch connection
I first met Peter Jansweijer and Henk Peek at CERN in the summer of 2010. Erik Van Der Bij had read a couple of their early papers, where they presented issues which later became central in WR, including the so-called “bitslide” and asymmetry due to dispersion in fibres. We realised we were working mostly top-down while our Nikhef colleagues were pushing bottom-up, so our lines of work were very complementary.
Peter and Henk joined the WR community that year, and presented their plans for the KM3NeT neutrino telescope in the November 2012 workshop in Madrid. Jeroen Koelemeij, who had already joined us in March at the meeting organised by Dietrich Beck at GSI, talked about long-distance time and frequency transfer. In time, a cluster of WR competence formed in the Netherlands, including universities, institutes and companies.
Another illustrious Dutchman, Paul Boven, made important contributions for long-distance links exposed to large temperature variations, testing WR for the Square Kilometre Array (SKA) in South Africa. By 2016 Nikhef had become a key actor in the WR community and it seemed very fitting to organise the ninth WR workshop there. This event was also attended by Erik Dierikx of VSL, the Dutch National Metrology Institute. Erik then went on to do a very thorough analysis and test of WR links using existing telecom infrastructure over long distances, in collaboration with the European Space Agency.
In 2016, Henk realised absolute calibration would help widen the adoption of WR, and he teamed up with Peter to investigate methods to quantify delays very precisely in the nodes, both in the electrical and optical sides. This work is documented in their excellent article, which won the best paper award in ISPCS 2018. Peter has continued working on absolute calibration (and trying to standardise it in IEEE 1588 with Maciej) to this day, after Henk retired. Many of their contributions are now key ingredients of WR, and will be even more so in the future.
The intense activity around WR in the Netherlands continues today, with plans for applications in new fields such as quantum networks and PNT alternatives to GNSS.
The magic of Open Source
One of the cool things that may happen when you open-source your work is that somebody else takes it and does something with it you could not have imagined. We have been very fortunate in the WR project to see this happen many times. New developers have shown up in our meetings and have presented ways of using and enhancing WR which have blown us away.
Ralf Wischnewski and Martin Brückner took WR to Siberia as early as 2012. Guanghua Gong and collaborators started joining the meetings in 2014 and deployed what I believe is the largest WR network in the world so far for the LHAASO cosmic ray detection experiment. They explored the effects of temperature in nodes extensively and contributed all their improvements to the community, including new developments like a WR node self-contained in an FMC card. They even organised a tutorial WR workshop in China in 2018. Anders Wallin deployed the longest WR link so far (~1000 km) in 2016 and has also contributed several designs. In the same workshop at Nikhef, Cédric Viou and Daniel Charlet presented their application of WR for astronomy. The work of Paul-Éric Pottie, Namneet Kaur and collaborators at SYRTE on long-distance links is also remarkable. And more recently, the applications of Jean-Michel Friedt and Gwenhaël Goavec-Merou in Besançon in the domain of distributed signal acquisition and software-defined radio are just amazing!
By 2018, the WR community itself had become an object of study. Researchers investigating the open hardware phenomenon, from backgrounds as diverse as economics and anthropology, started studying this paradigm with the aim of explaining it and seeing ways in which it could maximise its positive impact on society. The presentation of Laia Pujol, Luis Felipe Murillo and Pietari Kauttu in the October 2018 meeting is an early precursor in a line of study which stays active to this day.
Fantastic Four
As I mentioned earlier, Maciej and Greg attended their first WR workshop in 2010. Adam Wujek joined the party in 2014. Together with Tom, this team of four Poles formed the backbone of WR development and support for several years. Greg’s MSc thesis, published in 2012, the year he joined CERN, is essentially the development of the WR PTP core, the other fundamental ingredient, along with the switch, in any WR network. Maciej’s PhD thesis, published in 2016, tells you how to get seamless redundancy (as in “cutting a fibre and not noticing any effect”) in a WR network. Its potential has not yet been fully exploited. These were also the years when our collaboration with Polish company Creotech intensified, with Greg Kasprowicz as our main initial entry point there. Creotech became a WR switch supplier along with Seven Solutions, where the most visible people were Eduardo Ros and Javier Díaz. Greg K. also wore a Warsaw University of Technology hat, and Eduardo and Javier were with the University of Granada in addition to their work with 7S.
Dimitris Lampridis joined the CERN team in 2016 and started working on distributing trigger pulses using WR. He is also a talented speaker and teacher, so he joined in the fun for our first WR tutorial workshop, which took place in Barcelona in 2017, piggy-backing on the ICALEPCS conference.
Greg, myself, Tom, Maciej, Adam and Dimitris in the WR tutorial workshop in Barcelona (2017)
Nanoseconds, picoseconds, femtoseconds
One line of development I have not described so far concerns the improvements in accuracy and precision in WR through the years. The goal of WR is not to be the most accurate and precise system for distributing time and frequency. It is rather to do our best using standard affordable technologies, and to comply with standards (more on this later). There is a very interesting paper, published in 2018 by Mattia Rizzi and collaborators, which describes the fundamental limitations that arise when you use FPGAs for precise synchronisation. Mattia himself was part of the CERN WR team and became the reference person for all things high-precision. He designed the low-jitter daughterboard for WR switch v3, which enhances the operation of the switch bringing it closer to the fundamental limits he described in his paper. The daughterboard design has been integrated in several switch designs, so today it is easy to buy an off-the-shelf switch with all its improvements.
Mattia and Tom also designed what I believe is the most precise WR node out there so far, the eRTM board for the low-level RF control in CERN’s Super Proton Synchrotron. Its integrated jitter between 10 Hz and 100 MHz is below 100 femtoseconds! This development demonstrated that you could get extremely good performance in high-frequency phase noise (or equivalently short-term Allan Deviation) provided you are willing to invest in a good Oven-Controlled Crystal Oscillator. Jeroen and Peter have also worked extensively on low-jitter designs in the Netherlands, with similar results.
At the same time, advances in our understanding of the different contributions to the accuracy of calibration have allowed us to go well beyond the initial sub-nanosecond specification for accuracy. Today, accuracies of a few tens of picoseconds are routinely achieved in many links.
WR meets IEEE 1588
When we started WR, we wanted to use standards as much as possible. This was in line with our “open-source and commercial” vision. When we noticed that no standard could achieve the accuracy and precision we were aiming at, we changed our goal to “use standards as much as possible and extend them where needed”. This is easier said than done. A quick perusal of the WR standardisation wiki page will give you an idea of the amount of work involved in taking the basic key ideas of WR and integrating them in the IEEE 1588 standard.
The WR community will never be able to thank Maciej enough for all the work and love he put into this standardisation process. The whole journey, from the moment we decided to standardise WR to the publication of IEEE 1588-2019, which includes the options and profile for High Accuracy (the name generalised WR ideas got in the standard), took more than 8 years. During that time, Maciej had weekly phone calls, drafted, revised and corrected text in numerous iterations, interacting with the who’s who of timing and networking companies and working relentlessly to take everybody’s constraints and wishes into account. I was there when he received a standing ovation in the CERN Globe on the occasion of the ISPCS conference we hosted in 2018, when the new release of the standard was undergoing the last cosmetic changes.
After the new standard was released, we saw an increase of uptake of WR (or IEEE 1588 HA as you prefer) in industry. This was a mixed blessing. On the one hand, it was great to see this open-source technology have so much impact. On the other hand, it became clear that we had a scalability problem in our ability to support all these users and developers with a solid foundation for all WR designs. Something had to be done.
Taking it to the next level
The post-standardisation years have also seen a lot of development activity. A special workshop in 2020 was dedicated to gathering requirements for a new version of the WR switch. This is a long overdue development. The current (v3) switch design is more than 10 years old now. The new switch will be based on a Xilinx Zynq Ultrascale SoC and will feature redundant field-serviceable fans and power supplies. It will also have an expansion slot for those who want to enhance it for e.g. holdover or very low-jitter applications. Both the WRS v4 workshop and the 2021 meeting happened online due to the pandemic. The latter saw in particular very nice progress on absolute calibration and low-noise WR by Anders, Peter and collaborators.
Our latest meeting to date was held in October 2022. Thanks to the heroic efforts of Tristan Gingold and collaborators, version 5 of the WR PTP Core was announced. This was a herculean piece of work. Contributed features had accumulated in more than 100 git branches through the years. Combining and testing it all was really challenging. The actual release happened only quite recently, in December 2023, with Eva Gousiou in charge of the WR team at CERN. This gives you an idea of the amount of work involved. For us it was another sign that “something had to be done”.
Enter the White Rabbit Collaboration. Maciej and I had been talking about it for years, but in 2019, as the standardisation effort came to an end, he decided to push harder and drafted a first proposal. We worked on it on and off until 2021, when after a final round of discussions with several labs and companies, we thought it was time to involve the Knowledge Transfer (KT) team at CERN. They (Myriam Ayass, Amanda Díez and later Ben Frisch, Inma Caño and Dane Tacchini) helped us fine-tune our ideas and drafted the final agreement. The idea is quite simple: have a common pot of money filled with the annual fees of members and use the money to pay people whose main purpose is to provide good support to members and to ensure that the fundamental building blocks of WR (the switch and the WR PTP core) are always in good health. As I write this, the WRC has just started operating, with Maciej, Amanda and Adam in the so-called Collaboration Bureau. The WRC is not only our answer to WR’s scalability issues. It is also an experiment in open-source knowledge and technology transfer, marrying our vision of a fully open-source technology with economic activity and impact. We hope it will succeed and become a template other labs and universities can use to explore open-source tech transfer for other technologies.
As I look back on these past 16 years, I am more convinced than ever that our original vision for open source, standards and collaboration with other institutes and companies was right. The journey so far has been incredible, and I feel privileged for all the amazing people we have met along the way. May this be the start of another 16 years of great collaborative development!