Latest Tweets

Inconsistent Thunderbird UI StatusBar’s state whenever an IMAP Account is being used

The issue

Reading emails using Thunderbird or Icedove through an IMAP4 connection intensively and at random,  presents a weird behaviour. This issue can be reproduced on any flavour of operating system: Windows, Linux and MacOs, starting from versions 2.X with or without the IDLE command support. So, the client software has a defect, which puts the UI in an inconsistent state despite the fact the software is working fine behind the scenes.

As shown in the next screen-shot, the user experiences a hypothetically downloading time-out progress-bar message whenever reading an email, no matter if this very email message has been read before and it is cached on the hard disk. This happens after reading quickly and at random different email messages for a while. This excessive delay will never come to an end, so from the user’s point of view, it looks like a complete hang-up.

A downloading time-out progress bar message which never comes to an end

Whenever this comes to happen, any kind of action taken upon the current selected email message ends up in an unknown mailbox folder state. For example, in deleting the email message, this one could be not deleted at all. Sometimes it could be even worse than that, provoking the opened imap mailbox to be corrupted. Sometimes, nothing wrong happens at all.

You can watch a video showing this behaviour: hangup

The IMAP4 IDLE command

In order to be sure there were no exchanged data between the client (in our case, the Thunderbird/Icedove program) and the IMAP4 server, I read all about the IMAP4 IDLE command.

According to the RFC 2177, the IDLE command is issued by the client in order to allow the server to send it mailboxes events, like new messages recently arrived or expunged. This way, the client does not need to pull the server every time as it is defined in its own configuration, like in the old days. As an example, let’s see what happens while tracing the entire conversation down between a Thunerbird Client and an imap server:

-1672481024[7f3c9edec240]: 9c6f6800:imap-server:S-InboxArchives/Rebuts_dec2k10:SendData: 126 IDLE
-1672481024[7f3c9edec240]: 9c6f6800:imap-server:S-InboxArchives/Rebuts_dec2k10:CreateNewLineFromSocket: + Waiting for DONE
-1695549696[7f3c9edec8a0]: 9b19a000:imap-server:S-INBOX:SendData: DONE
-1695549696[7f3c9edec8a0]: 9b19a000:imap-server:S-INBOX:CreateNewLineFromSocket: * 18 EXISTS
-1695549696[7f3c9edec8a0]: 9b19a000:imap-server:S-INBOX:CreateNewLineFromSocket: * 1 RECENT
-1695549696[7f3c9edec8a0]: 9b19a000:imap-server:S-INBOX:CreateNewLineFromSocket: 43 OK IDLE completed

The first two lines of the previous log snippet show us how it works; the client sends an IDLE command to the server, and then just waits for responses from the server to arrive. In our case, during the IDLE command issued, our server tell us there are 1 new message (RECENT), the server issues a “DONE” message in order to let the server know it wants to close the  established IDLE command channel, and the server accepts that request sending back its “OK IDLE completed” message. Whenever this functionality is enabled in the imap client, that is, Thunderbird, there is no need to pull the server as configured in the check-box: “Check For New Messages Every…. minutes”.

In order to capture all the network traffic between the imap client and the server, I used TBTracer, an excellent Thunderbird add-on.

Automating the use of the Icedove client program.

During the reading of some random email messages, and under relatively heavy stress, the imap client seems to be hanged up, despite the fact that when a new email arrives, the IDLE command seems to work perfectly well  and we can see it right in our mailbox. However, the progress bar continues showing some sort of a task being done, and completely uncompleted. This is a never-ending task, and according to the Activity Manager, there is no current task running at all. So, here’s the first approach: there is, most certainly, an incoherence concerning what is going on for real in the program, and what is being reported through its UI. One can even think that there has to be some sort of a forsaken thread updating the ProgressBar in eternum.

To find that out, it is mandatory to automate some basic stress-tests upon the Icedove email client. The best way to do that, due to the fact it is an UI program, is to use the xmacro utility.

So, I captured some cursor movements, reading some emails more or less quickly and at random. Then, I installed the TBTracer add-on in order to capture all imap messages exchanged between the client and the server, and finally I wrote a bash shell script in order to automate and process all this gathered information, finding patterns in order to determine when it was most feasible to expect that inconsistent UI progress bar indication to occur.

Below, what this bash script execution looked like:

(…)

Playing iteration 23 …Sleeping 0.26530 s …    [X]
Playing iteration 24 …Sleeping 0.15587 s …    [OK]
Playing iteration 25 …Sleeping 0.28253 s …    [OK]
Playing iteration 26 …Sleeping 0.13092 s …    [OK]
Playing iteration 27 …Sleeping 0.16512 s …    [OK]
Playing iteration 28 …Sleeping 0.18861 s …    [OK]
Playing iteration 29 …Sleeping 0.10955 s …    [OK]
Playing iteration 30 …Sleeping 0.12302 s …    [OK]
Playing iteration 31 …Sleeping 0.3657 s …    [OK]
Playing iteration 32 …Sleeping 0.29294 s …    [OK]

(…)

There are 1 hangups to take care of:

23

Well, this script iterated using the xmacro command over some email messages quite quickly, searching for the use of the IDLE command and trying to figure out when it was most probably to happen a hypothetical hangup state.

Having a look at the 23rd iteration’s log, we found this:

1654650624[7f125b476cf0]: 5b8bd800:dfa.upc.es:S-INBOX:SendData: 21 IDLE
1654650624[7f125b476cf0]: 5b8bd800:dfa.upc.es:S-INBOX:CreateNewLineFromSocket: + Waiting for DONE

Running the tests by hand, all the time this odd behaviour happened, we got this last log entries thanks to the TBTracer add-on. Thus, the idea was pretty obvious: the IDLE command had something to do with that error.

… but it turned out I was wrong 🙂

After disabling the IDLE command, I got the same weird behaviour. So, I decided to do it the old way: by using my old friend gdb. First, I got the icedove sources and then I compiled it enabling its debuging symbols:

# apt-get build-dep icedove

# apt-get source icedove

# cd icedove-3.0.11/

# CFLAGS=”-g” dpkg-buildpackage -rfakeroot -b -nc -j2

Then, I ran the compiled icedove email client and attached a gdb debugging session to it:

gdb icedove-3.011/mozilla/dist/bin/icedove-bin <PID>

Wandering around its sources through cscope, I came to see a call to a certain ProgressStatus() method:

6480 NS_IMETHODIMP
446 nsImapMailFolder::ProgressStatus(nsIImapProtocol* aProtocol,
6447                                  PRUint32 aMsgId, const PRUnichar * extraInfo)
(...)

It looked like the best place to put my first (and maybe last if I got lucky), breakpoint. So I did.

Whenever I got that odd behaviour, despite the fact I got a lot of hits on that very breakpoint, the status-bar of the icedove UI was just  showing the same: “Downloading Message”, or “Loading Message”, or even no message at all,  with the progress-bar besides it going on and on and on…eternally. I decided to apply my first idea of having an inconsitent UI and try to see what was going on for real behind the scenes. Then, I decided to have a look at the ProgressStatus() code, during the call to DisplayStatusMsg() parameters, where in theory, and according to the sources, I would have the current message to show up right in the status bar:

p progressMsg
2 $9 = {
3   <nsAString_internal> = {
4    mData = 0x7f9f7dbb6040,
5     mLength = 26,
6     mFlags = 9
7   }, <No data fields>}

Okay, it was a C++ data structure. So far, so good. According to its lenght (mLength field), I could get the entire String this way:

(gdb) x/26s (PRUnichar *)0x7f9f7dbb6040

What came out was very useful:

15 0x7f9f7dbb6040:  “S”
16 0x7f9f7dbb6042:  “e”
17 0x7f9f7dbb6044:  “n”
18 0x7f9f7dbb6046:  “d”
19 0x7f9f7dbb6048:  “i”
20 0x7f9f7dbb604a:  “n”
21 0x7f9f7dbb604c:  “g”
22 0x7f9f7dbb604e:  ” ”
23 0x7f9f7dbb6050:  “l”
24 0x7f9f7dbb6052:  “o”
25 0x7f9f7dbb6054:  “g”
26 0x7f9f7dbb6056:  “i”
27 0x7f9f7dbb6058:  “n”
28 0x7f9f7dbb605a:  ” ”
29 0x7f9f7dbb605c:  “i”
30 0x7f9f7dbb605e:  “n”
31 0x7f9f7dbb6060:  “f”
32 0x7f9f7dbb6062:  “o”
33 0x7f9f7dbb6064:  “r”
34 0x7f9f7dbb6066:  “m”
35 0x7f9f7dbb6068:  “a”
36 0x7f9f7dbb606a:  “t”
37 0x7f9f7dbb606c:  “i”
38 0x7f9f7dbb606e:  “o”
39 0x7f9f7dbb6070:  “n”
40 0x7f9f7dbb6072:  “& “

Wow! I was right, after all. My icedove email program was doing something, according to my gdb debugging session. What was that? Well, just sending a login request to the IMAP server. Despite the fact the IMAP command was, more or less, an AUTH request, I still had that “Downloading” message reported-task on the status bar, and that bloody Progress Bar working for years on end. After a while, it hit this very breakpoint again. This time I got this:

53 0x7f9f774d4aa0:  “T”
54 0x7f9f774d4aa2:  “h”
55 0x7f9f774d4aa4:  “e”
56 0x7f9f774d4aa6:  “r”
57 0x7f9f774d4aa8:  “e”
58 0x7f9f774d4aaa:  ” ”
59 0x7f9f774d4aac:  “a”
60 0x7f9f774d4aae:  “r”
61 0x7f9f774d4ab0:  “e”
62 0x7f9f774d4ab2:  ” ”
63 0x7f9f774d4ab4:  “n”
64 0x7f9f774d4ab6:  “o”
65 0x7f9f774d4ab8:  ” ”
66 0x7f9f774d4aba:  “n”
67 0x7f9f774d4abc:  “e”
68 0x7f9f774d4abe:  “w”
69 0x7f9f774d4ac0:  ” ”
70 0x7f9f774d4ac2:  “m”
71 0x7f9f774d4ac4:  “e”
72 0x7f9f774d4ac6:  “s”
73 0x7f9f774d4ac8:  “s”
74 0x7f9f774d4aca:  “a”
75 0x7f9f774d4acc:  “g”
76 0x7f9f774d4ace:  “e”
77 0x7f9f774d4ad0:  “s”
78 0x7f9f774d4ad2:  ” ”
79 0x7f9f774d4ad4:  “o”
80 0x7f9f774d4ad6:  “n”
81 0x7f9f774d4ad8:  ” ”
82 0x7f9f774d4ada:  “t”
83 0x7f9f774d4adc:  “h”
84 0x7f9f774d4ade:  “e”
85 0x7f9f774d4ae0:  ” ”
86 0x7f9f774d4ae2:  “s”
87 0x7f9f774d4ae4:  “e”
88 0x7f9f774d4ae6:  “r”
89 0x7f9f774d4ae8:  “v”
90 0x7f9f774d4aea:  “e”
91 0x7f9f774d4aec:  “r”
92 0x7f9f774d4aee:  “.”

No doubt; icedove was doing something, and by something I mean: SOMETHING for real. I had to find out where the status bar was updated, because at the time I was pretty sure the problem was directly related to it. In other words, at that time I did state where the real problem was located: the UI was, to say the least, desynchronized.

ImapMailFolder::PercentProgress was buggy and had been infected

I studied the source code, and came to find this method. It was called from the previous back-traced method in gdb. It was in charge of updating the status bar (among other things) of the current running imap thread. I put a breakpoint there, and as long as I got that odd behaviour, I could see these interesting values for two of its parameters:

101 Breakpoint 1, nsImapMailFolder::PercentProgress (this=<value optimized out>,
102     aProtocol=<value optimized out>, aMessage=0x0, aCurrentProgress=98, aMaxProgress=95)

The variable aCurrentProgress had a greater value than aMaxProgress. So, and according to this method, there was no way it could come to an end. Not at all. So, I decided to write a trivial patch on-the-fly using gdb, thanks to the “command” applied to this previous settled breakpoint:

105 Num     Type           Disp Enb Address            What
106 1       breakpoint     keep y   0x00007f4fcdec0256 in nsImapMailFolder::PercentProgress(nsI    ImapProtocol*, unsigned short const*, long, long) at nsImapMailFolder.cpp:6499
107     breakpoint already hit 31 times
108         if aCurrentProgress > aMaxProgress
109           set variable aCurrentProgress = aMaxProgress
110         else
111           c
112         end

The idea was pretty obvious: in case I got CurrentProgress > aMaxProgress , gdb would correct it automatically. Then, I ran the program again, not experiencing these awful behaviour any more.

Chasing the code in search of the bug.

Despite the fact I could fix it, there was still some odd UI behaviour: the Status Bar, simply, froze. I mean, because now the thread could indeed finish, the area where the Status message and Progress bar were was not updated at all. Reading the sources, the method in charge of calling PercentProgress() in order to update the thread’s progress status was PercentProgressUpdateEvent(), implemented in the imap/src/nsImapProtocol.cpp C++ source code file, line 5030. I put another breakpoint right there and re-ran the program.

The infected value for aCurrentProgress was already there. So, we had to update the Status Bar only when aCurrentProgress < aMaxProgress, or as far as this method was concerned, whenever currentProgress < maxProgress. Thus, I modified this method slightly, as shown below:

5060   if (m_imapMailFolderSink && (currentProgress <= maxProgress))
5061     m_imapMailFolderSink->PercentProgress(this, message, currentProgress, maxProgress);

Now, PercentProgress would be called all the time but when the process was really done. I added some sanity code inside this method, as well (file imap/src/nsImapMailFolder.cpp )

6500             if (!request) return NS_ERROR_FAILURE;
6502             if(aCurrentProgress > aMaxProgress ) aCurrentProgress = aMaxProgress;

After recompiling the debian package, I installed the new icedove and its debugging symbols, and used the gdb again in order to trace it down. This time, it worked fine. I did just the same on another computer – this time a laptop -, and it worked too. Now, I am going to compile Icedove for Windows. Stay tuned.