Alex Elliott

The internet home of a prospective software engineer

This is my personal blog where I discuss projects that I'm currently working on, work I've recently completed, or write about any topic which has caught my interest in the world of Computing from my studies or from my personal research.

Latest Articles

Expression Editor Update (2)

January 24th, 2010

Since I’ve had some more time to work on Expression Editor recently, I thought it was about time I wrote another update on the project’s progress, along with some related news that affects it.

A recent screenshot of Expression Editor on Mac OS X

From Last Time…

In the previous post I noted a few areas in progress and some that I wanted to look at in the future.  To catch up on those: drag and drop is generally a bit more reliable and produces slightly neater results but is otherwise unchanged so far, and the new testing widget is still waiting.  A significant change has been made in the area of supported regular expression formats, however.

The application now has backends for the Qt4, PCRE and POSIX ERE formats (though the visualisation could still mess up some PCRE/POSIX elements, so let me know if anything breaks).  You can select the format you wish to work in from the menu bar, and the active format is displayed in the bottom right of the screen so you know which mode the editor is currently in.  The save format has also been slightly extended to store your preferred format for each expression.

The default mode has also been changed to PCRE, since it is probably the most powerful backend available.  Another minor UI change is an expression status indicator to the right of the text input: a green tick while the expression is valid, and a red exclamation mark when it is invalid.  In addition, if you mouse over the invalid indicator, the tooltip shows the error returned by the active regular expression backend.
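
As a rough illustration of that valid/invalid check, here is a minimal PHP sketch (not the editor's actual Qt code) that compiles a pattern with PHP's PCRE functions and keeps the backend's error text so it could be shown in a tooltip.  The function name checkPattern() and the shape of the returned array are assumptions made for this example.

```php
<?php
// Hypothetical helper: reports whether a PCRE pattern compiles, plus the backend's error text.
function checkPattern(string $pattern): array
{
    error_clear_last();
    // preg_match() returns false (and raises a warning) when the pattern fails to compile.
    $valid = @preg_match($pattern, '') !== false;
    $error = error_get_last();

    return [
        'valid'   => $valid,
        'message' => $valid ? '' : ($error['message'] ?? 'Unknown error'),
    ];
}

var_dump(checkPattern('/a+b/'));
var_dump(checkPattern('/(unclosed/'));
```

The first call reports a valid pattern; the second reports an invalid one, with the message taken from whatever warning PCRE raised.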

In Related News

As you probably saw above, the screenshot used is from Mac OS X.  In order to improve my capacity to test Expression Editor I’ve gotten myself a Mac Mini to sit alongside my Slackware Linux laptop.  Set up with Synergy+, this means I can develop the application on Linux and test it on OS X at the same time.  One behavioural difference between the two operating systems has already been resolved, so hopefully the application should start behaving much more reliably on OS X as well as Linux from now on.

Expression Editor Update

December 24th, 2009

A fair bit of progress has been made since my last blog entry so I thought I’d note a few things that have landed in the repository and a few things that I intend to add at a later date.

Drag and Drop

Initial support for drag and drop editing has been added.  You can now re-order the elements of the expression by dragging an element in the visualisation to one of the valid drop zones (which are automatically highlighted as you can see in this screenshot).  With this in place it becomes significantly easier to add the other bits of drag/drop editing I want the editor to support.  Eventually as well as reordering (plus the double-click edit dialogs which are also currently included for several elements) I aim to include:

  • Drag/drop adding of new elements from the toolbar to the left of the visualisation.  This should probably spawn a dialog/wizard and then insert the resulting regular expression element into the current expression.
  • Reordering needs more support in the alternatives item.  Currently there are only valid drop zones to place items inside existing alternation branches, and there should be a drop zone allowing the user to drop an element in as a new alternative.
  • Possibly a simple “trash” element, which simply accepts the drop, and results in the item being deleted from the scene.

Regexp Formats

As stated in a few places in the application, before the initial release I hope to support PCRE, POSIX Extended and Qt format regular expressions.  This means supporting a range of different regexp syntaxes, and intelligently warning when switching between formats if some of the expression cannot be used directly in the new format.  If such a problem exists, the editor should also offer to try to translate the expression.

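For example, if we’re currently in PCRE mode and we have an expression containing “\w” and we switch to POSIX Extended, this should trigger a warning and then offer to translate, turning “\w” into “[[:alnum:]_]” (POSIX has no standard word-character class, so the underscore has to be added to the alphanumeric class by hand).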
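
A minimal PHP sketch of that warn-and-translate step might look like the following.  It is only an illustration under stated assumptions: the function name, the three-entry mapping table and the returned array are all made up for this example, and it does not handle shorthand classes that appear inside a bracket expression such as “[\w.]”.

```php
<?php
// Hypothetical translator: rewrites a few PCRE shorthand classes as POSIX bracket expressions.
function pcreToPosixSketch(string $pattern): array
{
    // Assumed mapping; POSIX ERE has no \w, \d or \s of its own.
    $map = [
        '\\w' => '[[:alnum:]_]',
        '\\d' => '[[:digit:]]',
        '\\s' => '[[:space:]]',
    ];
    $warnings = [];

    // Walk every backslash escape so that an escaped backslash ("\\") followed by "w" is left alone.
    $translated = preg_replace_callback(
        '/\\\\(.)/s',
        function (array $m) use ($map, &$warnings): string {
            if (isset($map[$m[0]])) {
                $warnings[] = $m[0] . ' is not available in POSIX ERE; using ' . $map[$m[0]];
                return $map[$m[0]];
            }
            return $m[0];
        },
        $pattern
    );

    return ['pattern' => $translated, 'warnings' => $warnings];
}

print_r(pcreToPosixSketch('^\w+@\d+$'));
```

Collecting the warnings separately from the translated pattern lets a caller show the warning first and only apply the translation if the user accepts it.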

At the moment, the application only supports Qt’s internal format, and I think it correctly represents much of what that format supports.  The format is very much like a slightly restricted PCRE format, so Qt/PCRE conversion should be fairly straightforward.

Expression Testing

The editor currently has an element at the bottom of the layout which allows you to test the regular expression for given short strings.  This is good for most cases, since it allows you to have a few regexp “unit tests” of sorts, where you test fringe cases and observe if it matches, partially matches, and whether the capture groups work as expected.

In addition to this it would be useful to have a few other methods of testing included.  The testing widget should eventually be a tabbed widget with the currently available tester as one tab and at least two additional panes: a “bulk text” pane which takes paragraph-length or longer input and highlights every section of it that is matched by the regular expression, and a “replacement” pane which takes input of a similar length and applies the regular expression with a given replacement string (which could also be a regular expression).
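
To make the two proposed panes a little more concrete, here is a small PHP sketch of the operations they would perform, using PHP's PCRE functions rather than the editor's own backends.  Both function names are hypothetical.

```php
<?php
// Hypothetical "bulk text" helper: every match in the text, with its byte offset for highlighting.
function findMatches(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $found, PREG_OFFSET_CAPTURE);

    $matches = [];
    foreach ($found[0] as $hit) {
        $matches[] = ['offset' => $hit[1], 'text' => $hit[0]];
    }
    return $matches;
}

// Hypothetical "replacement" helper: returns null if the pattern is invalid.
function applyReplacement(string $pattern, string $replacement, string $text): ?string
{
    return preg_replace($pattern, $replacement, $text);
}

print_r(findMatches('/\d+/', 'a12 b345'));
echo applyReplacement('/(\d+)/', '<$1>', 'a12 b345'), "\n";
```

The offsets are what a highlighting pane would need in order to mark the matched ranges in the original text.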

Anyway, that’s what I’ve been working on and some of what I want to include later.  Work goes on. :)

A New Project

December 7th, 2009

I have recently been working on a new project under the working title of “Expression editor”, an application which allows for easy editing of Regular Expressions (regexps) in a similar way to KDE3.x’s KRegExpEditor.  I used to love KRegExpEditor for the incredibly useful functionality it provided, and in particular the visualisation of the regular expression as you edited it.  Being able to see graphically what the regexp was doing made dealing with long cryptic regexps much easier, and I felt it was a shame that it was not (as far as I know) ported to Qt4 and KDE SC 4.x.

A Screenshot of KRegExpEditor in Use

Since I felt it was a very useful application that deserved to be ported to Qt4, I have started my own replacement, written from the ground up in Qt4 (I decided to replace it rather than port it mostly as a learning experience).  If it reaches a good level of stability I may consider porting it to be a KDE SC 4.x application, but for now I’m just focusing on building a working replacement.  After working on this for two weeks (start date: 23rd November 2009) I’ve reached a state where things are starting to come together.  If anyone’s interested, the app is licensed under the GPLv3 and is available from GitHub.  Any bugs or feature requests are welcome at the project’s Issues page.  As of fairly recently it looks like this:

Expression Editor With an Email Matching Regexp Open

At the moment it includes some Oxygen icons, but due to the license on those they will be replaced before I release an actual stable version of the application.

Remember, if you do try it, it’s nowhere near stable yet – and a fair bit is yet to be implemented (like the drag and drop / GUI editing of expressions).

My Little FAQ For PHP Pitfalls

November 6th, 2009

There are a few questions which come up time and time again in Zymic’s support channels (Zymic IRC, or the forums) and having answered them several times, I feel I would like to spend some time writing up a response which I hope can make handling these support queries a bit easier. So, ok, firstly:

PHP Notices

Since Zymic’s PHP is set to display errors of the level E_NOTICE, users often find they are getting errors on Zymic that they are not getting on their local test server or previous host.  E_NOTICE level errors are good practice recommendations that point out where you’ve written valid but improper PHP.  It is recommended that rather than suppress E_NOTICE level warnings you display them during the development process and fix your code so that the errors are never generated.  This improves the quality and reliability of your code.

If your local test server does not display notices then I would recommend you change your local PHP configuration to start generating them.  You can do this by modifying the “error_reporting” directive in your php.ini file.  In a system where notices are disabled it is typically set up as this:

error_reporting = E_ALL & ~E_NOTICE

This setting takes E_ALL, a bitmask combining many error levels including notices, and then removes notices from it by ANDing it with ~E_NOTICE (“not E_NOTICE”).  So you can enable E_NOTICE level errors by making it simply:

error_reporting = E_ALL
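
If you cannot edit php.ini, the same levels can be set from within a script.  The sketch below uses the standard error_reporting() and ini_set() functions, and also shows the bitmask arithmetic behind the two settings above; the variable name is just for illustration.

```php
<?php
// Report everything, notices included, for the rest of this script.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// The "notices disabled" setting clears the E_NOTICE bit from the E_ALL mask.
$withoutNotices = E_ALL & ~E_NOTICE;

var_dump(($withoutNotices & E_NOTICE) === 0);   // the E_NOTICE bit is cleared
var_dump((E_ALL & E_NOTICE) === E_NOTICE);      // E_ALL includes the E_NOTICE bit
```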

The next few topics cover some specific E_NOTICE level errors you may have come across: what they mean, and how to resolve them.

PHP Notice: Undefined Index or Undefined Variable

This notice refers to using a variable or a member of an array that has not been defined prior to its use.  The simplest example of this is to take this code:

<?php
   echo $foo; // At no point have we defined $foo, so using it is probably bad
?>

This will produce an error along the lines of “PHP Notice: Undefined variable: foo in __FILE__ on __LINE__”. This is obviously a very trivial example and it rarely comes that simple in real code, but it is an illustration of what triggers this error. Attempting to reference some data that doesn’t exist can easily lead to unexpected behaviour, and it’s fairly obvious that all variables/array members you try to read from should be defined and initialised before you try that read.

A common context where “Undefined Index” will turn up is in handling the superglobals like $_POST. If you attempt to use $_POST['foo'] without first checking that some input with the name “foo” has been posted to the script, it will trigger this notice. So, that’s what the notice is and what causes it, but how can you prevent it from appearing? Well, PHP has some tests which can check the data, so you can make sure that you only attempt to read it if the variable/array member is set or is not empty. Say you want something that outputs “Hello, {name}” to the browser and works on GET data. An implementation which will trigger this notice might be:

<?php
   echo 'Hello, ', htmlspecialchars($_GET['name'], ENT_QUOTES);
?>

But this script doesn’t work particularly well when a name isn’t provided: you will get a notice and the output will just be “Hello, ”, which is not particularly meaningful.  We can do better than that.  If we add a check using PHP’s empty() function to see whether some data has been provided, we could instead write:

<?php
   echo 'Hello, ';
   if(empty($_GET['name']))
      echo 'Stranger';
   else
      echo htmlspecialchars($_GET['name'], ENT_QUOTES);
?>

This is very similar to the last, but it will not trigger a notice if there is no name provided to the script, and in fact handles it by instead outputting “Hello, Stranger” if no name is provided. This is a bit more elegant, and there are many ways you could tackle this. You could have the case where if no input is provided we show a form instead for the user to provide a name.
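
One of those other ways is sketched below, as an illustration rather than a recommendation.  It wraps the check in a function (the name greeting() is made up for this example) and uses isset() instead of empty(), because empty() also treats the string “0” as missing.

```php
<?php
// Hypothetical helper: builds the greeting from an input array such as $_GET.
function greeting(array $input): string
{
    // isset() avoids the notice; the '' check keeps a name of "0", which empty() would reject.
    if (!isset($input['name']) || !is_string($input['name']) || $input['name'] === '') {
        return 'Hello, Stranger';
    }
    return 'Hello, ' . htmlspecialchars($input['name'], ENT_QUOTES);
}

echo greeting($_GET), "\n";
```

Passing the input array in as an argument, rather than reading $_GET inside the function, also makes the behaviour easy to check with different inputs.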

PHP Notice: Use of undefined constant

This one is very similar to the last, but is worth mentioning separately because there’s enough to be said about the mistakes that lead to it, and about how you can make these mistakes without realising it if you don’t have the benefit of notices. This notice is about the use of undefined constants in code, and the interesting thing is how PHP handles the case where an undefined constant is found. It will trigger this notice, but since this is valid PHP it will still pick a value for the script to use: the identifier (or name) of the constant. Thus an undefined constant foo will evaluate to the string ‘foo’.

Because of this you often see cases where an array member is referenced via $array[foo] rather than $array['foo']. If notices are disabled, then it is possible that the author of that code will not notice the mistake (though syntax highlighting in your editor should mitigate this somewhat) because while foo is an undefined constant the two are equivalent. However, this is not always the case. An easy way to demonstrate this is to take file1.php to be this:

<?php
   $array = array( 'foo' => 'Hello there', 'bar' => 'Goodbye');
   echo $array[foo]; // Note here we've not quoted foo so it's a constant,
                     // and in the context of just this script it's undefined.
?>

When you visit this page you will find it triggers this notice and outputs “Hello there”, however if we were to also have a file2.php containing this:

<?php
   define('foo','bar'); // define the constant foo with the value "bar"
   require 'file1.php'; // and then bring in the file we just wrote,
                        // with this constant in scope
?>

When you visit file2.php you’ll find the notice has gone and we’re now presented with “Goodbye”. This is the crux of the issue: when you unintentionally use an undefined constant you no longer have any idea how that script will run when it is part of a larger program, so what it will actually do is effectively undefined. So, if what you mean to write is a string, make sure it has quotes around it, so the PHP parser knows that it’s a string and not a constant.
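
The same effect can be seen in a single file.  This is a condensed sketch of the two-file example above, with the define() moved inline and two illustrative variable names added:

```php
<?php
$array = ['foo' => 'Hello there', 'bar' => 'Goodbye'];

define('foo', 'bar');      // stands in for some other file defining a constant named foo

$quoted = $array['foo'];   // a string key: always 'Hello there'
$bare   = $array[foo];     // the constant foo, which now holds 'bar', so this reads 'Goodbye'

echo $quoted, "\n", $bare, "\n";
```

The quoted key keeps meaning the same thing whatever else has been defined, while the bare one depends on whichever constants happen to be in scope.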

That’s it… for now

I may add to this at a later date to cover anything else that gets asked frequently.  We shall see.

Musings on Syntax Highlighting for Websites

February 6th, 2009

Syntax highlighting can be very important to some websites, particularly those featuring articles on programming practice/theory or pastebin/nopaste websites for collaborative debugging.  However, most highlighting packages use pattern matching to attempt to correctly highlight a given document, rather than a more accurate but more complex lexical parser, and they do not have the capacity to use multiple highlighting schemes within a single document.  If you choose to highlight something as PHP, then only the PHP segments will be syntax highlighted, when it’s plausible that there will also be HTML, XML, JavaScript, CSS, etc. in the same document.

These are two features lacking from most existing syntax highlighting packages today, and ones that I think would be extremely useful to have in publicly available free software tools.  The question is simply whether it is feasible to include them, or whether what we’ve got currently is as good as it’s likely to get.

Pattern Matching versus Lexical Parsing

These are the two main ways of taking a source document and producing a highlighting for it.  Pattern matching uses regular expressions to attempt to catch recognisable patterns in the given language.  This is simpler to produce, but does not guarantee good results.  Lexical parsing, on the other hand, is much more complex to implement but, when done well, is a more flexible method for producing a highlighting of some input.
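
As a sketch of the pattern matching approach, the PHP function below highlights double-quoted strings, variables and a handful of keywords with a single regular expression.  The function name, the CSS class names and the short keyword list are all assumptions made for this example, not taken from any particular highlighting package.

```php
<?php
// Hypothetical pattern-matching highlighter: wraps recognised pieces in <span> elements.
function highlightByPattern(string $code): string
{
    $pattern = '/(?<string>"(?:[^"\\\\]|\\\\.)*")'
             . '|(?<variable>\$[A-Za-z_]\w*)'
             . '|(?<keyword>\b(?:echo|if|else|return)\b)/';
    preg_match_all($pattern, $code, $hits, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $html = '';
    $pos  = 0;
    foreach ($hits as $hit) {
        [$text, $offset] = $hit[0];
        // Escape the unmatched text between the previous match and this one.
        $html .= htmlspecialchars(substr($code, $pos, $offset - $pos));
        foreach (['string', 'variable', 'keyword'] as $class) {
            // Groups that did not take part in the match are absent or have offset -1.
            if (isset($hit[$class]) && $hit[$class][1] !== -1) {
                $html .= '<span class="' . $class . '">' . htmlspecialchars($text) . '</span>';
                break;
            }
        }
        $pos = $offset + strlen($text);
    }
    return $html . htmlspecialchars(substr($code, $pos));
}

echo highlightByPattern('echo $a;'), "\n";
echo highlightByPattern('// if'), "\n";
```

The second call shows the weakness: the pattern has no idea that “if” sits inside a comment, so it is still marked as a keyword.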

Lexical parsing involves going through the input from start to finish, breaking the input up into “tokens”, which are small segments of the input with some associated meta-data.  In essence it breaks the code provided down into its components: strings, keywords, variables, etc.  The power of this model is that while the parser is working it can use its state information on things like scope and context to provide more accurate and more informative details.  In fact, a full lexical parser would be able to identify syntax errors and highlight them automatically.
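
PHP happens to expose its own tokeniser through token_get_all(), which makes for a convenient illustration of what a token stream looks like.  The wrapper function and the 'PUNCTUATION' label below are assumptions for this sketch; token_get_all() and token_name() are the real built-in functions.

```php
<?php
// Hypothetical wrapper: turns PHP source into a flat list of typed tokens.
function tokenise(string $source): array
{
    $tokens = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Named tokens arrive as [id, text, line number].
            $tokens[] = ['type' => token_name($token[0]), 'text' => $token[1], 'line' => $token[2]];
        } else {
            // Single-character tokens such as ';' or '(' arrive as plain strings.
            $tokens[] = ['type' => 'PUNCTUATION', 'text' => $token, 'line' => null];
        }
    }
    return $tokens;
}

foreach (tokenise('<?php echo "hi";') as $token) {
    echo $token['type'], ': ', trim($token['text']), "\n";
}
```

A highlighter built on this would choose a style per token type instead of guessing from the raw text.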

As to providing more information, with tokenised input it would be fairly trivial to note which braces/brackets match one another, and unlike a pattern matching system you can include information from other parts of the program – take the simple example of a C++ typedef, something simple like “typedef vector<string>::iterator vec_iter”, which provides a new shorthand type “vec_iter” as a vector<string> iterator.  While a pattern matching model could probably work out that vec_iter was a type, it would not know what it represented, or if it was valid.  A lexical parser would be able to add a note saying “this is a vector<string> iterator” provided the typedef was in the provided sample.
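
The bracket matching case can be sketched with a simple stack over the token stream.  The example below uses PHP's built-in token_get_all(); the function name and the returned array are assumptions, and the returned pairs are positions in the token list rather than character offsets.

```php
<?php
// Hypothetical helper: pairs up brackets by their position in the token stream.
function matchBrackets(string $source): array
{
    $closers = [')' => '(', ']' => '[', '}' => '{'];
    $stack   = [];
    $pairs   = [];

    foreach (token_get_all($source) as $index => $token) {
        if (is_array($token)) {
            // "{$" and "${" inside strings open a brace that a plain "}" later closes.
            if ($token[0] === T_CURLY_OPEN || $token[0] === T_DOLLAR_OPEN_CURLY_BRACES) {
                $stack[] = ['{', $index];
            }
            continue;
        }
        if (in_array($token, $closers, true)) {
            $stack[] = [$token, $index];          // an opening bracket
        } elseif (isset($closers[$token])) {
            $open = array_pop($stack);
            if ($open === null || $open[0] !== $closers[$token]) {
                return ['ok' => false, 'pairs' => $pairs];   // stray or mismatched closer
            }
            $pairs[$open[1]] = $index;
        }
    }
    return ['ok' => count($stack) === 0, 'pairs' => $pairs];
}

print_r(matchBrackets('<?php f(1);'));
```

Because the brackets are read from tokens rather than raw characters, a bracket inside a string literal or comment is never mistaken for real syntax.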

Of course, while it is probably a superior method from a functional standpoint, it is significantly more complicated to implement, which raises the question of whether the benefits are worth the extra effort required to produce the highlighter.  My personal view is that pattern matching is, for the moment, the better option for things like articles, where we are confident that the input is a valid piece of code and thus should be fine in a normal highlighter.  For uses like pastebin/nopaste sites though, it would be beneficial to have this kind of extra information since they are often used for collaborative debugging.  There, highlighting syntax errors would be valuable, as would flagging other possible errors such as a used type whose definition is not available (this might not be a true error, as the definition may be in a file not provided to the highlighter, but it could still be worth noting, and it would definitely be useful for self-contained testcases).

Language Nesting in Code Samples

The other limitation in many existing syntax highlighters is that they are not able to apply several different language highlighting schemes to one piece of provided input.  This can be annoying when you’re highlighting things like web pages, which can easily contain HTML, with nested CSS (in <style></style>), nested JS (in <script></script>) and perhaps server-side languages like some PHP (in <?php ?>).

For the most part, just selecting one language to highlight works, since it’s unlikely that more than one requires significant attention at once, but there are situations where it would be useful to have each block highlighted separately.  This would either require the user to select ranges of code to highlight with different language engines, or require the highlighter to attempt to determine automatically which language each segment of code is written in.  The first is tedious for the end-user, and would likely lead to the product not being used, and the latter adds significant complication to the highlighter.
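
A very rough sketch of the automatic option is shown below: it splits a page into segments using the delimiters mentioned above and labels each with a language.  The function name and the language labels are assumptions, and since it is itself only pattern matching it is easily fooled, for example by a “?>” inside a PHP string.

```php
<?php
// Hypothetical splitter: cuts a page into [language, text] segments by their delimiters.
function splitByLanguage(string $document): array
{
    $pattern = '/(<\?php.*?\?>|<script\b[^>]*>.*?<\/script>|<style\b[^>]*>.*?<\/style>)/is';
    $parts   = preg_split($pattern, $document, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $segments = [];
    foreach ($parts as $part) {
        if (stripos($part, '<?php') === 0) {
            $language = 'php';
        } elseif (stripos($part, '<script') === 0) {
            $language = 'javascript';
        } elseif (stripos($part, '<style') === 0) {
            $language = 'css';
        } else {
            $language = 'html';
        }
        $segments[] = [$language, $part];
    }
    return $segments;
}

print_r(splitByLanguage('<p>Hi</p><?php echo 1; ?><style>p{}</style>'));
```

Each segment could then be handed to the highlighting engine for its own language.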

These are things I would like to see included in the functionality of pastebin/nopaste websites, but due to the complexity I can’t expect them to just turn up one day.  So, given that, I figure I might give it a go, simply writing a fairly cut-down proof-of-concept to maybe appear with pastesite one day (in C++ rather than PHP, since I don’t expect PHP to perform well enough for this).  As to whether I’ll ever finish it, that remains to be seen, but I do think such a product would be beneficial to the programming community as a whole, and I hope if I don’t do it maybe someone else will.

Arbutus TC v1

January 7th, 2009

So, what have I been doing with my time recently? Bits and pieces of personal project tinkering, and also one small paid project. This project was a website for the Systems Engineering consulting company Arbutus Technical Consulting. It was recognised that they needed an effective web presence to help bring in business for the company, and I was hired to build that website to the specification provided.

The website was specified to be a very simple, mostly static collection of pages including an easy to use blog system for comments the company’s primary consultant had on issues related to Systems Engineering.  I designed a simple interface and translated that into a working website which Arbutus can use to advertise themselves to potential clients.

If you’re interested then have a look at what I came up with for Arbutus Technical Consulting.

zBot No More

December 13th, 2008

The title here is more sensational than it needs to be, I’m not discontinuing the zbot 2.0 project – rather, I’ve just decided that if I want to release it publicly, then I would like a more generic parent name for the software.  The instance of the bot in irc.zymic.com will likely retain the name. :)

So, what’s the new name for the software?  Well, that’s possibly still in flux, but for the moment I’ve decided I might go with polymer.  A slightly nerdy nod to the fact I want to make this release inherently extensible, as much as is needed.

Polymerisation

Of course, with a name like that I really should elaborate on just why it’s going to be more modular and easily extensible than the previous bot.

There was nothing really wrong with the implementation before, and much of it has been kept constant in the new implementation.  The most notable changes are in the format used to write modules, which has been simplified somewhat – and the fact that module interaction will be made possible allowing modules to reuse functionality included in a module that is already loaded into the bot.

My current WIP draft for module layout is this:

<?php
 
// module description up here?
//
// that would make sense.
 
class mod_example
{
   // any local variables required
   private $localvar;
   private $othervar;
 
   /// Core and required methods:
   // init, performs any initialisation required
   public function init()
   {
      // initialisation stuff... for example:
      $this->loadConfig(); // <-- (re)loads configuration in {confdir}/{name}.conf
 
      // the triggers and hooks
      registerTrigger('ping','respond');
      registerHook('passive');
   }
 
   // a rehash method which handles a complete config reload
   public function rehash()
   {
      // rejig internals in case our config has changed
      $this->loadConfig(); // <-- like this again perhaps?
   }
 
   /// about() and help() will probably make an appearance though since some are
   /// internal they need not include them.
 
   /// Trigger/Hook implementation
   // Here's a trigger, triggered of course by a !ping command.
   public function respond()
   {
      // note the lack of any arguments, instead the information will automatically
      // be made available through methods/variables contained within the base
      // class.  This simplifies format, and allows us to make triggers and hooks
      // constant.
      $target = $this->state->target;
      $nick   = $this->state->caller;
      // module intercommunication:
      if( is_object($_irc) )
         $_irc->msg($target,$nick.', pong');
   }
 
   // And here's a hook, called on every new packet
   public function passive()
   {
      // do stuff
   }
 
   /// And as always, you can declare internal functions for personal use.
   private function helper($arg1,$arg2)
   {
      // do something helpful
   }
}
 
?>

This format may undergo changes as I work on it, but it’s likely to look something like the above when I’m done. :)

Any suggestions/comments are very welcome, since this is going to be the interface anyone who wants to write modules will be using, so it’s important that it’s suitably intuitive.

Setting an Agenda

November 30th, 2008

If you’ve checked up on the site since my last blog entry, you’ve probably noticed I have indeed started on the pages for the rest of the site.  The about and contact pages are finished, and the footer now automatically displays the four most recent blog entries.  Now comes the main bit of the work, writing a CMS to manage my current and completed projects.

The Main Site

So, what exactly is going to be on the main site?  If you’ve seen the front page you can probably mostly guess.  There will be project pages for each project I’m currently working on or have completed, and there will be one of each set as “featured” works displayed on the front page and in the blog footer (the other spot will be filled by the most recent project).

The project pages themselves will be written descriptions of the project: what it’s about, what it’s aiming to produce, what I want to learn from it, what’s being used to implement it.  It will also have a section for relevant blog articles, which will be automatically fetched from here by selecting all the articles with a given tag (so for the zbot2 project, I will look for a “zbot” tag on blog articles).

This should provide a good page to point people at when they have questions about a project, and can serve as a home for any projects I decide I would like to release publicly.

What About zbot?

I mentioned I was hoping to start zbot v2.0 soon, I will hopefully be starting that almost directly after finishing the main site on here.  In the meantime it’s time to start some design for the structure of the program and its source files.  When the project does get started I’ll make note here, and hopefully there’ll be a working core available before too long.

Other Work

This site and zbot aren’t the only things I’m doing, however.  There’s another project I would like to write which I have not really begun looking at yet, and which requires a fair bit of research before I can start.  I’ll probably look into a few of the topics I’ll need for it while I’m working on the other projects (this site and zbot), and will hopefully write a few articles on them to help cement my understanding and to share what I’ve found out.

So, hopefully we’ll see the rest of the main site taking shape over the next week or two.  But if I don’t have time to blog for a little while, don’t think I’ve stopped working, hopefully it means quite the opposite, but we’ll have to see. ;)

Personal Site Work

November 26th, 2008

So, as predicted there’s already been a pretty big gap, but that’s not to say I’ve been neglecting the blog.  I’ve had a new bespoke design drafted up by a friend I know through a web development community, which I hope to get properly skinned for WP and set up as a personal site on alex-elliott.co.uk soon.  The designer is Adam McPeake, who is linked in the blogroll under wized (wized.net is his portfolio).

Previews

So, while I get to work converting the design into a WP theme and writing a CMS for my personal site, here’s a preview of what it should eventually look like (the main site, and the blog):

Main Site Preview Blog Preview

Hope to have some more updates about this soon, or maybe you’ll be reading this in the lovely new theme. :)

UPDATE: as you may have noticed the new design has been skinned into WP (and I rediscovered precisely why I hate that theme system). Hopefully things still work, but if they don’t please do comment and let me know.

NOTE: yes, the rest of the site isn’t done yet, what’s up is an example of what it should look like when done (at least the index page), I’ll start on getting the main top links at least drafted into markup now – and I should be able to finish the about and contact pages completely within a few days.

Exciting stuff. :-)

Concurrency in PHP

November 13th, 2008

One of the problems you come across when writing real-time applications in PHP is that it is in most cases a linear language: it performs the tasks set before it one at a time, and doesn’t contain much in the way of tools to help set up concurrent tracks of evaluation.

As far as I’m aware PHP doesn’t contain anything at all for controlling threads.  It does, however, include some process control functions which allow us to use a multi-process model for concurrent programming.

Introducing Process Control

The functions we need come via the Process Control (PCNTL) module in PHP.  This is not included by default in the PHP Apache module (because it’s more complicated in such cases), so if you want to use it for a website you will need to use PHP as a CGI module or compile mod_php with --enable-pcntl.  In this case I’ll be testing using the PHP CLI binary, in which it’s included by default.

The key function that this module provides for us is pcntl_fork().  This works much like the fork you may have come across in other languages, it creates a clone of the current process which then computes a separate code path to the parent that called it.  The important thing of course being that the clone (or child) process does its computation while the parent carries on with whatever it was doing.  This is key to parallelising our PHP applications, and allows us to create responsive applications which can still do lengthy tasks.

Process Control in Use

So where might this be useful?  A trivial example would be when you want to perform a task that takes a number of seconds, and you want the user to receive output from the program while it’s being computed.  A simple example of this could be the code below:

<?php
 
// this is a place-holder for a hypothetical function
// which takes a long time to compute, let's say the
// eventual answer would be 5...
function foo()
{
   sleep(10);
   return 5;
}
 
file_put_contents('tmp',0); // we'll also assume we know the answer isn't 0.
 
// Here's the fork, the process ID will be stored in $pid for the main
// program, and it'll be 0 for the child process.
$pid = pcntl_fork();
if($pid == -1)
   die('Fork failed'); // Our fork failed, thus our program did.
elseif($pid == 0)
{
   // here's our forked process
   file_put_contents('tmp',foo());
   echo "Done\n";
   exit;
}
 
// In the meantime we'll keep our parent process looping.
$timestamp = time();
while(file_get_contents('tmp') == 0)
{
   if($timestamp != time())       // once a second...
   {
      echo 'Calculating...',"\n"; // give some output
      $timestamp = time();
   }
   usleep(100000);                // pause briefly so the loop doesn't hog the CPU
}
 
echo file_get_contents('tmp')."\n"; // output our answer from the fork.
 
?>

As you see we have simply made a test-case here, there’s a function with a sleep(10); which takes the place of any long-running function, and while it is running we inform the user that the task is underway, once a second until it is complete.  It’s quite a trivial example, but it shows that both the echo statements and the sleep were functioning in parallel when viewing the output, which is predictably as follows:

bash-3.1$ php -f test.php
Calculating...
Calculating...
Calculating...
Calculating...
Calculating...
Calculating...
Calculating...
Calculating...
Calculating...
Done
5

Another example might be if you were performing a repetitive task on many targets, such as all the files in a directory.  You could loop across the directory forking off a new process which would handle each file, then back in the main loop you simply wait for all of the processes to finish and then wrap up any loose ends you might have.
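
A minimal sketch of that fork-per-item pattern is below.  The function name forkEach() is made up for this example, each child reports success or failure through its exit status, and the sequential fallback for systems without the pcntl extension is an addition to keep the sketch runnable anywhere.

```php
<?php
// Hypothetical helper: runs $work on each item in its own child process, then waits for them all.
function forkEach(array $items, callable $work): array
{
    $statuses = [];

    if (!function_exists('pcntl_fork')) {
        // Assumed fallback: without the pcntl extension, just do the work in sequence.
        foreach ($items as $key => $item) {
            $statuses[$key] = $work($item) ? 0 : 1;
        }
        return $statuses;
    }

    $children = [];
    foreach ($items as $key => $item) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Fork failed');
        }
        if ($pid === 0) {
            // Child: do the work, then report success (0) or failure (1) as the exit status.
            exit($work($item) ? 0 : 1);
        }
        $children[$pid] = $key;   // parent: remember which child handles which item
    }

    // Back in the main loop: wait for every child to finish and collect its exit status.
    foreach ($children as $pid => $key) {
        pcntl_waitpid($pid, $status);
        $statuses[$key] = pcntl_wexitstatus($status);
    }
    return $statuses;
}

// Example: "process" every file in the current directory by checking that it is readable.
print_r(forkEach(glob('*') ?: [], 'is_readable'));
```

For a large directory you would probably want to cap the number of children running at once, which this sketch does not attempt.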

Conclusion

So there you go, process control.  A useful tool if you need to speed up a parallelisable program/algorithm.  Of course, in most cases speed isn’t too necessary in PHP applications, so the practical applications of process control aren’t very numerous.  In cases where speed is not important, it is usually better to keep the application simple, since it means it is more likely to be correct.

However, if you are writing something real-time and don’t want to bog down your main application loop and make your application unresponsive, process control comes into its own and will save the day.

Concurrency in zBot

As I mentioned in the previous post, concurrency is something I’ve been meaning to include in zbot for a while now. zbot is a real-time PHP application, and as an IRC bot, a response time of over a second or two is starting to seem sluggish. Sometimes, however, a response can’t be given in this length of time because it relies on an external site, and so the bot must wait for that site before it can reply. In such cases I would like the main loop to be free to continue watching for more easily serviced queries, so that it can reply to those first while it is waiting.

It also allows me to safely parcel off largish tasks without having to worry that the bot will ping out, as the core will continue looking for PING packets. For example, I would like to implement a news plugin which would download five or more RSS feeds and then parse them to check for new news stories. This would be ideal for process control, as I could fork off another process which would handle the downloads and parsing while the main bot continues to function.

For this purpose I have written a taskScheduler class to which I can send jobs which are then evaluated in a forked process. The class keeps track of all scheduled jobs, and performs a callback when the job is completed. This might prove to be a useful separate class, so I’ll likely both include it in zbot, and separately on my site with some documentation when it’s finished.
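
To give an idea of the shape such a class could take, here is a small sketch.  It is not the actual taskScheduler class: the class name, method names and the use of a temporary file to pass the result back from the child are all assumptions made for this example, and it requires the pcntl extension.

```php
<?php
// Hypothetical sketch of a fork-based scheduler; not the real taskScheduler class.
final class TaskSchedulerSketch
{
    /** @var array Jobs still running, keyed by child process ID. */
    private $jobs = [];

    // Runs $job in a forked child; $callback later receives the job's return value.
    public function schedule(callable $job, callable $callback): void
    {
        $file = tempnam(sys_get_temp_dir(), 'job');
        $pid  = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Fork failed');
        }
        if ($pid === 0) {
            // Child: compute the result, hand it back through the temp file, then stop.
            file_put_contents($file, serialize($job()));
            exit(0);
        }
        $this->jobs[$pid] = ['file' => $file, 'callback' => $callback];
    }

    // Non-blocking check, intended to be called once per iteration of the main loop.
    public function poll(): void
    {
        foreach ($this->jobs as $pid => $job) {
            if (pcntl_waitpid($pid, $status, WNOHANG) === 0) {
                continue;   // this child is still running
            }
            $result = unserialize(file_get_contents($job['file']));
            unlink($job['file']);
            unset($this->jobs[$pid]);
            ($job['callback'])($result);
        }
    }

    public function pending(): int
    {
        return count($this->jobs);
    }
}
```

In a bot, the main loop would call poll() on each pass, in between checking the socket for PING packets and other traffic, so a slow job never blocks it.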