Alejandro Erickson
Postdoctoral Research Associate at the University of Victoria

A little randomness can yield remarkably organic images. The JavaScript snippet below assigns random colours to pixels, row by row from the top, in such a way that each pixel's colour stays similar to the colours of the pixels near it.

Try the Random Canvas generator yourself.

Remember: Each new image you generate has NEVER been created nor seen before. If you like it, then drag it to your desktop! If you make a really nice one, or if you like this, leave a comment.

Press the buttons to generate a new image. Drag images to your desktop to save.


The JavaScript snippet

I conceived the idea and wrote the snippet in 2012; both are released under a CC BY-NC 3.0 (Attribution-NonCommercial) license.

function newDrawing(cwidth, cheight) {
    var c = document.getElementById("myCanvas");
    var ctx = c.getContext("2d");
    var id = ctx.createImageData(cwidth, cheight);

    // Byte offsets into the RGBA data: 4 bytes per pixel, so iw4 is one full row.
    var i, iw4 = id.width * 4, iw2 = id.width * 2, iw = id.width;

    // Give the top-left pixel a random, fully opaque colour.
    id.data[0] = Math.random() * 255;
    id.data[1] = Math.random() * 255;
    id.data[2] = Math.random() * 255;
    id.data[3] = 255;

    var dir, mv, j;

    // Change this to change the amount you move in each colour.
    var amount = 10;

    for (i = 4; i < id.height * iw4; i += 4) {
        id.data[i + 3] = 255; // alpha: always opaque

        for (j = 0; j < 3; j++) {
            // Occasionally (p = 0.05) step this channel up or down
            // relative to the pixel on the left.
            if (Math.random() <= 0.5) { dir = -1; } else { dir = 1; }
            if (Math.random() > 0.95) { mv = 1; } else { mv = 0; }
            id.data[i + j] = id.data[i + j - 4] + amount * dir * mv;
            // Alternative: Math.max(0, Math.min(255, id.data[i+j-4] + (0.5 + Math.random())*dir));

            if (i > iw4 + 4) {
                // Past the first row: blend with the row above. The second
                // branch counts the above-left pixel twice, which is what
                // produces the wavy drift.
                if (i % iw2 > iw || Math.random() > 0.6) {
                    id.data[i + j] = (id.data[i + j] + id.data[i + j - iw4 + 4] +
                                      id.data[i + j - iw4 - 4] + id.data[i + j - iw4]) / 4 +
                                     Math.random() - 0.5;
                } else {
                    id.data[i + j] = (id.data[i + j] + id.data[i + j - iw4 + 4] +
                                      id.data[i + j - iw4 - 4] + id.data[i + j - iw4 - 4]) / 4 +
                                     Math.random() - 0.5;
                }
            } else if (i >= iw2 + iw && i < iw4 && Math.random() > 0.99) {
                // Near the end of the first row, occasionally pull back toward
                // the first pixel's colour so the row wraps smoothly.
                id.data[i + j] = ((iw4 - i) * id.data[i + j] + (i - iw2 - iw) * id.data[j]) / iw +
                                 Math.random() - 0.5;
            }
        }
    }
    ctx.putImageData(id, 0, 0, 0, 0, cwidth, cheight);

    // Mirror the canvas into an <img> so it can be dragged to the desktop,
    // and wire the save button to force a download.
    var img = c.toDataURL("image/png");
    var imagepng = document.getElementById("image");
    imagepng.src = img;
    var button = document.getElementById("save");
    button.onclick = function () {
        window.location.href = imagepng.src.replace('image/png', 'image/octet-stream');
    };
}

How the Random Canvas works

Think of a digital image as a grid of pixels, where each pixel has a colour.

Pixels

In this image, each pixel is a similar colour to the one to its left (or, for the first pixel in a row, to the rightmost pixel in the row above). How do we get similar colours?

Colours

To get from one colour to the next, we subtracted 2 from red, added 2 to green, and subtracted 2 from blue.

To get the whole image, we started by colouring the top-left pixel, then chose a similar colour for the next pixel by adding or subtracting a small amount of each colour, and so on, until we had coloured all the pixels, row by row, ending with the last one at the bottom right.

We randomly decide whether to add or subtract a bit of red, green and blue, so that each new pixel is a random colour similar to the last one.

But that isn’t quite the end of the story, is it? After all, the pixels generated by the buttons below are similar in colour to the pixels above and below them, not just beside them. If we only did what I described, the results would look like this:

We fix this hideous behaviour by taking the average of two colours. After we have done the first row, we take our random choice of colour and change it to the average of its own colour and that of the pixel in the row above.

Pixels average

We are almost done! You might have noticed that there are some wavy patterns that can’t be created by merely taking the average with the pixel above. Those come from taking averages with the pixel above-left or above-right, depending on how far along the row we are. They are just details that don’t change the main idea.
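For the code-minded, here is a stripped-down sketch of just these two steps (the random walk along a row, and the average with the row above), written in C over a raw RGBA buffer laid out like the canvas ImageData. The function and constants are my own, and the wavy-pattern details are left out.

#include <stdint.h>
#include <stdlib.h>

/* Fill an RGBA buffer (4 bytes per pixel, row-major) with the random walk:
   each channel steps a little away from the previous pixel, and from the
   second row on it is averaged with the pixel directly above. */
void random_canvas(uint8_t *data, int width, int height) {
  int stride = width * 4; /* bytes per row */
  for (int c = 0; c < 3; c++)
    data[c] = rand() % 256; /* random colour for the top-left pixel */
  data[3] = 255;
  for (int i = 4; i < height * stride; i += 4) {
    data[i + 3] = 255; /* opaque alpha */
    for (int j = 0; j < 3; j++) {
      int dir = (rand() % 2) ? 1 : -1;    /* add or subtract a bit...     */
      int v = data[i + j - 4] + 10 * dir; /* ...of the pixel to the left  */
      if (i >= stride)                    /* past row one: average above  */
        v = (v + data[i + j - stride]) / 2;
      data[i + j] = v < 0 ? 0 : (v > 255 ? 255 : v); /* clamp to a byte   */
    }
  }
}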


It’s sugar cane season!

Chewing on stalks of raw sugar cane was a wonderful part of my childhood in Vancouver, and while visiting my home town today, I came across this delight once again. This time, however, I’ve made a delicious drink for grown-ups.

Simply mix sugar cane juice with rye whisky at a ratio of about 4:3.

One section of sugar cane should yield around 40ml of juice, which is enough for a single shot (30ml) of whisky. Chill the rye and the sugar cane. Extracting the juice from the cane (without chewing it) is a little challenging: I used a large heavy-duty garlic press, and it was considerable work. A conventional fruit-and-vegetable juicer would do it, I'm sure, as would an industrial cane juicer.

Enjoy the pictures and leave a comment!

Sugar Cane Pieces

In the press

Pressing the Cane

The Sugar Cane Juice Mixer

Crown Royal Canadian Rye Whisky (30ml weighs about 28g)

Mixing the Rye with the Sugar Cane

The Finish


Datacenters provide low-cost, efficient, and flexible computing resources: users and applications (e.g., Google Search, Pokemon Go) share a pool of resources and can expand into that pool as needed. Because datacenters are such a critical building block of our digital world, every aspect of them undergoes constant research and development in both industry and academia.

At the physical level, all datacenters are built upon a network of servers, but there are many variations on exactly which hardware is used, and the topology (or graph) of the interconnections. A particularly interesting design is the server-centric datacenter network (SCDCN), for several reasons: it encourages the use of very cheap, and even legacy, components; many network topologies are possible; it allows the incorporation of existing, and extensive, research on high-performance interconnection network topologies; communication within the datacenter is fully programmable within the servers, making it an excellent platform for independent and academic research and development. Our work is precisely on algorithms for such communication, called routing algorithms.

Before I go on, let me describe how we model SCDCNs as a graph. An SCDCN with N servers and N/n switches, each with n ports, is represented by a graph on N server-nodes and N/n switch-nodes, connected by links. Each server-node, switch-node, and link represents the corresponding physical component of the topology. I’ll avoid excessive formality here by dropping the “-node” part. For reasons detailed in the paper, there are no switch-to-switch connections.
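As a data structure, this model is nothing more than an ordinary graph with two kinds of node. A minimal sketch in C (my own representation, not anything prescribed by the paper) could be:

typedef enum { SERVER, SWITCH } node_kind;

/* One node of the SCDCN graph: a server or a switch, plus its links. */
typedef struct {
  node_kind kind;
  int n_links;
  int *links; /* indices of neighbouring nodes in the graph's node array */
} dcn_node;

/* An SCDCN with N servers and N/n switches of n ports each is then an
   array of N + N/n such nodes, in which no link joins two switches. */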

Many DCN topologies can be defined recursively: a level-k network is built from several level-(k-1) networks, connected to one another by bridge links. In our research we use this recursive structure and the bridge links to derive efficient routing algorithms for recursively defined networks. We apply them to two well-known SCDCNs proposed by Microsoft Research: DCell and FiConn.

We show, via an extensive experimental evaluation using our flow-level simulator, that our newly proposed routing algorithms can yield significant improvements in hop length, fault tolerance, load balance, network throughput, end-to-end latency, and workload completion time, compared with the existing routing algorithms for (Generalized) DCell and FiConn.

DCell, FiConn, and many other recursively-defined networks ship with a natural Dimensional routing algorithm, which is described for DCell below (a small code sketch follows the list). Given a source server src and a destination server dst:

  • Identify the smallest m such that a level-m DCell contains both src and dst;
  • compute the bridge link (dst’, src’) between the level-(m-1) DCell containing src and the level-(m-1) DCell containing dst;
  • recursively compute paths from src to dst’ and from src’ to dst; and
  • piece together the recursively computed paths with the bridge link.
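To make the recursion concrete, here is a minimal, runnable sketch of Dimensional routing on a level-1 DCell, in C. It assumes the standard DCell wiring at level 1, where, for i < j, server j-1 of DCell_0 number i has a bridge link to server i of DCell_0 number j; the types and names are mine, not code from the paper.

#include <stdio.h>

/* A server in a level-1 DCell: its DCell_0 and its port on that switch. */
typedef struct { int cell, index; } server;

/* Print the dimensional route from src to dst, hop by hop. */
void dimensional_route(server src, server dst) {
  if (src.cell == dst.cell) {
    /* Base case: both servers hang off the same switch. */
    if (src.index != dst.index)
      printf("  (%d,%d) -> switch %d -> (%d,%d)\n",
             src.cell, src.index, src.cell, dst.cell, dst.index);
    return;
  }
  /* Bridge link between DCell_0 i and DCell_0 j, i < j: ((i,j-1),(j,i)). */
  int i = src.cell < dst.cell ? src.cell : dst.cell;
  int j = src.cell < dst.cell ? dst.cell : src.cell;
  server a = { i, j - 1 }, b = { j, i };
  server out = (src.cell == i) ? a : b; /* dst', in src's DCell_0 */
  server in  = (src.cell == i) ? b : a; /* src', in dst's DCell_0 */
  dimensional_route(src, out);          /* recurse: src to dst'   */
  printf("  bridge (%d,%d) -> (%d,%d)\n", out.cell, out.index, in.cell, in.index);
  dimensional_route(in, dst);           /* recurse: src' to dst   */
}

int main(void) {
  /* A DCell_1 built from 4-port switches has five DCell_0s, numbered 0..4. */
  server src = { 0, 3 }, dst = { 4, 1 };
  printf("Dimensional route from (0,3) to (4,1):\n");
  dimensional_route(src, dst);
  return 0;
}

At higher levels the same recursion applies, with the base case replaced by routing within a level-(m-1) DCell.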

It may be the case, however, that an alternative level-(m-1) DCell, called a proxy, can be used, such that a shorter path can be constructed from two bridge links and three recursively computed paths. An algorithm that searches for the proxy substructure is called a Proxy routing algorithm. This is precisely the situation depicted in the figure below, for a level-2 FiConn that uses switches with 4 ports, where the proxy route is 2 server-server hops shorter than the dimensional route.

Proxy Routing in FiConn

Dimensional route (dashed black-grey), with 7 hops, and proxy route (thick black), with 5 hops, highlighted on a FiConn(2,4).

In our paper we identify and evaluate efficient ways of searching for proxy substructures that yield short paths. It’s a bit technical, and the details are in the paper, but the basic idea is depicted schematically in the figure below. We restrict the length of at least one of the recursively computed paths by looking at the bridge links incident to nodes near src or dst, and considering only the proxies that those bridge links are connected to. We compute the length of the proxy path through each such proxy, as well as link-states (e.g., traffic congestion) along the path, and then output the best path.

Intelligent search for proxies

Strategy for reducing the size of the proxy search: exploit the structure of the network to choose proxy substructures X_{k-1}^c that are inherently close to src or dst. Substructures are depicted as rounded rectangles; GP_I and GP_0 are two possible schemes.

The details are in our paper:

A. Erickson, J. Pascual Saiz, J. Navaridas, and I. A. Stewart.
Routing Algorithms for Recursively-Defined Datacenter Networks. Submitted to a journal. A previous version appeared in Proc. IEEE Trustcom/BigDataSE/ISPA 2015, vol. 3, pp. 84–91, 2015.


In 2012-2013 I travelled to several remote First Nations communities around British Columbia to do hands-on mathematical activities with elementary school kids and teachers. Math Catcher, which provided the free math presentations, has also produced several mathematical stories about a young boy named Small Number that have been translated into Aboriginal languages from across Canada. I have embedded all of Small Number’s stories and translations here (updated 31 August 2016) for your ears to enjoy the sound of these languages.

Click here to support this program directly by giving to Simon Fraser University.

Small Number Counts to 100

Written by Veselin Jungic & Mark MacLean. Illustrated by Simon Roy.

Original English.

Apsch Akichikiwan akichikeet isko mitatomitano. Cree Translation by Barry Cardinal.

Poksskaaksini Kiipipo Ito’tokstakiiwa. Blackfoot Translation by Connie Crop Eared Wolf and Eldon Yellowhorn.

Small Number and the Old Canoe

Written by Veselin Jungic & Mark MacLean. Illustrated by Simon Roy.

Original English.

Small Number and the Old Canoe: Squamish. Squamish Translation by T'naxwtn, Peter Jacobs of the Squamish Nation.

Skwi’ikwexam Qes te Wet’ot’e Slexwelh. Halq'eméylem Translation by Siyamiyateliyot, Kwelaxtelot, and Kwosel, Seabird Island First Nation.

šɛ nuxʷɛɬ hɛga Mɛnaθey. Sliammon Translation by Mabel Harry, Karen Galligos, and Oshelle from the Sliammon Nation.

Gadim G̱an g̱anhl W̓ii Mm̓aal. Nisga'a Translation by Hlguwilksihlgum Maaksgum Hlbin (Emma Nyce), Ksim Git Wil Aksnakw (Edna Nyce-Tait), and Wilp Sim’oogit Hleeḵ (Allison Nyce).

Háusayáyu'u du glwa. Heiltsuk Translation by Constance Tallio and Evelyn Windsor.

ʔanaḥʔis Huksyuu ʔuḥʔiš qiickʷiiʔaƛʔi č̓apac. Huu-ay-aht Translation by Benson Nookemis from the Nuu-Chah-Nulth Nation.

Small Number and the Old Canoe. Hul'q'umi'num' Translation by Ruby Peter (Sti’tum’at).

Small Number and the Basketball Tournament

Written by Veselin Jungic (SFU) and Mark MacLean (UBC). Illustrated by Jean-Paul Csuka, Canada.

Original English.

APSCH AKICHIKIWAN IKWA KAMISIKTIT KOSKEWNTOWAN METAWEWINA. Cree Translation by Barry Cardinal from Bigstone Cree Nation.

Poksskaaksini Itsinssiwa Ki’tsa’komootsiisini. Blackfoot Translation by Connie Crop Eared Wolf and Eldon Yellowhorn.

čišɛtawɬ kʷ ƛiqɛtəs ta p̓əʔač - Mɛnaθey. Sliammon Translation by Mabel Harry, Karen Galligos, and Oshelle from the Sliammon Nation.

Small Number and the Big Tree

Written by Veselin Jungic and Mark MacLean. Illustrated by Simon Roy and Jess Pollard.

Original English.

kwi'kwushnuts 'i' tthu thi thqet. Hul'q'umi'num' Translation and narration by Delores Louie (Swustanulwut).

mɛnaθɛy hɛga t̓aχəmay. Sliammon Translation by Betty Wilson (Ochele) from the Sliammon Nation.

Etsím Skw’shim iy ta Hiyí Stsek. Squamish Translation by Setálten, Norman Guerrero Jr. of the Squamish Nation.

Small Number and the Skateboard Park

Written by Veselin Jungic & Mark MacLean. Illustrated by Simon Roy.

Original English.

Apsch Akichikewin ikwa eta kasooskawtuykeetaw metaweewnihk. Cree Translation by Barry Cardinal.

Small Number and the Salmon Harvest

Written by Veselin Jungic & Mark MacLean. Illustrated by Simon Roy & Jess Pollard.

Original English.

Etsíim Skw’eshim na7 ta Tem Cháyilhen. Squamish Translation by X̱elsílem Rivers.

Skw'ikw'xam Kw'e S'olex̱ems Ye Sth'oqwi. Halq'eméylem Translation by X̱otxwes.

Small Number and the Salmon Harvest. Sliammon Translation by Betty Wilson.

kwi’kwushnuts ’i’ tthu tsetsul’ulhtun’. Hul'q'umi'num' Translation by Ruby Peter (Sti’tum’aat).

ʔeʔimʔaƛquu mimityaqš ʔanaḥʔis Huksyuu. Huu-ay-aht Translation by Benson Nookemis from the Nuu-Chah-Nulth Nation.

Small Number and the Old Totem Pole

Written by Veselin Jungic and Mark MacLean. Illustrated by Bethani L’Heureux.

Original English.

I am unfamiliar with these (and all) Aboriginal languages, so please let me know if I have made any errors in this blog post, and especially if the special characters are not rendering correctly.


“Cracking the Coding Interview 6e” question 4.10: T1 and T2 are binary trees with m and n nodes, respectively. Test whether T1 contains T2 as a subtree. Here is my solution, which performs much better than the book’s solution.

The problem asks, more specifically, if there is a node in T1 such that the subtree rooted there has exactly the same structure as T2, as well as the same values at each node. Spoilers follow the image below, so if you are looking for hints, proceed carefully.

Test subtree containment

CCtI6e discusses two solutions. The first makes the observation that a tree is uniquely identified by its pre-order traversal if you include special markers for null children, so the tree-containment problem reduces to a substring problem: does toString(T1) contain toString(T2)? CCtI6e blurs the solution by using a library call to solve the substring problem (it is not trivial to do efficiently), but they do claim that this solution runs in O(m+n) time and O(m+n) space. It’s a valid answer, but I want something better.
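For concreteness, here is one way that serialization could look in C; the '#' marker and the function name are my own choices, and the node struct matches the one used in my solution below. Since the string determines the tree exactly, T1 contains T2 precisely when serialize(T1) contains serialize(T2) as a substring, and a linear-time matcher such as Knuth-Morris-Pratt keeps the whole test within the claimed O(m+n) time and space.

#include <stdio.h>

typedef struct node {
  int value, height;
  struct node *left;
  struct node *right;
} node;

/* Pre-order traversal with an explicit '#' marker for null children, so
   the output determines the tree's structure as well as its values. */
void serialize(node *t, FILE *out) {
  if (!t) {
    fputs("# ", out);
    return;
  }
  fprintf(out, "%d ", t->value);
  serialize(t->left, out);
  serialize(t->right, out);
}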

The second solution tries to improve on the O(m+n) space of the first. The naïve way would be to do a depth-first-search comparison of T2 against the subtree rooted at each node of T1, resulting in an O(log(m) + log(n))-space algorithm that uses O(min(m*log(m),m*n)) time. CCtI6e suggests using the values of the root of the T1-subtree and the root of T2 to decide whether or not to perform the search, but this ultimately makes no difference at all: first, if the values are different, the search would abort immediately anyway; and second, the nodes of T1 and T2 could all hold the same data, say 0. Below is a drawing of a bad case, where most of T2 occurs at many nodes of T1, but T1 does not contain T2.

A bad case for recursive subtree testing

I was surprised to see such a naïve solution to this problem, as CCtI6e has been very good so far. The recursive solution can be improved upon considerably by using a better heuristic for deciding whether or not to compare T2 with the subtree at a given node of T1.

Hash the structure and data of a binary tree

We want to hash each node so that, if the hashes of two trees match, there is a high probability that the trees themselves match. As a proof of concept, I adapted Knuth’s multiplicative method to work on binary trees with integer values. The idea is to re-hash the hash values of the left and right children, as well as the height and value of the current node, using a distinct 32-bit prime for each part, and then add the four multiplicative hash values. I’ve read that hashing is part guess-work and part theory, and there you have it.

/* Combine the child hashes, value, and height of a node, multiplying each
   part by a distinct 32-bit prime; unsigned wrap-around is fine in a hash. */
uint32_t hash(uint64_t left,
              uint64_t right,
              uint64_t value,
              uint64_t height){
  return ((left&(0xFFFFFFFF))*2654435761 +
          (right&(0xFFFFFFFF))*2654435741 +
          (value&(0xFFFFFFFF))*2654435723 +
          (height&(0xFFFFFFFF))*2654435713);
}

/* Hash a whole tree bottom-up; the empty tree hashes to 0. */
uint32_t dfs_hash(node *t){
  if(!t) return 0;
  return hash(dfs_hash(t->left),dfs_hash(t->right),t->value,t->height);
}

My hash function seems to work quite well for randomly generated trees, but it is not thoroughly tested or anything. I’d like to eventually adapt a more sophisticated function, perhaps from Eternally Confuzzled.

Using the tree-hashes

My algorithm works as follows:

  • Pre-compute the hash of T2.
  • Do a depth-first search of T1, where at each node:
    • if the subtree rooted at either child of the current node contains T2, return True; otherwise, record the hashes of the two children;
    • hash the current node in O(1) time from the child hashes;
    • if the hash of the current node matches the hash of T2:
      • do a depth-first search that compares T2 to the subtree rooted at the current node;
      • if T2 is identical to the subtree rooted at the current node, return True;
    • return the hash of the current node.

Note that I do not cache hash values. The idea in this exercise is to avoid using unnecessary space beyond the O(log(m)+log(n)) space necessarily used by the depth-first searches. Caching hash values would not only use O(n+m) additional space, but keeping the hash values up to date in a non-static binary tree would also require modifying the binary tree implementation.

The above pseudo-code is implemented below, and the complete source-code is at the end of this blog post.

typedef struct node{
  int value,height;
  struct node *left;
  struct node *right;
}node;

node *rec_contains_subtree(node *t1, node *t2, uint32_t *t1ChildHash, uint32_t t2hash){
  uint32_t leftChildHash, rightChildHash;
  node *res;
  *t1ChildHash = 0;
  if(!t1)
    return NULL;
  // Search the children first; their hashes are computed as a side effect.
  if((res = rec_contains_subtree(t1->left, t2, &leftChildHash, t2hash)) ||
     (res = rec_contains_subtree(t1->right, t2, &rightChildHash, t2hash)))
    return res;
  count_nodes_hashed++; // instrumentation counter, declared in the full source
  // Hash this node in O(1) time from the child hashes.
  *t1ChildHash = hash(leftChildHash, rightChildHash, t1->value, t1->height);
  if(*t1ChildHash == t2hash){
    count_dfs_compare++; // instrumentation counter, declared in the full source
    // Hashes match: confirm with a node-by-node comparison.
    if(dfs_compare_trees(t1, t2))
      return t1;
  }
  return NULL;
}

node *contains_subtree(node *t1, node *t2){
  uint32_t t2hash, t1ChildHash;
  if(!t2)
    exit(-1); // searching for an empty tree is treated as a caller error
  t2hash = dfs_hash(t2); // pre-compute the hash of t2
  return rec_contains_subtree(t1, t2, &t1ChildHash, t2hash);
}

Below is the other depth-first search, which compares T2 with a subtree of T1. Notice that the CCtI6e solution is implicit in the last conditional.

int dfs_compare_trees(node *t1, node *t2){
  if(!t1 && !t2)
    return 1;
  if(!t1 && t2)
    return 0;
  if(t1 && !t2)
    return 0;
  if(t1->value!= t2->value)
    return 0;
  return dfs_compare_trees(t1->left,t2->left) &&
    dfs_compare_trees(t1->right,t2->right);
}

Testing on some large inputs

I did some tests on randomly generated trees. This is by no means a robust test, especially as regards hash collisions. I’ll paste some output and then explain it.

Below, I insert 10,000,000 nodes into T1 and 100 nodes into T2, by descending the trees on randomly selected branches and inserting at the first empty node. All data values are 0s. Since T1 turns out not to contain T2, all 10,000,000 nodes of T1 are searched, but notice that the hashing saves me from doing any node-by-node comparisons with T2.

1e-05 seconds to allocate trees
10.3298 seconds to create t1
2e-05 seconds to create t2
t1 has 10000000 nodes
t2 has 100 nodes
0.928297 seconds to check if t1 contains t2
t1 does not contain t2
Did dfs_compare_trees 0 times
Hashed 10000000 nodes of t1

With the same tree T1 as above, I regenerated T2 100 times and repeated the same test as above. Again, the node-by-node comparison (dfs compares) was done 0 times per search.

Test containment in t1 of randomly generated trees t2
Nodes: t1 10000000 t2 100; Trials: 100
All nodes hold value=0
Average per search:
	time 0.944213;
	dfs compares 0;
	prop. nodes hashed 1;
	containment 0

I repeated the previous experiment, below, with smaller trees T2 of 20 nodes each. Now we see that 6 of the 100 T2s are contained in T1, but dfs_compare_trees ran about 10 times more often than we found something. It shows that the hashing gives quite a few false positives for small T2! More testing needed.

Test containment in t1 of randomly generated trees t2
Nodes: t1 10000000 t2 20; Trials: 100
All nodes hold value=0
Average per search:
	time 0.96471;
	dfs compares 0.58;
	prop. nodes hashed 0.964319;
	containment 0.06

In order to test containment of larger trees T2, I simply used subtrees of T1, selected (pseudo-)uniformly at random. Since my search never compares node addresses, only structure and values, it is unaware that T1 and T2 share locations in memory. Below we see that very few nodes of T1 are hashed, and that a hash collision occurs in about 12% of the searches, since dfs_compares/containment = 1.12.

Test containment in t1 of randomly selected subtrees of t1
Nodes: t1 10000000 t2 (unknown); Trials: 100
All nodes hold value=0
Average per search:
	time 0.0682518;
	dfs compares 1.12;
	prop. nodes hashed 0.0711984;
	containment 1

The running time is difficult to compute (and maybe impossible to prove) because of the way it depends on the hash function. Informally, we can state an expected time of O(m+(1+c)n), where c is the number of hash collisions. If c is small enough, this becomes O(m). This is much better than CCtI6e’s time of O(m+kn), where k is the number of nodes of T1 that have the same value as the root of T2: in our 0-value trees, that could easily amount to O(m*log(m)), the cost of doing a full depth-first search of T1 at every node.

The code, which is a bit rough and quick, is available as a gist: