Migrating to Markdown Pt3 From Domino To Markdown

Domino, and in particular BlogSphere V3 by the wonderful Declan Lynch, has been my blogging platform for years now and has served me well and faithfully, but all good things come to an end. As part of the constant fiddling with new stuff that LDC do, I have moved my blogging platform to Markdown (the wretch Ben Poole got me started on it) on the Statamic platform. But what about the years of blog entries that are happily snuggled down in my NSF file?

Java to the rescue. I have written a little agent that takes all the blog entries and exports them to Markdown files regardless of whether they are HTML or rich text, glues the existing comments onto the end of each post, exports quick images and emoticons (while changing their references in the blog posts) and makes a redirects file so all your old external links still work.

Just copy the code below into a Java agent and set ‘baseExportDir’ to wherever you want the site to export to. When you run the agent you will end up with a bunch of Markdown files representing the blog entries, a redirects.txt file containing the 301 redirects, whose contents you can paste into the .htaccess file in your root directory so all your old links work, a “page” directory with the emoticons in it, and a “blog” directory with all the quick images in it.

As always, yell if there is something missing or wrong

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.text.Normalizer;
import java.util.Calendar;
import java.util.Date;
import java.util.Enumeration;
import java.util.Vector;
import java.util.regex.Pattern;
import lotus.domino.AgentBase;
import lotus.domino.AgentContext;
import lotus.domino.Database;
import lotus.domino.DateTime;
import lotus.domino.Document;
import lotus.domino.DocumentCollection;
import lotus.domino.EmbeddedObject;
import lotus.domino.NotesException;
import lotus.domino.RichTextItem;
import lotus.domino.Session;
import lotus.domino.View;
public class JavaAgent extends AgentBase {
    public void NotesMain() {
        try {
            Session session = getSession();
            AgentContext agentContext = session.getAgentContext();
            Database db = agentContext.getCurrentDatabase();
            String baseExportDir = "C:\\markdownexport\\";
            // ****** start document export ******
            View content = db.getView("vw_Content_Blogs");
            File theDir = new File(baseExportDir);
            if (!theDir.exists())
                theDir.mkdir();
            // create a file to store all the 301 redirections for existing blog
            // entries
            File redirectfile = new File(baseExportDir + "redirects.txt");
            Writer redirect = new BufferedWriter(new FileWriter(redirectfile));
            Document doc = content.getFirstDocument();
            while (doc != null) {
                Writer output = null;
                if (doc.getItemValueString("FORM").equals("content_BlogEntry")) {
                    String filename = "";
                    Vector dM = doc.getItemValue("EntryDate");
                    DateTime dt = (DateTime) dM.elementAt(0);
                    System.out.println(dt.getLocalTime());
                    Date date = dt.toJavaDate();
                    Calendar cal = Calendar.getInstance();
                    cal.setTime(date);
                    // create file name in the correct format for statamic
                    filename = Integer.toString(cal.get(Calendar.YEAR)) + "-" + String.format("%02d", cal.get(Calendar.MONTH) + 1) + "-" + String.format("%02d", Integer.valueOf(cal.get(Calendar.DAY_OF_MONTH)));
                    filename = filename + "-" + doc.getItemValueString("EntryTitle").trim().replaceAll(" ", "-").replaceAll("\\\\", "-").replaceAll("/", "-").replaceAll(":", "-").replaceAll("\\?", "-").replaceAll("\"", "-");
                    File file = new File(baseExportDir + filename + ".md");
                    redirectfile.getParentFile().mkdirs();
                    // add redirect to redirect file
                    redirect.write("Redirect 301 /d6plinks/" + doc.getItemValueString("PermaLink") + " /blog/" + filename);
                    redirect.write("\r\n");
                    output = new BufferedWriter(new FileWriter(file));
                    // start of meta (front matter block)
                    output.write("---");
                    output.write("\r\n");
                    output.write("title: '" + doc.getItemValueString("EntryTitle").trim() + "'");
                    output.write("\r\n");
                    if (doc.getItemValueString("EntryStatus").trim().equals("Published")) {
                        output.write("status: live");
                        output.write("\r\n");
                    } else {
                        output.write("status: draft");
                        output.write("\r\n");
                    }
                    // end of meta
                    output.write("---");
                    output.write("\r\n");
                    output.write("\r\n");
                    String body = "";
                    // get the body text and clean it up for UTF-8 standard
                    if (doc.getItemValueString("EntryHTML").trim().length() > 1) {
                        body = doc.getItemValueString("EntryHTML").trim();
                    } else {
                        body = doc.getItemValueString("EntryRICH").trim();
                    }
                    // strip accents; optionally also chain .replaceAll("[^\\p{ASCII}]", "")
                    body = Normalizer.normalize(body, Normalizer.Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
                    // replace smart quotes and other Microsoft characters
                    body = removeMSRubbish(body);
                    // point attachment URLs at the new image directory
                    output.write(body.replaceAll("http.*?\\$File", "/assets/img/blog"));
                    output.write("\r\n");
                    output.write("\r\n");
                    // get all the old comments and add them to the bottom of
                    // the blog
                    DocumentCollection responses = doc.getResponses();
                    if (responses.getCount() > 0) {
                        output.write("Old Comments");
                        output.write("\r\n");
                        output.write("------------");
                        output.write("\r\n");
                        output.write("\r\n");
                        Document rdoc = responses.getFirstDocument();
                        while (rdoc != null) {
                            // setting as h5 in markup
                            output.write("##### " + rdoc.getItemValueString("nameAuthor") + "(" + rdoc.getCreated().toString() + ")");
                            output.write("\r\n");
                            String comment = Normalizer.normalize(rdoc.getItemValueString("body").replaceAll("http.*?\\$File", "/assets/img/page"), Normalizer.Form.NFD).replaceAll("\\p{InCombiningDiacriticalMarks}+", "").replaceAll("[^\\p{ASCII}]", "");
                            comment = removeMSRubbish(comment);
                            output.write(comment);
                            output.write("\r\n");
                            output.write("\r\n");
                            rdoc = responses.getNextDocument(rdoc);
                        }
                    }
                    output.close();
                }
                doc = content.getNextDocument(doc);
            }
            redirect.close();
            // ****** end document export ******
            // ****** start export quick images, the html references
            // '/assets/img/blog' but im just exporting them to 'blog'******
            String imagesExport = "blog\\";
            View view = db.getView("lkp_QuickImages");
            doc = view.getFirstDocument();
            File theImagesDir = new File(baseExportDir + imagesExport);
            if (!theImagesDir.exists())
                theImagesDir.mkdir();
            boolean saveFlag = false;
            while (doc != null) {
                RichTextItem body = (RichTextItem) doc.getFirstItem("ImageFile");
                Vector v = body.getEmbeddedObjects();
                Enumeration e = v.elements();
                while (e.hasMoreElements()) {
                    EmbeddedObject eo = (EmbeddedObject) e.nextElement();
                    if (eo.getType() == EmbeddedObject.EMBED_ATTACHMENT) {
                        eo.extractFile(baseExportDir + imagesExport + eo.getSource());
                    }
                }
                doc = view.getNextDocument(doc);
            }
            // ****** end export quick images ******
            // ****** start export emoticon images, the html references
            // '/assets/img/page' but im just exporting them to 'page'******
            imagesExport = "page\\";
            view = db.getView("lkp_Emoticons_Web");
            doc = view.getFirstDocument();
            theImagesDir = new File(baseExportDir + imagesExport);
            if (!theImagesDir.exists())
                theImagesDir.mkdir();
            while (doc != null) {
                System.out.println(doc.getItemValueString("EmoticonName"));
                Vector v = session.evaluate("@AttachmentNames", doc);
                System.out.println("emoticon:" + v.firstElement().toString());
                EmbeddedObject eo = doc.getAttachment(v.firstElement().toString());
                eo.extractFile(baseExportDir + imagesExport + v.firstElement().toString());
                doc = view.getNextDocument(doc);
            }
            // ****** end export images ******
        } catch (NotesException e) {
            System.out.println(e.id + " " + e.text);
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
    public String removeMSRubbish(String body) {
        // smart single quotes and apostrophe
        body = body.replaceAll("[\u2018\u2019\u201A]", "'");
        // smart double quotes
        body = body.replaceAll("[\u201C\u201D\u201E]", "\"");
        // ellipsis
        body = body.replaceAll("\u2026", "...");
        // dashes
        body = body.replaceAll("[\u2013\u2014]", "-");
        // circumflex
        body = body.replaceAll("\u02C6", "^");
        // open angle bracket
        body = body.replaceAll("\u2039", "<");
        // close angle bracket
        body = body.replaceAll("\u203A", ">");
        // spaces
        body = body.replaceAll("[\u02DC\u00A0]", " ");
        return body;
    }
}
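
If you want to check the file naming and redirect format without firing up Domino, here is a minimal sketch of just that part of the agent. It is an illustration, not the agent itself: the class name `SlugSketch`, the helper names, the sample title and the permalink value `ABCD-1234` are all made up for the example, and the six separate `replaceAll` calls are folded into one character class.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class SlugSketch {
    // Builds the "YYYY-MM-DD-title" name the agent uses for each Markdown file.
    static String buildFilename(Calendar cal, String title) {
        String datePart = String.format("%04d-%02d-%02d",
                cal.get(Calendar.YEAR),
                cal.get(Calendar.MONTH) + 1, // Calendar months are zero-based
                cal.get(Calendar.DAY_OF_MONTH));
        // Same characters the agent replaces: space, backslash, slash, colon,
        // question mark and double quote all become a hyphen.
        String slug = title.trim().replaceAll("[ \\\\/:?\"]", "-");
        return datePart + "-" + slug;
    }

    // Builds one line of redirects.txt in Apache "Redirect 301" form.
    static String buildRedirect(String permaLink, String filename) {
        return "Redirect 301 /d6plinks/" + permaLink + " /blog/" + filename;
    }

    public static void main(String[] args) {
        // Hypothetical entry dated 5 January 2013
        Calendar cal = new GregorianCalendar(2013, Calendar.JANUARY, 5);
        String filename = buildFilename(cal, "Hello: what/now?");
        System.out.println(filename); // 2013-01-05-Hello--what-now-
        System.out.println(buildRedirect("ABCD-1234", filename));
    }
}
```

Note that every replaced character becomes its own hyphen, so a title with a colon followed by a space ends up with a double hyphen, exactly as it would in the agent.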

Microsoft Naff Characters

I should have done this ages ago. The daft non-standard Microsoft characters have bitten me in the bum for years, and I always have to strip them out, so here is a quick and dirty function for it.

    public String removeMSRubbish(String body) {
        // smart single quotes and apostrophe
        body = body.replaceAll("[\u2018\u2019\u201A]", "'");
        // smart double quotes
        body = body.replaceAll("[\u201C\u201D\u201E]", "\"");
        // ellipsis
        body = body.replaceAll("\u2026", "...");
        // dashes
        body = body.replaceAll("[\u2013\u2014]", "-");
        // circumflex
        body = body.replaceAll("\u02C6", "^");
        // open angle bracket
        body = body.replaceAll("\u2039", "<");
        // close angle bracket
        body = body.replaceAll("\u203A", ">");
        // spaces
        body = body.replaceAll("[\u02DC\u00A0]", " ");
        return body;
    }
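
For anyone who wants to try it outside an agent, here is a small self-contained sketch that runs the function on a sample string, with the accent stripping from the export agent applied first. The class name `CleanupSketch`, the `clean` helper and the sample text are made up for the example; the Unicode escapes are written out in full and the character classes contain no `|`, because inside square brackets a pipe is matched literally and would be replaced too.

```java
import java.text.Normalizer;

class CleanupSketch {
    // Replaces the usual Microsoft "smart" characters with plain ASCII.
    static String removeMSRubbish(String body) {
        body = body.replaceAll("[\u2018\u2019\u201A]", "'");   // single quotes
        body = body.replaceAll("[\u201C\u201D\u201E]", "\""); // double quotes
        body = body.replaceAll("\u2026", "...");               // ellipsis
        body = body.replaceAll("[\u2013\u2014]", "-");         // dashes
        body = body.replaceAll("\u02C6", "^");                 // circumflex
        body = body.replaceAll("\u2039", "<");                 // open angle bracket
        body = body.replaceAll("\u203A", ">");                 // close angle bracket
        body = body.replaceAll("[\u02DC\u00A0]", " ");         // small tilde, no-break space
        return body;
    }

    // Decomposes accented letters (NFD) and drops the combining marks.
    static String stripDiacritics(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Same order as the export agent: accents first, then the MS characters.
    static String clean(String text) {
        return removeMSRubbish(stripDiacritics(text));
    }

    public static void main(String[] args) {
        // Curly quotes, an accented e and an ellipsis, as pasted from Word
        String raw = "\u201CIt\u2019s caf\u00E9 time\u2026\u201D";
        System.out.println(clean(raw)); // "It's cafe time..."
    }
}
```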

Migrating to Markdown Pt1 The Rant

A couple of months ago that wretch Ben Poole introduced me to the joys of Markdown (he is looking at Octopress) and I fell instantly in love with it. The fact that it was a simple format that I could edit on any machine with a text editor, and that I did not have to remember much in the way of formatting, was a winner. Being a bit slapdash, I tend to make a lot of mistakes, and while I compensate on clients’ work by double-checking everything, that can get in the way when all you want to do is write a quick blog entry.

I also did not want to use a system such as Squarespace, even though that is what I set my dad up on and what a lot of my colleagues use, because I love to roll my own and have the complete control over my content that only my own server will give. On that note, there are some things I do want to give up and stop paying for: I’m using Amazon Web Services, so I’m being stingy with every CPU cycle and byte transferred. Search is out, so I will use a custom Google search (which I will enable once this new site has been indexed). I’m also bored of content moderation (I get more spam than comments, but I don’t want to be an arse about forced logons for comments, as I don’t leave ANY comments when other people make me jump through hoops), so I’m going to use Disqus. And I’m not hosting any files I can push to another server (such as jQuery).

Finally there is connectivity. I write the vast majority of my blogs and stuff when I am offline (if I was online then I would be working) and I am sick to the back teeth of connection issues or having to have a special client to be able to write a blog entry. Writing in a normal human-readable text file which I can then just sync up to my server is just what I want.

Next blog entry: the nuts and bolts of the set-up on the platforms I have chosen, which is Statamic on an Amazon Web Services box, using Dropbox to handle the syncing and updates.

See you in a day or so.

Migrating to Markdown Pt2 Nuts and Bolts

Setting up the awesome Statamic on Amazon EC2 and syncing via Dropbox is straightforward, and once set up it is dead reliable. I love the fact that I can edit files anywhere I can access Dropbox and they appear straight away on my site(s), with backups all sorted.

Prerequisites

A) A Dropbox account. I would recommend a new one just for this, otherwise you end up either having stuff you don’t need on the server or having to exclude loads of directories, a list which has to be updated every time you add a new one. Just get a standard free one to start with, as that will give you 2 GB plus of space, which is plenty, then share the directory you will be storing your website in with your normal Dropbox account.

B) An Amazon Web Services account. You will need one of these, as that is where the site will actually be hosted.

C) A licensed version of Statamic

Soo..
  • Create an EC2 instance (just use the quick start wizard) using Amazon Linux AMI 2012.09 (32 bit)
  • I prefer a static IP address, so these instructions assume you have requested an ‘Elastic IP’ address and associated it with your new EC2 instance

(I’m not putting in blow by blow instructions for creating an EC2 instance as the Amazon AWS site is as easy as their shopping one, kick me if you need full instructions)

Once you have your EC2 instance up you will need to connect to it via SSH. For the Linux and Mac boys this is easy, you can just open up a terminal; for Windows peasants I would recommend PuTTY. Use the command below, remembering to have your pem file (the security file you created when you made the EC2 instance) in the same directory as the one you are running the command from.

ssh -i xx.pem ec2-user@xx.xxx.xxx.xxx

Now that you are on the server we want to install Apache and PHP, which is dead easy

sudo yum install httpd php

Next we will install Dropbox with the following command (note that this is for the 32-bit version)

cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86" | tar xzf -

Next, start the Dropbox daemon, which will try to run Dropbox on the server

~/.dropbox-dist/dropboxd

If you’re running Dropbox on your server for the first time, you will be asked to copy a link into a browser. Do this and, when asked, log into your new Dropbox account.
While you’re there, create a folder in your Dropbox that you are going to store the site in. In this example mine will be called “XXXXX”.
You will see that a folder called ‘Dropbox’ has been created in your server’s (or more precisely the ‘ec2-user’) home directory.
(By the way, I had to do a ctrl-c to get my prompt back so I could carry on.)
Now we need a cool little package called dropbox.py, which is the Dropbox command line tool.

wget -O ~/dropbox.py "http://www.dropbox.com/download?dl=packages/dropbox.py"

(We are storing it in the root of our home directory, so when you call it you will either have to be in your home directory or reference it as ~/dropbox.py.)
Now we’re going to add a symbolic link that links our Dropbox folder to our web root. First run this command to stop Dropbox

python dropbox.py dropbox stop

This should give you the message ‘Dropbox daemon stopped’.
Now link the Apache www root to your new website directory on Dropbox

ln -s /var/www ~/Dropbox/XXXXX

Next you will have to change some permissions so that Dropbox can write to these directories

sudo chown -R $USER /var/www
sudo chmod -R u+rw /var/www

and restart Dropbox

python ~/dropbox.py dropbox start

You will now find that the contents of the www directory are appearing in your Dropbox folder. 🙂
You should see 4 folders: ‘cgi-bin’, ‘error’, ‘icons’ and ‘html’.
Extract the contents of your Statamic zip download into the ‘html’ directory.
That should be it really, and you should just be able to go to the IP address and see the default Statamic website, but I got an odd error message regarding default date formats when I tried it.
To fix this we need to set a timezone in your php.ini file. Edit your php.ini file like this

sudo vi /etc/php.ini

find the “date.timezone” setting by typing

/timezone

It will most likely look like this

date.timezone = ""

I changed mine to

date.timezone = "Europe/London"

To update a file in vi, press “i” to go into insert mode, change the text, then press the escape key to leave insert mode and type “:x” to save and exit.
vi is a bit of a pain to use; you can find a nice reference [here](http://www.lagmonster.org/docs/vi.html).
That should be it. Statamic should work fine and you should be able to see the default website and update it via Dropbox.
##### Extra Notes
If you want to use the online content manager for Statamic you will need to set the following permissions so that it can write to its directories

sudo chmod -R 777 /var/www/html/_config/users/
sudo chmod -R 777 /var/www/html/_content/

And finally, you want the Dropbox daemon to restart if the server gets restarted, so enter

crontab -e

This will give you a blank text file in vi (see above on how to navigate in vi). Add the following line and save

@reboot ~/.dropbox-dist/dropboxd

Kick me if anything is unclear.