Readme for GEDCom HTML Output Translation Interface (GHOTI)



This is the Perl program that I am developing to generate the HTML pages for my "Family history" and "Borsetshire history" web pages. I am making this program available on an open source basis, but for commercial applications I assert my intellectual property rights as described at the bottom of this file.

GHOTI stands for GEDCom HTML Output Translation Interface. Arguably this should be pronounced "Fish" (GH is pronounced "f" as in "enough", O is pronounced "i" as in "women", and TI is pronounced "sh" as in "nation"). However, I pronounce it "got-he". It's my program, and that's how I say it should be pronounced, so there!!

Features of Generated HTML:

The format of the web pages that Ghoti generates is, as far as I know, unique in several ways:

Data filtering optimised for HTML pages organised as family groups

It has a fairly complex privacy filtering algorithm which maximises what can be published while protecting the information of living individuals. The algorithm completely removes all individuals who are living (or probably living) or who had a spouse who is (probably) living.

Only relatives of a selected person (known as the seed individual) are included in the reports.

A configuration file can be used to assert that all details of an individual are suppressed irrespective of their living status. This overrides any filtering, and still removes the details even if filtering is not enabled. A private individual also suppresses any people who are only related to the "seed" individual through a private person.
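
The filtering rules above could be sketched roughly as follows. This is a Python illustration under stated assumptions, not Ghoti's Perl code: the Person record, the 100-year cut-off used to decide "probably living", the choice to treat an unknown birth date as probably living, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: Ghoti's real "probably living" test is not documented here.
ASSUMED_MAX_AGE = 100


@dataclass
class Person:
    name: str
    birth_year: Optional[int] = None
    death_year: Optional[int] = None
    spouses: list = field(default_factory=list)    # names of spouses
    relatives: list = field(default_factory=list)  # names of parents, children, spouses
    private: bool = False                          # marked private in the configuration


def probably_living(p, this_year):
    """No recorded death, and born recently enough (or at an unknown date)."""
    if p.death_year is not None:
        return False
    return p.birth_year is None or this_year - p.birth_year < ASSUMED_MAX_AGE


def publishable(people, seed, this_year, filter_living=True):
    """Return the set of names that may appear in the generated pages."""
    # Walk outwards from the seed, never passing through a private person,
    # so anyone related only via a private person is never reached.
    reached, queue = set(), [seed]
    while queue:
        name = queue.pop()
        if name in reached or people[name].private:
            continue
        reached.add(name)
        queue.extend(people[name].relatives)
    if not filter_living:
        return reached  # private people stay suppressed even with filtering off
    # Drop anyone who is (probably) living or has a (probably) living spouse.
    return {
        n for n in reached
        if not probably_living(people[n], this_year)
        and not any(probably_living(people[s], this_year) for s in people[n].spouses)
    }
```

In this sketch the private check is applied before the living check, which mirrors the statement that a private individual is suppressed even when filtering is not enabled.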

Web pages are organised as family groups

The number of boys and girls is listed for each family. Where some of these may be alive, the number of children who are living is shown, but not their names.

Half siblings are included in each family group.

Children of a family who have no known partner are listed at the bottom of their parents' family page rather than having their own family of one person.

The file names for families are based on the surnames of the husband and wife. This makes it easier to find, update and upload individual family pages, rather than having to update several hundred pages every time.
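
As an illustration of surname-based file names, here is a minimal Python sketch. The exact naming format that Ghoti uses is not described in this file, so the "Husband_Wife.html" pattern, the "Unknown" placeholder, the clash counter and the function name are all assumptions.

```python
import re


def family_file_name(husband_surname, wife_surname, taken=None):
    """Build a page name such as 'Smith_Jones.html' (the format is an assumption)."""
    def clean(surname):
        # Keep letters and digits only so the name is safe on any file system.
        return re.sub(r"[^A-Za-z0-9]", "", surname or "") or "Unknown"

    base = f"{clean(husband_surname)}_{clean(wife_surname)}"
    name = f"{base}.html"
    # Two couples can share both surnames, so add a counter on a clash.
    count = 2
    while taken is not None and name in taken:
        name = f"{base}_{count}.html"
        count += 1
    if taken is not None:
        taken.add(name)
    return name
```

Because the name depends only on the couple, regenerating the pages gives each family the same file name each time, so only the pages that changed need to be uploaded.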


The generator refers to a configuration file, which holds all the preferences for the web pages. The program is called with a single argument, and changing that argument changes the configuration file that is used. Hence, generation of the web pages is repeatable and consistent.

HTML Optimisations

The generated html references a single style-sheet, or .css file, thus making it easier to change the look and feel of all the web pages by changing this file. This makes it easier to make the family history pages have the same look and feel as the rest of a web site. This has the added benefit of minimising the size of the individual family HTML files.

The final web page contains no new-lines, thus minimising download time for the user. Since new-lines between tags have no meaning in html, this does not affect the look of the final web pages. The HTML generator program has an option to include new-lines and other white space characters during program development.

If a family has unknown grandparents or no children, no box is drawn for them.

All links are relative, so that the HTML may be relocated to a web-server without having to change all the directory paths.

All links are automatically searched and checked, to find broken links to any HTML file or picture which is referenced in the individual/marriage notes.
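
The idea behind the relative links mentioned above can be sketched in Python as follows. This is an illustration only, not Ghoti's Perl code; the function name and the directory names in the usage note are hypothetical.

```python
import posixpath


def relative_link(from_page, to_file):
    """Return the href that reaches to_file from the page at from_page.

    Both arguments are paths relative to the top of the generated site,
    written with forward slashes.
    """
    start = posixpath.dirname(from_page) or "."
    return posixpath.relpath(to_file, start)
```

For example, relative_link("families/Smith_Jones.html", "pics/small/a.jpg") gives "../pics/small/a.jpg". Since no link mentions the location of the site itself, the whole directory can be moved without editing any page.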


Prerequisites before being able to use Ghoti

Before being able to run Ghoti, you must have a Perl interpreter.

Perl is an open-source language which is designed to read and process text files. It runs on many operating systems with few compatibility issues. On Windows XP, I use Active Perl, and on Solaris and Linux, I use the default Perl interpreter. Ghoti runs on all of these with no dependencies on the operating system.


List of files in the distribution

borsetshire.ged          - GEDCom file of test data
BorsetshireWww.gcf       - Ghoti Configuration File to produce test html
ghoti.css                - example cascading style sheet
                         - Ghoti perl program
                         - Perl modules used by Ghoti
ghotiSkeletonCss.css     - default cascading style sheet.
ghotiSkeletonIndex.html  - default top page for generated html
ghotiSkeletonReadme.html - default html introduction page.
readme.html              - This file.

Other files generated during a run:

How to set up Ghoti and run the test data

Ghoti is distributed as a zipped file, which contains the open source ghoti perl program and a small fictional data test set.

To run the test data,

Step 1. Create a new directory and expand all the files into that directory.
Step 2. Go to the command prompt and "cd" to the directory that you created.
Step 3. Type "perl BorsetshireWww.gcf". This should now generate the html.
Step 4. Run your preferred web browser to view ./BorsetshireWww/index.html

You should be able to move/copy this directory anywhere, and all links should be retained.


How to set up your configuration and organise linked files

In the following instructions, you may use any name you choose instead of "myName":

To set up the configuration for your own data:-

Step 1. Make a copy of BorsetshireWww.gcf and name it myName.gcf. This file contains configuration data to direct Ghoti to the appropriate data and to configure the options that you want for the generation. The example file contains descriptions of what each setting is for. For all lines with a "#" in them, the characters after the "#" are comments and are ignored by Ghoti.

Step 2. Use a text editor, such as Notepad, to update myName.gcf as required. Hopefully the comments in the file are adequate to explain what each setting does. Note that you may need to organise your files as described below in order to enable Ghoti to locate external html or jpg files.

Step 3. Run "perl myName.gcf"
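
The comment rule from Step 1 can be illustrated with a short Python sketch. Only the "#" rule comes from this file; the function name is hypothetical, and the sketch makes no assumption about how the settings themselves are written.

```python
def strip_comments(config_text):
    """Return the non-empty lines of a .gcf file with '#' comments removed."""
    lines = []
    for raw in config_text.splitlines():
        # Everything after the first '#' on a line is a comment.
        setting = raw.split("#", 1)[0].strip()
        if setting:
            lines.append(setting)
    return lines
```

A line that starts with "#" disappears entirely, and a comment that follows a setting is removed, leaving just the setting.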

Linked html and jpg files

Ghoti is able to include links to your own html and jpg files, and will import them into its own directory structure. However, it requires certain rules to be followed.

The master location for all html files must be <masterDir>/<htmlRoot>. This is the location for any html reports that are referenced in the individual or marriage notes in the GEDCom files. <masterDir> and <htmlRoot> are defined in the *.gcf file.

The master location for small jpg files must be <masterDir>/<smallPicRoot>. These are small versions of pictures which are included in the family description. I recommend a maximum of 300 pixels along any edge. <masterDir> and <smallPicRoot> are defined in the *.gcf file.

The master location for larger jpg files must be <masterDir>/<bigPicRoot>. These are larger versions of pictures which are displayed if the user clicks on the small picture. I recommend a maximum of 1024 x 768 pixels to fit on an XGA monitor. <masterDir> and <bigPicRoot> are defined in the *.gcf file.

In the individual/marriage notes in the GEDCom file, the reference to the html files must be enclosed in curly brackets:- {text fileName.html}

where "text" is the text which is clickable in the generated html and "fileName.html" is the name of the html file (hopefully, pretty obvious). You may include sub-directories in the file name, for example myDir/fileName.html, but this directory structure must be duplicated in <masterDir>/<htmlRoot>.

In the GEDCom file, the link to the jpg files must be in the following format:- {fileName.jpg}

where "fileName.jpg" is the name of the jpg file. You may include sub-directories in the file name, for example myDir/fileName.jpg, but this directory structure must be the same as it is in <masterDir>/<smallPicRoot> and <masterDir>/<bigPicRoot>.
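
The two curly-bracket formats above could be recognised with something like the following Python sketch. It is an illustration, not Ghoti's Perl code. It assumes that file names contain no spaces, so that the clickable text is everything before the last space; the function name and the tuple layout are hypothetical.

```python
import re

# Anything between a pair of curly brackets, with no nesting.
LINK = re.compile(r"\{([^{}]*)\}")


def find_links(note):
    """Return (kind, text, file_name) tuples for each {...} reference in a note."""
    links = []
    for body in LINK.findall(note):
        # The file name is the last space-separated word; the rest is the text.
        text, _, file_name = body.strip().rpartition(" ")
        lower = file_name.lower()
        if lower.endswith(".html"):
            links.append(("html", text.strip(), file_name))
        elif lower.endswith(".jpg") and not text:
            links.append(("jpg", "", file_name))
    return links
```

A note containing {Farm history myDir/farm.html} would yield an html link with the clickable text "Farm history", and {myDir/farm.jpg} would yield a jpg link with no text.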

When Ghoti runs, the last operation it performs is to check the links in the generated html. It will copy any linked files into the <targetDir> if it can find them in <masterDir>; if it cannot find a file, it outputs a message to warn you.
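
The check-and-copy step could look roughly like this in Python. It is a sketch under assumptions, not Ghoti's Perl code: the function name, the wording of the warning and the returned list of missing files are all hypothetical.

```python
import shutil
from pathlib import Path


def import_linked_files(file_names, master_root, target_root):
    """Copy each linked file from master_root into target_root.

    Returns the names that could not be found, after printing a warning for each.
    """
    missing = []
    for name in file_names:
        source = Path(master_root) / name
        if source.is_file():
            destination = Path(target_root) / name
            # Recreate any sub-directories used in the link.
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, destination)
        else:
            print(f"Warning: linked file not found: {source}")
            missing.append(name)
    return missing
```

Keeping the sub-directory part of each name on both sides is what makes a link such as myDir/fileName.html still work after the copy.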


Future enhancements and Legal stuff

This is my "To Do list", in order of my priorities. It is my intention to implement the items as and when I get around to them.

GHOTI Gedcom Html Output Translation Interface.
Copyright (c) 2006 Christopher Robert Squires
All rights reserved.

(Sorry about the legal stuff, I just want to minimise the likelihood of commercial interests profiting from my work without some recognition.)

ANY commercial use of this code or the algorithms developed for GHOTI is prohibited without written permission. Especially the privacy filtering!

However, permission is hereby granted to private users of GHOTI to make copies and modifications to this code for their own personal use and to distribute such modified copies, provided that all three of the following conditions are met:

1 this copyright notice is retained on all modified or distributed copies,
2 the changes are recorded in the revision history of modified files,
3 and a copy of the modified code is forwarded to me.

If I consider that a modification is suitable for inclusion, I will add it to the master source, and add details and acknowledgements to the revision history.
