mattzag.com Blog

My blog of random thoughts, developments, and tech-goodness.

Installing Archangel, MacPorts, numpy, cfitsio, matplotlib, Xcode3, and pyfits on a Mac

This isn’t a “normal blog post,” but I’m putting it here for my future reference (and for anyone else going through the Archangel installation process).  Archangel is the last step in this setup, so the earlier steps should also be useful to anyone installing packages like numpy or pyfits on a Mac.

For background, Archangel is a utility for analyzing and extracting the photometric properties of extended objects in astronomical data.  For more information, visit the Archangel Homepage.

A note before we get started:  This guide is for installing these programs on a REMOTE Mac box; however, all of these commands still work if you are installing locally.

Strategy

Our goal is to install archangel, but there’s a ton of stuff to do first.  Archangel directly depends on 5 packages: python, numpy, pyfits, matplotlib, and cfitsio.  These are NOT trivial to install.  Things like numpy are an absolute mess because they rely on packages like BLAS and LAPACK, linear algebra routines that are about as easy to install as a greased pig is to catch.  So, we will use MacPorts to assist us.  MacPorts is essentially an “easy install” application for many unix utilities.  It automatically downloads, configures, and installs a ton of great software, along with any prerequisites for the software you want to install.  We will use it to install python and numpy (the two hardest things to install).  Note that MacPorts is VERY comprehensive: almost any large utility can be found in its database, ready to install.  Visit macports.org for more information.  I should also note that the default python that ships with Mac OS X is a chopped-down version of the real thing, so we need to make sure we install the real thing using MacPorts.

However, MacPorts is dependent on a few things as well.  First, it needs X11, which the Mac ships with, so we don’t have to worry about it.  Second, it depends on Xcode, the Apple software development suite.

Installation

XCode3

To install Xcode 3, visit the Apple Developer’s site here and register for a free developer’s account.  After registration, download Xcode 3 using the links towards the bottom of the page.  Make sure you download Xcode 3, not Xcode 4 (unless you want to pay for it).  Note: This is a .dmg disk image file that is about 4 GB in size, so be ready for a long download.  If you are installing it locally, just double click the .dmg file and off you go.  However, I’m installing it remotely, so I have to use the command line; here are the steps for that.  (This is for OS X 10.6.6; you need to make sure your Xcode download is compatible with your system.  Previous versions of Xcode can be found here under “Developer Tools”.  You will want Xcode 3.1.4 if you are running Leopard, or 3.2.6 for Snow Leopard.)

  • scp the downloaded file to the target system
  • ssh into the target system
  • Mount the .dmg disk image using the following (this assumes 3.2.6 is being installed; adjust the file and volume names if installing 3.1.4).
hdiutil attach ./xcode_3.2.6_and_ios_sdk_4.3.dmg
  • Next, run the installer on the .mpkg file inside of the mounted volume.  The “-target /” option tells installer to put the .mpkg contents on “/” (our main boot volume).
cd /Volumes/Xcode\ and\ iOS\ SDK
sudo installer -verbose -pkg ./Xcode\ and\ iOS\ SDK.mpkg -target /
  • Now that it is installed, unmount the disk image with
hdiutil detach /Volumes/Xcode\ and\ iOS\ SDK
  • You can delete the .dmg now as well.
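
If you’d rather script those steps than type them, here’s a minimal Python sketch of the same three commands.  The function names (xcode_install_commands, run_all) are made up for this example, and the volume and package names are simply the ones from the 3.2.6 image above, so adjust them for 3.1.4.  It defaults to a dry run that only prints what it would do.

```python
import subprocess


def xcode_install_commands(dmg_path,
                           volume="/Volumes/Xcode and iOS SDK",
                           package="Xcode and iOS SDK.mpkg"):
    """Build the mount / install / unmount commands as argument lists.

    The default volume and package names are assumptions taken from the
    Xcode 3.2.6 disk image; other Xcode versions use different names.
    """
    pkg_path = "%s/%s" % (volume, package)
    return [
        ["hdiutil", "attach", dmg_path],
        ["sudo", "installer", "-verbose", "-pkg", pkg_path, "-target", "/"],
        ["hdiutil", "detach", volume],
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for argv in commands:
        print(" ".join(argv))  # display only; spaces are not escaped here
        if not dry_run:
            subprocess.check_call(argv)  # raises on the first failing step


if __name__ == "__main__":
    run_all(xcode_install_commands("./xcode_3.2.6_and_ios_sdk_4.3.dmg"))
```

Because each command is passed as a list of arguments, the spaces in “Xcode and iOS SDK” need no backslash escaping.  Call run_all(..., dry_run=False) on the target machine once the printed commands look right.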

MacPorts

Next up is MacPorts.  It is fairly straightforward to install:

  • Go to a new temporary directory on the target machine
  • Get the tarball from macports.org (About 1 meg)
wget http://distfiles.macports.org/MacPorts/MacPorts-1.9.2.tar.gz
  • CRAP! wget doesn’t ship with the Mac!  We have to use MacPorts to install wget…so for now, download the URL above on your current computer and scp it to the temporary directory on the target machine (or just keep it locally if you are installing all of this on your local machine).  Alternatively, curl does ship with OS X, so running “curl -O” with the same URL on the target machine also works.
  • Extract it and install it:
tar -zxvf MacPorts-1.9.2.tar.gz
cd MacPorts-1.9.2
./configure
make
sudo make install
cd ../
rm -rf MacPorts*
  • Ta-da! All done installing, now we can get down to business.  We need to tell MacPorts to synchronize with the MacPorts server, so issue the following command:
sudo port -v selfupdate
  • This will update MacPorts with all of the most recent repository changes.  As a little tutorial, to install something with MacPorts, you simply enter “sudo port install <package name>”.  Note: you should run the selfupdate command on a regular basis to keep everything up to date.  You can “man port” for more information after you install MacPorts.
  • Note: If your system doesn’t recognize the “port” command, you need to add /opt/local/bin to your PATH (add the following line to your .cshrc file):
setenv PATH "/opt/local/bin:$PATH"
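
Not sure whether /opt/local/bin is already on your PATH?  Here’s a small Python sketch that checks.  The helper name dir_on_path is my own invention, and it only does a plain string comparison of PATH entries (no symlink resolution), so treat it as a quick sanity check rather than anything official.

```python
import os


def dir_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries of a PATH-style string.

    If path_value is None, the current process's PATH is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalize entries so that a trailing slash does not matter.
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normpath(directory) in entries


if __name__ == "__main__":
    if dir_on_path("/opt/local/bin"):
        print("/opt/local/bin is on PATH")
    else:
        print("/opt/local/bin is missing from PATH; add the setenv line to .cshrc")
```

Remember that a new PATH only shows up in shells started after you edit .cshrc (or after you source it).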

Python, Numpy, Scipy, matplotlib (2.6), & cfitsio (THANK GOD FOR MACPORTS)

Start by installing Python 2.6.  We do this separately from the rest because we want to set it as our default python version before proceeding.  In short, Apple uses python a lot when booting up the computer, so the python that ships with OS X has a whole bunch of stuff taken out to make the boot as fast as possible.  Unfortunately, this means half of python is missing!

sudo port -v install python26 python_select

Give it time, quit being impatient! (that’s a note to myself)  Now, set our python install as the default for our system:

sudo port select --set python python26

Note that doing “python_select -l” will list all of the python installs to choose from on your machine (“port select --list python” does the same thing through the port command).
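
To double-check which python you actually get after switching, something like the following sketch can help.  The function name describe_interpreter is hypothetical, and the “from_macports” test just assumes MacPorts lives under its default /opt/local prefix.

```python
import sys


def describe_interpreter(executable=None, version_info=None):
    """Summarize an interpreter: its path, major.minor version, and whether
    the path sits under /opt/local (assumed to be the MacPorts prefix)."""
    executable = executable or sys.executable or ""
    version_info = version_info or sys.version_info
    return {
        "executable": executable,
        "version": "%d.%d" % (version_info[0], version_info[1]),
        "from_macports": executable.startswith("/opt/local/"),
    }


if __name__ == "__main__":
    info = describe_interpreter()
    print("python %s at %s" % (info["version"], info["executable"]))
    if not info["from_macports"]:
        print("this does not look like the MacPorts python")
```

Run it with plain “python” in a NEW terminal; if it reports something outside /opt/local, the select step (or your PATH) didn’t take.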

Now, we’ll install numpy, scipy and all of their dependencies all at once!  Scipy is not necessary, but it’s extremely useful for other things, so I’m installing it as well.  Scipy depends on numpy, which depends on python…so if we tell MacPorts to install Scipy for python 2.6, everything will be installed!  (also there are a ton of dependencies here I’m not going into.  When you run the command it will tell you all the stuff it is going to install in addition to scipy, numpy and python…you can see it is a HUGE list of dependencies.  Thank goodness we don’t have to install all of these by hand!).  We also throw matplotlib and cfitsio into this command to get them taken care of as well (and wget, I can’t live without wget).

sudo port -v install py26-scipy py26-matplotlib cfitsio wget

Warning: This will take FOREVER…hours on a decent machine (took one of the ones I did this on over 5 hours).  There are so many libraries to install and BLAS, LAPACK, numpy, and scipy alone take forever to configure and install.  Go get a delicious beverage, eat a donut, or better yet, sleep…it might be done by morning if you’re lucky.

After all that work, we need to test it.  Fire up a new terminal and try importing each of the libraries.  If there’s no error, everything went smoothly :)

% python
Python 2.6.6
>>> import numpy
>>> import scipy
>>> import matplotlib
>>> exit()
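
If you end up doing this on several machines, the same test can be wrapped in a tiny script.  This is just a sketch: check_imports is a name I made up, and it reports “unknown” for any module that doesn’t expose a __version__ attribute.

```python
def check_imports(module_names):
    """Try to import each module by name.

    Returns a dict mapping each name to its version string ("unknown" if the
    module has no __version__), or to None if the import fails.
    """
    results = {}
    for name in module_names:
        try:
            module = __import__(name)
        except ImportError:
            results[name] = None
        else:
            results[name] = getattr(module, "__version__", "unknown")
    return results


if __name__ == "__main__":
    report = check_imports(["numpy", "scipy", "matplotlib"])
    for name, version in sorted(report.items()):
        print("%-12s %s" % (name, version if version is not None else "MISSING"))
```

Anything listed as MISSING means the corresponding port didn’t install properly for the python you are running.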

Pyfits

Pyfits is maintained by STScI and is a simple ‘normal python install’:

wget http://www.stsci.edu/resources/software_hardware/pyfits/pyfits-2.4.0.tar.gz
tar -zxvf pyfits-2.4.0.tar.gz
cd pyfits-2.4.0
sudo python setup.py install

It’s always good to test the install so, in a DIFFERENT DIRECTORY (never ever ever ever run a python package that you just installed from the install directory) do:

% python
Python 2.6.6
>>> import pyfits
>>> exit()

If there’s no error, you’re good!

Archangel

FINALLY, we’re ready to install archangel.  It has been a long road, but if all went smoothly, this should be SIMPLE.  Go into a NON-temporary directory (the directory where you want to store the archangel utility subdirectory; I used a /archangel root directory so everyone would have access to it) and issue:

mkdir /archangel
cd /archangel
sudo wget http://abyss.uoregon.edu/~js/archangel/archangel_v2.5.tar.gz
tar -zxvf archangel_v2.5.tar.gz
mv archangel/* .

This next part is important…you choose.  If you are just setting archangel up for one account, you can run the following line of code NOT as sudo and choose where you want to put the binaries for yourself.  Then you will need to modify your .cshrc file to reflect where these binaries are.  Here’s how:

python setup.py build
(choose binary directory)

Next, view the readme file in that directory to see how to modify your .cshrc file.

For my install, I want everyone to be able to use it, so I run it as sudo:

sudo python setup.py build
Binaries going to -> *location* Ok? /usr/local/bin

And it should put everything into my system path at /usr/local/bin.   /usr/local/bin should be in everyone’s default path so there isn’t a need to do that .cshrc stuff BUT you do need to do one .cshrc thing:

setenv ARCHANGEL_HOME /archangel

Note what I did here.  I made the archangel_home directory a new directory at the / level.  This is so that all users can access it.  EVERYONE who is going to use archangel then needs to put that in their .cshrc file.
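
Since every user has to get this right, here’s a quick sketch each of them can run to confirm the variable is set.  The function name check_archangel_home is hypothetical, and all it verifies is that ARCHANGEL_HOME is set and points at an existing directory, not that archangel itself is installed correctly in there.

```python
import os


def check_archangel_home(environ=None):
    """Return (ok, message) describing whether ARCHANGEL_HOME is usable.

    `environ` defaults to the current process environment; pass a plain dict
    to check other values.
    """
    if environ is None:
        environ = os.environ
    home = environ.get("ARCHANGEL_HOME")
    if not home:
        return False, "ARCHANGEL_HOME is not set; add the setenv line to .cshrc"
    if not os.path.isdir(home):
        return False, "ARCHANGEL_HOME points to %s, which is not a directory" % home
    return True, "ARCHANGEL_HOME = %s" % home


if __name__ == "__main__":
    ok, message = check_archangel_home()
    print(message)
```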

Test it!  Source the .cshrc file (I would actually recommend totally logging out and back in), go to your home directory (or wherever you want to store your archangel outputs) and run:

cp /archangel/examples/ic5271_j.fits .
profile ic5271_j.fits

If all goes well, you have successfully installed Archangel :) .  Now go read the documentation!

Many, many updates

Over the next couple weeks I’ll cover exactly what has been going on with all of these web hosting issues.  For now, I’ll just tell you the current setup.  If you are reading this, the new server is up and running!  I’ve elected to go with a VPS (virtual private server) for lots of reasons.  Stability, comfort, expandability, control.  Pretty much everything that a standard “website host” doesn’t have.  The beauty of VPS’s is the freedom and control.  You are quite literally running an entire server machine remotely.  I have root access to the box, can install any OS I want, and choose how I want to run my server in gory detail.  The only limitations are the amount of RAM, hard drive space, and bandwidth allotted to my virtual machine (which can be increased for fees of course).

In short, VPS’s arose along with the idea of the “cloud” a while ago.  The idea is that a huge datacenter can split up its resources into many small chunks, and each of these chunks can operate as a full machine.  This is extremely attractive because now I’m not just limited to building a website on some silly PHP-only server.  I can run tasks in the background, host an SVN server, run a Minecraft server, just crunch numbers, or do anything else you can think of in a VPS.  It’s quite sexy and a change I’ve been wanting to implement for quite some time.

WAIT A MINUTE!  “Matt, you have a server sitting right next to your desk in your office.  It has a public IP address and is a real machine that you have full control over.  Why don’t you just use that???”  It’s true, I definitely could and I would have 16x the RAM of my VPS, 10x the storage space, and an infinite amount of bandwidth.  But I have a few reasons why I don’t turn my office server into a public web server.

  1. Down-time.  The nice thing about a VPS is that the responsibility is on Linode (my VPS company of choice)!  I don’t have to worry about downtime, broken disks, slow network (not a problem at all but just for completeness…), etc.  If my computer shuts down because the power goes out at the IfA in the middle of the night, my website is down!  If the power goes out in my huge datacenter in Dallas, TX, they have tons of backup power and people on standby to fix everything fast.
  2. Research machines should NEVER be web servers.  If I’m poring over GB’s of data, I don’t want a web server running in the background slowing things down.  Also, my machine is very secure even when exposed to the interwebz, but it’s even more secure when I’m not running any services like mail, web server, SVN server, etc.  Remember, every open port is a door and if someone has the key…
  3. Legal.  Quite simply, everything I do that uses ANY university resource is considered to be co-owned by the University of Hawaii.  This can get me into a sticky legal mess if I release software and the like.  Even with open source software, UH COULD (I doubt they would) seize it and decide to charge for it.  Yes, it’s a pain in the ass but their thing is if I’m using their resources, they get a fair share…which makes sense.  So, hosting everything non-research related off-site ensures that I can’t get into any trouble with the folks here.
  4. Life.  It’s very possible that I’ll be moving around and traveling tons in the next few years…it makes no sense not to go with a remote service for this reason alone.

A few more little items.  I’m in the process of completely re-doing mattzag.com, so for now this blog is the home page.  I’ve got a few software projects in the works, so expect to see lots of tech-posts on here.  This blog will be totally random; I have no idea how this is going to go.  In the coming weeks I’ll go over all of my server stuff, but I think I should end this post for now.  I’m designing a blog layout as well, but in the meantime I figured I’d promote a fellow designer by using one of his templates, Manifest by Jim Barraud.

Maintenance Log

Here’s the log from the server transfer maintenance…

Update #1 – lighttpd, php, mysql, iptables all installed successfully.
Update #2 – Old server fully backed up, going to bed, hopefully DNS will be ready when I wake up.
Update #3 – DNS fail. The old server is finally recognizing the DNS requests, so as of right now I have it set up such that mattzag.com redirects to the IP address of this page (my new server). It could take as long as 48 hours for my old domain registrar to approve my transfer request. In the meantime, 12 days until I go to Korea :)
Update #4 – I am surprised by the amount of traffic mattzag.com got over the past few months…there was no content and it received something like 250 hits per day! Why folks?! When I release the newest version, there will be plenty of content, although that may take some time. My current plan is that once all of the DNS stuff is done, my blog will become the main page of the site until the rest is fully developed.
Update #5 – Blog is all ready to go, just waiting on DNS. Performing some MySQL optimization and setting up a mail server as well, final configuration will have to wait until DNS is up.
Update #6 – For now I’m just going to assume my old domain registrar won’t be releasing my domain name until Monday; if so, that is when mattzag.com will return in its blogger-glory.
Update #7 – My old registrar just released mattzag.com and it is now registered with the new registrar. Name servers have been changed, but it will take a bit of time for these to propagate through the interwebz. This could be as fast as 2 hours, or as slow as 48 hours, we shall see. 11 days, 2 hours until Korea.