Thursday, October 10, 2019

LUC server back online(?)

Well, weirdly, my original old server seems to be back online. I'm going to continue this rebuilding process, especially because I was about to get into PHP, which is where my memories of what actually worked and what didn't become extremely fuzzy. But at least this means I have a viable version ready to go for DHCS in a few weeks!

Wednesday, October 9, 2019

H2B SOTU-db Step 3: Apache

This series of posts documents the process of getting SOTU-db back online after losing access to the LUC servers upon which it was originally hosted.

Now that I've got my VM provisioned and running R, the next step is to turn it into a web server by installing Apache. The goal for the completion of this step will be to visit www.sotu-db.com and have it display a basic HTML page, which I'll use to post a temporary notice that SOTU-db is offline for now.

Before I do that, I'm going to activate UFW, the Uncomplicated Firewall, by following the steps in this Digital Ocean Ubuntu setup guide (starting with Step Seven). It's pretty... uncomplicated, so I won't go into detail here. Basically, I'm just allowing OpenSSH connections from anywhere for now; then, after installing Apache, we'll give it permissions as well. It looks like activating UFW was successful, so I'll move on.
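For reference, the UFW activation boils down to a couple of commands (a sketch; this assumes the ufw package is already present, as it is on stock Ubuntu 16.04):

```shell
sudo ufw allow OpenSSH   # keep SSH reachable BEFORE turning the firewall on
sudo ufw enable          # activate UFW (prompts for confirmation)
sudo ufw status verbose  # confirm OpenSSH is on the allow-list
```

Allowing OpenSSH first matters: enabling UFW with no rules on a remote VM can lock you out of your own SSH session.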

Fortunately, Apache is extremely popular and is easy to access from the standard Ubuntu package repositories. Installation should be as simple as:
sudo apt update
sudo apt install apache2
And indeed, the installation process went very smoothly. However, I was frustrated to see that I couldn't seem to access the "It works!" page that was supposed to be generated as part of the install. The file is in /var/www/html on the VM like it's meant to be, and I added "Apache Full" to the UFW allow-list. After some running around in circles, I eventually realized that I had left blank two checkboxes in the Google Compute Engine dashboard for my instance: under a small section called "Firewalls," I had left "allow HTTP traffic" and "allow HTTPS traffic" unchecked. So, I don't think my requests were ever even reaching my VM. Once I checked those boxes and restarted the VM, I had no problem accessing the "It works!" page.
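For anyone following along, the server-side half of that troubleshooting amounts to opening the firewall to Apache and sanity-checking locally (the status and curl checks are my own additions, not part of the install guide):

```shell
sudo ufw allow "Apache Full"   # opens ports 80 (HTTP) and 443 (HTTPS) in UFW
sudo systemctl status apache2  # make sure the service is actually running
curl -I http://localhost/      # a local request bypasses any cloud-side firewall
```

The curl check is handy precisely because of problems like this one: if it returns headers locally but the page is unreachable from outside, the blocker is upstream of the VM (here, the GCE firewall checkboxes), not Apache.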

Next, I opened up Atom and wrote a quick new index.html file, letting users know that SOTU-db is down and directing them to the dev blog.
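Something along these lines gets a placeholder page in place (my actual page differs; the wording and markup here are just illustrative):

```shell
# Write a minimal placeholder page locally, then copy it into Apache's web root.
cat > index.html <<'EOF'
<!DOCTYPE html>
<html>
<head><title>SOTU-db</title></head>
<body>
  <h1>SOTU-db is temporarily offline</h1>
  <p>Follow progress on the dev blog while the site is rebuilt.</p>
</body>
</html>
EOF
# On the VM (needs root): sudo cp index.html /var/www/html/index.html
```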

I also promoted the VM's public IP address from an ephemeral to a static IP, then went into my Google Domains dashboard and added an A record linking the domain to that IP. This way, www.sotu-db.com is connected to my VM and, unlike with the old LUC VM, should allow "sotu-db" to remain in the address bar as long as a user is on the site (instead of being redirected to sotu-db.cs.luc.edu like previously).
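A quick way to confirm the A record has taken effect (dig ships in the dnsutils package on Ubuntu; DNS changes can take a while to propagate):

```shell
dig +short www.sotu-db.com   # should eventually print the VM's static IP
```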

So, we're now up and running with
1. a VM
2. R installed
3. a static page served to visitors to www.sotu-db.com. Try it out!

Next steps will be to begin migrating the project data itself onto the server!

H2B SOTU-db Step 2: Installing R

This series of posts documents the process of getting SOTU-db back online after losing access to the LUC servers upon which it was originally hosted.

After getting my Ubuntu virtual machine up and running, I'll need to install some software onto the machine that is not contained in the SOTU-db project repo. The two main programs I'll need are the web server Apache and the statistical analysis package R. I've decided to install R first.

Installing R

Installing R is pretty straightforward once you know the fundamentals of installing software in Ubuntu. The command line installation guide provided by CRAN (the organization that distributes R) is pretty helpful. I'll walk through it here:
STEP 1: Add the CRAN repositories to sources.list
Basically, this step tells Ubuntu about the remote servers where it can find R. So, first I'll open an SSH shell into my VM through the Google Compute Engine dashboard. Then, I'll navigate over to the folder I want with cd /etc/apt/. Now comes the tricky part... which text editor do I use? How long until I get stuck or accidentally erase the whole file? Let's try nano... sudo nano ./sources.list. That works... the Compute Engine console even allows me to paste from the Windows clipboard with CTRL+SHIFT+V. So, I'll just plop in the source listing that corresponds to my version of Ubuntu:
deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/
and save with CTRL+O (or "WRITE OUT," just to remind me why I hate Linux text editors), then CTRL+X to exit, and we should be good on step 1!
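As an aside, the whole nano detour can be skipped: appending the same xenial-cran35 entry non-interactively does the identical thing (tee runs under sudo because /etc/apt/sources.list is root-owned):

```shell
echo "deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/" \
  | sudo tee -a /etc/apt/sources.list
```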

STEP 2: Install the "Complete R System"
This step is really simple:

sudo apt-get update
sudo apt-get install r-base

STEP 3: Go back a step, add security keys and another repo

This is where the R setup process always drives me crazy, and is an issue with countless Linux installation guides: running the apt-get update command from Step 2, above, results in a warning that the public key for the CRAN repo that we added in Step 1 is not available. If you scroll down the R installation guide, it tells you how to address this, and also mentions that "Installation and compilation of R or some of its packages may require Ubuntu packages from the “backports” repositories..." So:


STEP 3A: Add the Backports repo

I'm adding this to the same sources.list file as before, which now has two entries along these lines (the backports entry shown is the standard one for Ubuntu 16.04 "xenial"):

deb https://cloud.r-project.org/bin/linux/ubuntu xenial-cran35/
deb http://archive.ubuntu.com/ubuntu xenial-backports main restricted universe

STEP 3B: Add public keys for R packages
Keep this step simple by just running
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
Now we can go back to Step 2 and run those commands again. I'm actually still getting a couple of warnings on the update, but they seem ignorable (such as backports not having a "Release" file, which seems to make sense to me). And the r-base install ran without any errors.

STEP 4: Install r-base-dev
Finally, we're going to install some additional developer packages. The guide says "Users who need to compile R packages from source [e.g. package maintainers, or anyone installing packages with install.packages()] should also install the r-base-dev package," and that's us, so here we go. This should be a simple matter of running
sudo apt-get install r-base-dev
Hm, interestingly, it's telling me I already have the newest version of r-base-dev installed. I wonder if adding in the backports repo before installing r-base had something to do with it?

STEP 5: Profit
Now I can run R (not r) and see that the program runs successfully. Next stop, Apache!
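For completeness, a quick non-interactive sanity check works too (this assumes r-base is installed, as above; Rscript is R's scripting front end):

```shell
R --version | head -n 1        # prints the installed R version string
Rscript -e 'cat(2 + 2, "\n")'  # evaluates a one-liner without the interactive prompt
```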

How to build SOTU-db: Step 1

This series of posts documents the process of getting SOTU-db back online after losing access to the LUC servers upon which it was originally hosted.

Step 1: Get a VM

The first step is actually procuring a server upon which to run SOTU-db. I looked at Azure and Google and decided on Google, partially because they have a simpler interface and a good amount of credits for new users. But I also wanted to see about using Firebase more, and figured that integration would be easier if I used Google for the VM (they call this service the Google Compute Engine). 

I decided to start low in specs, figuring I could scale up if needed, so right now I just have a single VM running Ubuntu 16.04 with 1 vCPU and 3.75GB memory. I assume I'll need to scale that up, but am hoping I can do that after getting more of the platform set up properly. Since initializing the VM, the only steps I have taken have been:
  • run apt update & upgrade
  • point my www.sotu-db.com domain to the VM's public address, which I should probably undo and point to this blog until at least getting the web server running
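The first of those steps is just the usual pair (the -y flag on upgrade is my habit for non-interactive runs, not required):

```shell
sudo apt update       # refresh the package lists
sudo apt upgrade -y   # apply any pending upgrades
```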
Step 2, coming up, will be to install either Apache or R; I'll probably glance through the documentation of each to see if doing those in a particular order will be helpful, but I don't think it will really make a difference. 

Tuesday, October 8, 2019

Fall 2019 Update

Good news: I've been invited to present SOTU-db at the 2019 Chicago Colloquium for Digital Humanities and Computer Science!

Bad news: SOTU-db is offline, as our host servers at Loyola seem to have been taken offline.

This means I have about one month to get SOTU-db back up and running. Ideally, I would also like to ask and address another contemporary research question, but that might not be realistic given the timeframe.

So, the plan going forward is going to be to essentially rebuild SOTU-db's web presence on a new VM, document the process here in blog posts, and get everything up and running again by November 9. Stay tuned!