Thursday, December 6, 2012

Verifying the MongoDB DataStore with the Rails Console

UPDATE: The broker data model has switched to using the Mongoid ODM rubygem.  This significantly improves the consistency of the broker data model and simplifies coding broker objects.  It also obsoletes this post.

See the new one on Verifying the MongoDB DataStore with the Rails Console: Mongoid Edition


In the last post I showed how I'd verify the configuration of the OpenShift Bind DNS plugin using the Rails console.  In this one I'll do the same thing for the DataStore back end service (not strictly a plugin, but hey...).

DataStore Configuration


Right now the DataStore back end service is not pluggable.  The only back end service available is MongoDB. I've posted previously on how to prepare a MongoDB service for OpenShift.  Now I'm going to work it from the other side and demonstrate that the communications are working.

Since the DataStore isn't pluggable, it isn't configured from the /etc/openshift/plugins.d directory.  Rather it has its own section in /etc/openshift/broker.conf (or broker-dev.conf).

This is just the relevant fragment from broker.conf:

...
#Broker datastore configuration
MONGO_REPLICA_SETS=false
# Replica set example: "<host-1>:<port-1> <host-2>:<port-2> ..."
MONGO_HOST_PORT="data1.example.com:27017"
MONGO_USER="openshift"
MONGO_PASSWORD="dbsecret"
MONGO_DB="openshift"
...

These are the values that will be used when the broker application creates an OpenShift::DataStore (and OpenShift::MongoDataStore) object.
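
To make the mapping from the file to the object concrete, here is a minimal sketch of reading shell-style key/value lines and splitting MONGO_HOST_PORT into the host and port pair that shows up later as @host_port. This is not the broker's own loader; parse_conf and host_port are hypothetical helper names I made up for the illustration.

```ruby
# Minimal sketch: parse shell-style KEY=value lines like those in broker.conf.
# NOT the broker's real loader; parse_conf and host_port are hypothetical.
def parse_conf(text)
  conf = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    key, value = line.split("=", 2)
    next if value.nil?                 # skip lines without an assignment
    # Drop one pair of surrounding double quotes, if present.
    conf[key.strip] = value.strip.sub(/\A"(.*)"\z/, '\1')
  end
  conf
end

# Split "host:port" into ["host", port_number].
def host_port(value)
  host, port = value.split(":", 2)
  [host, Integer(port)]
end

sample = <<~CONF
  #Broker datastore configuration
  MONGO_REPLICA_SETS=false
  MONGO_HOST_PORT="data1.example.com:27017"
  MONGO_USER="openshift"
  MONGO_DB="openshift"
CONF

conf = parse_conf(sample)
p host_port(conf["MONGO_HOST_PORT"])   # ["data1.example.com", 27017]
p conf["MONGO_USER"]                   # "openshift"
```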

The DataStore: Abstract and Implementation

At one point the OpenShift::DataStore was intended to be pluggable.  Later the concept of an abstracted interface was dropped and the tightly bound MongoDB interface was allowed to grow organically. The remains of the original pluggable interface are still there.  Both source files now live in the openshift-origin-controller rubygem package.


The OpenShift::DataStore class still follows the plugging conventions.  It implements the provider=() and instance() methods.  The first takes a reference to a class that "implements the datastore interface" and the second provides an instance of the implementation class all pre-configured from the configuration file.

Observing MongoDB


Unlike named, MongoDB writes to its own log by default.  The logs reside in /var/log/mongodb/mongodb.log (as controlled by the logpath setting in /etc/mongodb.conf). Verbose logging is controlled in mongodb.conf as well.  For this demonstration I'm going to enable it by uncommenting the corresponding line in /etc/mongodb.conf and restarting the mongod service.

MongoDB also has a command line tool that can be used to interact with the database.  The CLI tool is called mongo. I can invoke it like this:

mongo --username openshift --password dbsecret data1.example.com/openshift
MongoDB shell version: 2.0.2
connecting to: data1.example.com/openshift
> show collections;
system.indexes
system.users

This shows an initialized database, but no OpenShift data has been stored yet. The two existing collections are the system collections.  OpenShift will add collections as needed to store data.

With these two mechanisms I can observe and verify access and updates from the broker to the database through the OpenShift::DataStore object.

Creating an OpenShift::DataStore Object


I'm going to create the OpenShift::DataStore object in the same way I did with the OpenShift::DnsService object.  I call the instance() method on the OpenShift::DataStore class.

cd /var/www/openshift/broker
rails console
Loading production environment (Rails 3.0.13)
irb(main):001:0> store = OpenShift::DataStore.instance
=> #<OpenShift::MongoDataStore:0x7f9e42ed4918
 @host_port=["data1.example.com", 27017], @db="openshift", @user="openshift",
 @replica_set=false, @password="dbsecret",
 @collections={:application_template=>"template", :user=>"user", :district=>"district"}>
irb(main):002:0>

Now I have a variable named store which contains a reference to an OpenShift::MongoDataStore object. I can see from the instance variables that it is configured for the right host, port, database, user etc.

Checking Communications: Read


Now that I have something to work with, it's time to see if it will talk to the database.

The interface to the DataStore is much more complex than the DnsService interface is.  Since we're only checking connectivity that's not a problem.  Once I've checked connectivity, I can craft more checks of the DataStore methods themselves later.

The DataStore has a couple of methods that expose the Mongo::DB class that's underneath.  With that I can force a query for the list of collections currently available in the database. If the broker service has not yet been run and users and applications created then only the system collections will exist.  In the example below there are only two collections.


rails console
Loading production environment (Rails 3.0.13)
irb(main):001:0> store = OpenShift::DataStore.instance
=> #<OpenShift::MongoDataStore:0x7f1ac50ac698
 @host_port=["data1.example.com", 27017], @db="openshift",
 @user="openshift", @replica_set=false, @password="dbsecret",
 @collections={:application_template=>"template", :user=>"user", :district=>"district"}>
irb(main):002:0> collections = store.db.collections
=> [#<Mongo::Collection:0x7f1ac5096ac8 @cache_time=300, 
...
 @pk_factory=BSON::ObjectId>]
irb(main):003:0> collections.size
=> 2
irb(main):004:0> collections[0].name
=> "system.users"
irb(main):005:0> collections[1].name
=> "system.indexes"

On the MongoDB host I can confirm that there are indeed two collections.

mongo --username openshift --password dbsecret data1.example.com/openshift
MongoDB shell version: 2.0.2
connecting to: data1.example.com/openshift
> show collections
system.indexes
system.users

Finally I can check that the broker app really did issue that query and get a response:


...
Thu Dec  6 14:06:29 [conn2] Accessing: openshift for the first time
Thu Dec  6 14:06:29 [conn2]  authenticate: { authenticate: 1, user: "openshift",
 nonce: "d2083e4185cb7d22", key: "c7c3628fe64eb1aedaaf4c87a4d5e723" }
Thu Dec  6 14:06:29 [conn2] command openshift.$cmd command: { authenticate: 1, u
ser: "openshift", nonce: "d2083e4185cb7d22", key: "c7c3628fe64eb1aedaaf4c87a4d5e
723" } ntoreturn:1 reslen:37 5ms
Thu Dec  6 14:06:29 [conn2] query openshift.system.namespaces nreturned:3 reslen
:142 0ms
...
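
On a busy server the interesting entries are buried among other connections. As a rough sketch (the helper names are hypothetical and nothing here comes from MongoDB itself), a few lines of Ruby can pick out the entries for one connection tag and name the operation on each line. It does not rejoin entries that wrapped onto a second line.

```ruby
# Hypothetical helpers for picking entries out of a mongod log.
# Lines that wrapped onto a continuation line are not rejoined here.

# Keep only the lines tagged with one connection, e.g. "conn2".
def lines_for_connection(lines, conn)
  tag = "[#{conn}]"
  lines.select { |line| line.include?(tag) }
end

# Return the first word after the connection tag ("query", "insert", ...).
def operation(line)
  match = line.match(/\[conn\d+\]\s+(\w+)/)
  match ? match[1] : nil
end

log = [
  "Thu Dec  6 14:06:29 [conn2] Accessing: openshift for the first time",
  "Thu Dec  6 14:06:29 [conn2] query openshift.system.namespaces nreturned:3 reslen",
  "Thu Dec  6 14:37:13 [conn4] create collection openshift.testcollection { create:"
]

lines_for_connection(log, "conn2").each { |line| puts operation(line) }
```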

Checking Communications: Write


Now that I'm convinced that I'm connecting to the right database and I'm able to make queries, the next check is to be sure I can write to it when needed.

Since the database has not yet been used, it's empty.  I want to be careful regardless not to mess with any real OpenShift collections. I'll create a test collection, write a record to it, read it back and drop the collection again.  If I do this in a consistent way I can use this test at any time to check connectivity without danger to the service data.

 I'm using the ruby Mongo classes underneath the OpenShift::MongoDataStore class, so I'll have to look there for the syntax. The Mongo::DB class has a create_collection() method which will do the trick.  I'll issue the command in the rails console, then check the MongoDB logs and view the list of collections using the mongo CLI tool.

Create a Collection


First, the create query (entered into an existing rails console session):

irb(main):005:0> store.db.create_collection "testcollection"
=> #<Mongo::Collection:0x7f76567e9ae0 @cache_time=300,...
...
 @name="testcollection", @logger=nil, @pk_factory=BSON::ObjectId>
irb(main):006:0>

Next I'll check logs:

...
Thu Dec  6 14:37:13 [conn4] run command openshift.$cmd { authenticate: 1, user: 
"openshift", nonce: "665cedb4baf82b0d", key: "eec7b08761151c858c14058c2629dee6" 
}
Thu Dec  6 14:37:13 [conn4]  authenticate: { authenticate: 1, user: "openshift",
 nonce: "665cedb4baf82b0d", key: "eec7b08761151c858c14058c2629dee6" }
Thu Dec  6 14:37:13 [conn4] command openshift.$cmd command: { authenticate: 1, u
ser: "openshift", nonce: "665cedb4baf82b0d", key: "eec7b08761151c858c14058c2629d
ee6" } ntoreturn:1 reslen:37 0ms
Thu Dec  6 14:37:13 [conn4] query openshift.system.namespaces nreturned:3 reslen
:142 0ms
Thu Dec  6 14:37:13 [conn4] run command openshift.$cmd { create: "testcollection
" }
Thu Dec  6 14:37:13 [conn4] create collection openshift.testcollection { create:
 "testcollection" }
Thu Dec  6 14:37:13 [conn4] New namespace: openshift.testcollection
Thu Dec  6 14:37:13 [conn4] adding _id index for collection openshift.testcollec
tion
Thu Dec  6 14:37:13 [conn4] build index openshift.testcollection { _id: 1 }
Thu Dec  6 14:37:13 [conn4] external sort root: /var/lib/mongodb/_tmp/esort.1354
804633.1660751058/
Thu Dec  6 14:37:13 [conn4]   external sort used : 0 files  in 0 secs
Thu Dec  6 14:37:13 [conn4] New namespace: openshift.testcollection.$_id_
Thu Dec  6 14:37:13 [conn4]   done building bottom layer, going to commit
Thu Dec  6 14:37:13 [conn4]   fastBuildIndex dupsToDrop:0
Thu Dec  6 14:37:13 [conn4] build index done 0 records 0.001 secs
Thu Dec  6 14:37:13 [conn4] command openshift.$cmd command: { create: "testcolle
ction" } ntoreturn:1 reslen:37 1ms
...

Finally I'll connect and query the database locally to check for the presence of the new collection.

mongo --username openshift --password dbsecret data1.example.com/openshift
MongoDB shell version: 2.0.2
connecting to: data1.example.com/openshift
> show collections
system.indexes
system.users
testcollection

This is really enough to demonstrate that the MongoDataStore object is properly configured and has the ability to read and write the database.  Just for completeness I'll go one step further and create a document.

Add a Document to the testcollection


Since the testcollection is the most recently added, it should be the last one in the collections list in the Rails console Mongo::DB object.  I can check by looking at the name attribute of that collection:

irb(main):007:0> store.db.collections[2].name
=> "testcollection"

Now that I know I have the right one, I can add a document to it using the Mongo::Collection insert() method:

irb(main):008:0> store.db.collections[2].insert({'testdoc' => {'testkey' => 'testvalue'}})
=> BSON::ObjectId('50c0c2016892df2d56000001')

The logs show the insert like this:

Thu Dec  6 16:04:36 [conn11] run command openshift.$cmd { getnonce: 1 }
Thu Dec  6 16:04:36 [conn11] command openshift.$cmd command: { getnonce: 1 } nto
return:1 reslen:65 0ms
Thu Dec  6 16:04:36 [conn11] run command openshift.$cmd { authenticate: 1, user:
 "openshift", nonce: "19ca25a92ca483ee", key: "f7ac36d2e36a3a00a91d234a59a559e3"
 }
Thu Dec  6 16:04:36 [conn11]  authenticate: { authenticate: 1, user: "openshift"
, nonce: "19ca25a92ca483ee", key: "f7ac36d2e36a3a00a91d234a59a559e3" }
Thu Dec  6 16:04:36 [conn11] command openshift.$cmd command: { authenticate: 1, 
user: "openshift", nonce: "19ca25a92ca483ee", key: "f7ac36d2e36a3a00a91d234a59a5
59e3" } ntoreturn:1 reslen:37 0ms
Thu Dec  6 16:04:36 [conn11] query openshift.system.namespaces nreturned:5 resle
n:269 0ms
Thu Dec  6 16:04:36 [conn11] insert openshift.testcollection 0ms

And a quick CLI query to confirm that the document has been created:

mongo --username openshift --password dbsecret data1.example.com/openshift
MongoDB shell version: 2.0.2
connecting to: data1.example.com/openshift
> db.testcollection.find()
{ "_id" : ObjectId("50c0c2016892df2d56000001"), "testdoc" : { "testkey" : "testvalue" } }

Read a Document from the testcollection


In traditional database style, when you make a query, you don't get back the single thing you asked for.  You get a Mongo::Cursor object which collects all of the documents which match your query.  Cursors respond to a next() method which does what you would think, returning each match in turn and nil when all documents have been retrieved.  The Mongo::Cursor also has a method, to_a(), which converts the entire response into an array. I'll use that to get just the one I want.

irb(main):035:0> store.db.collections[2].find.to_a[0]
=> #<BSON::OrderedHash:0x3fbb2b27fea4
 {"_id"=>BSON::ObjectId('50c0c2016892df2d56000001'),
 "testdoc"=>#<BSON::OrderedHash:0x3fbb2b27fd50 {"testkey"=>"testvalue"}>}>

I won't take up space showing the log entries for this query.  I know how to find them now if there's a problem.
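
If the cursor behavior described above is unfamiliar, this toy class shows the two access styles side by side. It is only a stand-in I wrote for illustration, not Mongo::Cursor, and it knows nothing about queries: it just hands out the documents it was given.

```ruby
# Stand-in for illustration only; this is NOT Mongo::Cursor.
class SketchCursor
  def initialize(documents)
    @documents = documents.dup
  end

  # Return the next matching document, or nil once all have been handed out.
  def next
    @documents.shift
  end

  # Drain whatever is left into a plain Array.
  def to_a
    rest = @documents
    @documents = []
    rest
  end
end

cursor = SketchCursor.new([{ "testkey" => "testvalue" }])
p cursor.next   # the one document
p cursor.next   # nil, nothing left

# With to_a the caller gets everything at once and picks by index.
p SketchCursor.new([{ "testkey" => "testvalue" }]).to_a[0]
```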

Cleanup: Remove the testcollection


The final step in a test like this is always to remove any traces.  I can drop the whole collection with a single command.  This one I will confirm with the local CLI query, but the logs I'll leave for an exercise unless something goes wrong.

irb(main):037:0> store.db.collections[2].drop
=> true

You may notice that this was WAY too easy. Do be careful when you're working on production systems. Prepare and test backups, OK?

When I look now on the CLI and ask for the list of collections, I only see two:

mongo --username openshift --password dbsecret data1.example.com/openshift
MongoDB shell version: 2.0.2
connecting to: data1.example.com/openshift
> show collections
system.indexes
system.users
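
The create, insert, read and drop steps above are easy to wrap into one repeatable check. What follows is a sketch under stated assumptions, not part of OpenShift: datastore_smoke_test is a hypothetical helper, it uses the driver calls shown in this post (create_collection, insert, find.to_a, drop) plus collection_names, which I am assuming the Mongo::DB object also provides. The two Fake classes are in-memory stand-ins so the sketch runs without a database.

```ruby
# Hypothetical helper: create a scratch collection, write one document,
# read it back and always drop the collection again.
def datastore_smoke_test(db, name = "testcollection")
  # Refuse to touch a collection that already exists (assumes collection_names).
  raise ArgumentError, "#{name} already exists" if db.collection_names.include?(name)

  collection = db.create_collection(name)
  begin
    doc = { "testdoc" => { "testkey" => "testvalue" } }
    collection.insert(doc)
    found = collection.find.to_a
    found.any? { |d| d["testdoc"] == doc["testdoc"] }
  ensure
    collection.drop   # clean up even if the check raised
  end
end

# In-memory stand-ins (hypothetical) so the sketch runs on its own.
class FakeCollection
  def initialize(db, name)
    @db, @name, @docs = db, name, []
  end

  def insert(doc)
    @docs << doc
    @docs.size
  end

  def find
    @docs.dup
  end

  def drop
    @db.forget(@name)
    true
  end
end

class FakeDB
  def initialize
    @collections = {}
  end

  def collection_names
    @collections.keys
  end

  def create_collection(name)
    @collections[name] ||= FakeCollection.new(self, name)
  end

  def forget(name)
    @collections.delete(name)
  end
end

db = FakeDB.new
puts datastore_smoke_test(db)      # true
puts db.collection_names.inspect   # [] because the scratch collection is gone
```

In a broker console session the call would be datastore_smoke_test(store.db), provided the assumption about collection_names holds for the installed driver.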

Summary

In this post I showed how to access the OpenShift broker application using the Rails console.  I created an OpenShift::MongoDataStore object (using the OpenShift::DataStore factory).  I showed how to access the database from the CLI and where to find the MongoDB log files.  With these I was able to confirm that the OpenShift broker DataStore configuration was correct and that the database was operational.


Wednesday, December 5, 2012

Verifying the DNS Plugin using Rails Console

Each of the OpenShift broker plugins provides an interface implementation class for the  plugin's abstract behavior.   In practical terms this means that I can fire up the rails console, create an instance of the plugin class and then use it to manipulate the service behind the plugin.

Since the DNS plugin has the simplest interface and Bind has the cleanest service logs, I'm going to demonstrate with that.  The technique is applicable to the other back-end plugin services.

Preparing Logging


To make life easy I'm going to configure logging on the DNS server host so that the logs from the named service are written to their own file.

A one line file in /etc/rsyslog.d will do the trick:

if $programname == 'named' then /var/log/named.log

Write that into /etc/rsyslog.d/00_named.conf and restart the rsyslog service.  Then restart the named service and check that the logs are appearing in the right place.

If I didn't filter out the named logs, I could still use grep on /var/log/messages to extract them.

Configuring the Bind DNS Plugin


As indicated in previous posts, the OpenShift DNS plugin is enabled by placing a file in /etc/openshift/plugins.d with the configuration information for the plugin.  The name of the file must be the name of the rubygem which implements the plugin with the suffix .conf. The Bind plugin is configured like this:

/etc/openshift/plugins.d/openshift-origin-dns-bind.conf

BIND_SERVER="192.168.5.11"
BIND_PORT=53
BIND_KEYNAME="app.example.com"
BIND_KEYVALUE="put-your-hmac-md5-key-here"
BIND_ZONE="app.example.com"

When the Rails application starts, it will import a plugin module for each .conf file and will set the config file values.

The Rails Console


Ruby on Rails has an interactive testing environment.  It is started by invoking rails console from the root directory of the application.  If I start the rails console at the top of the broker application I should be able to instantiate and work with the plugin objects.

The rails console command runs irb to offer a means of manual testing.  In addition to the ordinary ruby script environment it imports the Rails application environment which resides in the current working directory. Among other things, it processes the Gemfile which, in the case of the OpenShift broker, will load any plugin gems and initialize them. I'm going to use the Rails console to directly poke at the back end service objects.

I'm going to go to the broker application directory.  Then I'll check that bundler confirms the presence of all of the required gems.  Then I'll start the Rails console and check the plugin objects manually.

cd /var/www/openshift/broker
bundle --local
....
Your bundle is complete! Use `bundle show [gemname]` to see where a bundled gem is installed.
rails console
Loading production environment (Rails 3.0.13)
irb(main):001:0> 

The last line above is the Rails console prompt.

Creating a DnsService Object


The OpenShift::DnsService class is a factory class for the DNS plugin modules. It also contains an interface definition for the plugin, though Ruby and Rails don't seem to be much into formal interface specification and implementation.  The plugin interface definitions reside in the openshift-origin-controller rubygem:

https://github.com/openshift/origin-server/tree/master/controller/lib/openshift

The factory classes provide two methods by convention: provider=() sets the actual class which implements the required interface and instance() is the factory method, returning an instance of the implementing class.  They also have a private instance variable which will contain a reference to the instantiating class.  When the plugins are loaded, a reference to the instantiating class is set into the factory class.
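
The convention is small enough to sketch in a few lines. This is my own reduced illustration, not the code from the openshift-origin-controller gem: the Sketch module, FakeBindPlugin and the @provider variable name are all hypothetical.

```ruby
# Reduced illustration of the provider=/instance convention.
# Names are hypothetical; see the controller gem for the real classes.
module Sketch
  class DnsService
    @provider = nil   # class-level variable holding the implementing class

    # Called once when a plugin is loaded.
    def self.provider=(klass)
      @provider = klass
    end

    # Factory method: hand back an instance of whatever class was registered.
    def self.instance
      raise "no DNS provider registered" if @provider.nil?
      @provider.new
    end
  end

  # Stand-in for a plugin implementation class.
  class FakeBindPlugin
    attr_reader :zone

    def initialize
      @zone = "app.example.com"
    end
  end
end

# What a plugin would do at load time:
Sketch::DnsService.provider = Sketch::FakeBindPlugin

d = Sketch::DnsService.instance
puts d.class   # Sketch::FakeBindPlugin
```

The caller never names the implementing class, which is what lets a different plugin gem be swapped in through configuration alone.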

Once the broker application is loaded using the rails console I should be able to create and work with instances of the DnsService implementation.

The first step is to check that the factory class is indeed loaded and has the right provider set. Since I can just type at the irb prompt it's easy to see what's there.

irb(main):001:0> OpenShift::DnsService
=> OpenShift::DnsService
irb(main):002:0> d = OpenShift::DnsService.instance
=> #<OpenShift::BindPlugin:0x7f540dfb9ee8 @zone="app.example.com",
 @src_port=0, @server="192.168.5.2",
 @keyvalue="GwhJNLZPghbpTya2M6N+lvcLmBQx6TYbuH7j6TPyetE=",
 @port=53, @keyname="app.example.com",
 @domain_suffix="app.example.com">

Note that the class is OpenShift::BindPlugin and the instance variables match the values I set in the plugin configuration file. I now have a variable d which refers to an instance of the DNS plugin class.

The DnsService Interface


The DNS plugin interface is the simplest of the plugins.  It contains just four methods:
  • register_application(app_name, namespace, public_hostname)
  • deregister_application(app_name, namespace)
  • modify_application(app_name, namespace, public_hostname)
  • publish()
All but the last will have a side-effect which I can check by observing the named service logs and by querying the DNS service itself.

Note that the publish() method is not included in the list with side-effects.  publish() is always called at the end of a set of change calls.  It is there to accommodate batch update processing.  Third party DNS services which use web interfaces may require batch processing. The OpenShift::BindPlugin submits changes instantly.
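
To show why publish() exists, here is a sketch of how a batching plugin might behave. It is entirely hypothetical (no such class ships with OpenShift as far as this post is concerned): changes are queued by the three change methods and nothing is "sent" until publish() is called.

```ruby
# Hypothetical batching plugin: queue changes, send them all on publish().
class BatchingDnsSketch
  attr_reader :sent   # stands in for "what the remote service has received"

  def initialize
    @pending = []
    @sent = []
  end

  def register_application(app_name, namespace, public_hostname)
    @pending << [:add, app_name, namespace, public_hostname]
  end

  def deregister_application(app_name, namespace)
    @pending << [:delete, app_name, namespace]
  end

  def modify_application(app_name, namespace, public_hostname)
    deregister_application(app_name, namespace)
    register_application(app_name, namespace, public_hostname)
  end

  # Flush everything queued so far as one batch; returns the batch size.
  def publish
    count = @pending.size
    @sent.concat(@pending)
    @pending.clear
    count
  end
end

dns = BatchingDnsSketch.new
dns.register_application("testapp1", "testns1", "node1.example.com")
puts dns.sent.size   # 0, nothing has gone out yet
dns.publish
puts dns.sent.size   # 1
```

A plugin that submits each change immediately, as the Bind plugin does, can leave publish() with nothing to do.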

Change and Check


The process of testing now will consist of three repeated steps:

  1. Make a change
  2. Check the DNS server logs
  3. Check the DNS server response

I will repeat the steps once for each method (though I'll only show a couple of samples here).

The logs are time-stamped.  To make it easier to find the right log entry, I'll check the time sync of the broker and DNS server hosts, and then check the time just before issuing each update command.

First I check the date and add an application record.  An application record is a DNS CNAME record which is an alias for the node which contains the application. Here goes:

irb(main):003:0> `date`
=> "Wed Dec  5 15:25:59 GMT 2012\n"
irb(main):002:0> d.register_application "testapp1", "testns1", "node1.example.com"
=> ;; Answer received from 192.168.5.11 (129 bytes)
;;
;; Security Level : UNCHECKED
;; HEADER SECTION
;; id = 25286
;; qr = true    opcode = Update    rcode = NOERROR
;; zocount = 1  prcount = 0  upcount = 0  adcount = 1

OPT pseudo-record : payloadsize 4096, xrcode 0, version 0, flags 32768

;; ZONE SECTION (1  record)
;; app.example.com. IN SOA

The register_application() method returns the Dnsruby::Message returned from the DNS server.  A little digging should indicate that the update was successful.

Next I'll examine the named service log on the DNS server host.

tail /var/log/named.log
...
Dec  5 15:26:41 ns1 named[11178]: client 10.16.137.216#54040/key app.example.com: signer "app.example.com" approved
Dec  5 15:26:41 ns1 named[11178]: client 10.16.137.216#54040/key app.example.com: updating zone 'app.example.com/IN': adding an RR at 'testapp1-testns1.app.example.com' CNAME
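
The record name in the log entry above is built from the three pieces I supplied: application name, namespace and the configured zone. A one-line sketch of that pattern, with application_fqdn as a hypothetical helper name (the plugin's own code may assemble it differently):

```ruby
# Hypothetical helper mirroring the record name seen in the named log.
def application_fqdn(app_name, namespace, domain_suffix)
  "#{app_name}-#{namespace}.#{domain_suffix}"
end

puts application_fqdn("testapp1", "testns1", "app.example.com")
# prints testapp1-testns1.app.example.com
```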

Finally, I'll check that the server is answering queries for that name:

dig @ns1.example.com testapp1-testns1.app.example.com CNAME

; <<>> DiG 9.9.2-rl.028.23-P1-RedHat-9.9.2-8.P1.fc18 <<>> @ns1.example.com testapp1-testns1.example.com CNAME
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 10884
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;testapp1-testns1.example.com. IN CNAME

;; AUTHORITY SECTION:
example.com.  10 IN SOA ns1.example.com. hostmaster.example.com. 2011112904 60 15 1800 10

;; Query time: 3 msec
;; SERVER: 192.168.1.11#53(192.168.1.11)
;; WHEN: Thu Dec  5 15:28:41
;; MSG SIZE  rcvd: 108

That's sufficient to confirm that the DNS Bind plugin configuration is correct and that updates are working. In a real case I'd go on and check each of the operations. For now I'll just delete the test record and go on.


d.deregister_application "testapp1", "testns1"
=> ;; Answer received from 192.168.5.11 (129 bytes)
;;
;; Security Level : UNCHECKED
;; HEADER SECTION
;; id = 26362
;; qr = true    opcode = Update    rcode = NOERROR
;; zocount = 1  prcount = 0  upcount = 0  adcount = 1

OPT pseudo-record : payloadsize 4096, xrcode 0, version 0, flags 32768

;; ZONE SECTION (1  record)
;; app.example.com. IN SOA



dig @ns1.example.com testapp1-testns1.app.example.com CNAME
; <<>> DiG 9.9.2-rl.028.23-P1-RedHat-9.9.2-8.P1.fc18 <<>> @ns1.example.com testapp1-testns1.example.com CNAME
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 50598
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;testapp1-testns1.example.com. IN CNAME

;; AUTHORITY SECTION:
example.com.  10 IN SOA ns1.example.com. hostmaster.example.com. 2011112904 60 15 1800 10

;; Query time: 3 msec
;; SERVER: 192.168.5.11#53(192.168.5.11)
;; WHEN: Thu Dec  5 15:31:20
;; MSG SIZE  rcvd: 108

This is what a negative response looks like. There's a question section but no answer section.
Things are back where I started and I can move on to the next test.


Monday, December 3, 2012

OpenShift Broker Configuration and Log Files

There are a lot of moving parts in an OpenShift Broker service. There are the four back-end services to start with. Then there's the front end HTTP daemon and the Rails broker application. There's SELinux security and the Passenger Rails accelerator service. Each of these needs some kind of configuration which may need some tweaking. Each of them also has either a specific log file or some other output somewhere that can be used for status checks and diagnostics.

In this post I'm going to run down a list of these configurations and logs and the service components they relate to.  Each of these gets some attention in the Build-Your-Own wiki instructions.

Configuration Directories


The OpenShift broker service (even if you set aside the back-end services) is an amalgam of components.  Each of these may have some customization for the final working environment.  Each is also an opportunity for something to get broken or tweaked.

Without some understanding of the interactions between the components the set of configurations might seem unfathomable.  Even with some understanding it can be complex, but it does not need to be overwhelming.

These are the places where configuration files are known to lurk.  

OpenShift Broker Configuration Directories
Directory | Purpose | Description
/etc/openshift | Master Location | Master configuration directory for all openshift related services
/etc/openshift/plugins.d | Broker Plugin Configuration | This is where plugin configuration files are placed. These files select the plugins for each back end service. They also contain customization (service location, authentication information etc).
/var/www/openshift/broker | Rails Application Root | This directory contains the Rails application which is the OpenShift Broker service. At the top level are the Gemfile and Gemfile.lock which control the application rubygems.
/var/www/openshift/broker/config/environments | Rails configuration | This directory contains the Rails application "environments". Each file here corresponds to a possible run mode for the OpenShift broker service. See also /etc/openshift/development
/var/www/openshift/broker/config/httpd/conf.d | Broker HTTPD | This directory contains the broker httpd configuration files.
/etc/httpd/conf.d | Front end HTTPD | This directory is the standard configuration location for the front-end Apache2 daemon.

If you're poking around wondering what goes on behind the scenes and how it's controlled, these are the places to start.

Configuration Files


Each of the locations above can contain a number of different and only marginally related configuration files. The list below contains all of the files that appear to need special attention of some kind during service configuration.  I don't try to mention every possible setting or switch here.  I'm just trying to give you an idea of what you might find in each one.  See the Build-Your-Own wiki page and the official OpenShift Enterprise service documentation for details.

OpenShift Broker Configuration Files
File | Format | Description
/etc/openshift/broker.conf | Shell Key/Value | This file defines a number of parameters for the service. This is the production configuration.
/etc/openshift/broker-dev.conf | Shell Key/Value | This file defines a number of parameters for the service. This is the development configuration.
/etc/openshift/development | none | When this file exists the broker service will start in dev mode, using the broker-dev.conf and development.rb files.
/etc/openshift/server_priv.pem | PEM/RSA | This key file is used to authenticate optional services. Generated by openssl.
/etc/openshift/server_pub.pem | PEM/RSA | This key file is used to authenticate optional services. Generated by openssl.
/etc/openshift/rsync_id_rsa.* | SSH/RSA | This key file pair is used to authenticate when moving gears from one node to another. Generated by ssh-keygen.
/etc/openshift/plugins.d/*.conf | Shell Key/Value | These are magic files. The file name must match the name of a local rubygem and end with .conf. The gem is loaded and the configuration file is parsed and included by the plugin gem. These plugins are loaded as part of the Rails start up process, as specified in the Gemfile.
/var/www/openshift/broker/Gemfile | Rails/Bundler | This file defines the rubygem package requirements for the broker application. It is used by the bundle command to generate the Gemfile.lock.
/var/www/openshift/broker/Gemfile.lock | Rails/Bundler | This file defines the actual rubygem packages which fulfill the broker application requirements on this system. It is regenerated each time the openshift-broker service is restarted.
/var/www/openshift/broker/httpd/conf.d/*.conf | Apache | Pick one of the auth conf samples. This file controls the broker service user identification/authentication when the "remote user" plugin is selected. The "remote user" plugin delegates the authentication to the httpd service which can then use any auth module. Currently there are example config files for Basic auth, for LDAP and Kerberos.
/etc/openshift/htpasswd | Apache | If the broker httpd uses the Basic Auth module, this file contains the username/password pairs for the broker service.
/var/www/openshift/broker/config/environments/production.rb | Ruby/Rails | This file defines the production configuration values for the OpenShift broker service. Debugging stack traces are suppressed.
/var/www/openshift/broker/config/environments/development.rb | Ruby/Rails | This file defines the development configuration values for the OpenShift broker service. Debugging stack traces are returned in line.
/etc/httpd/conf.d/000000_openshift_origin_broker_proxy.conf | Apache2 | This file defines the proxy configurations for the OpenShift broker and console services. It also sets the ServerName for the system as a whole.
/etc/mcollective/client.cfg | YAML | This file defines the Mcollective client communications parameters. It connects to the underlying message service.  It also can indicate where the client activity is logged and control the logging level.

Broker Plugin Configuration Files


The files in /etc/openshift/plugins.d are a bit magical.  They are loaded when the Gemfile is processed as the Rails application starts.  Each file in that directory that ends in .conf will be processed.  The file name (minus the .conf extension) must be the name of a locally installed rubygem.  The named gem is loaded and the config file is then processed by the gem.  You can't just create a new config file there and put config values in it.  Well, you can, but it will cause your broker to fail.
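
The naming rule is easy to express in code. This sketch only derives the gem names that a set of file paths implies; it is not how the broker's Gemfile actually does it, and plugin_gem_names is a hypothetical helper.

```ruby
# Hypothetical helper: map plugin config paths to the rubygem names they imply.
# Files that do not end in .conf are ignored, as described above.
def plugin_gem_names(paths)
  paths.select { |path| File.extname(path) == ".conf" }
       .map { |path| File.basename(path, ".conf") }
end

paths = [
  "/etc/openshift/plugins.d/openshift-origin-dns-bind.conf",
  "/etc/openshift/plugins.d/README"
]
puts plugin_gem_names(paths)
# prints openshift-origin-dns-bind
```

On a broker host the list could come from Dir.glob("/etc/openshift/plugins.d/*"), and each resulting name should match an installed gem before the broker is restarted.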


Log Files



If things aren't behaving as you think they should, or if you just want to get a sense of how things should look, these are places you can check.

OpenShift Log Files
File | Source | Description
/var/log/messages | syslog | System wide log file
/var/log/mcollective-client.log | MCollective client | Mcollective log file. Location defined in client.cfg. Log level also defined.
/var/log/httpd/access_log | httpd | Front end proxy httpd
/var/log/httpd/error_log | httpd | Front end proxy httpd
/var/log/httpd/ssl_access_log | httpd | Front end proxy httpd
/var/log/httpd/ssl_error_log | httpd | Front end proxy httpd
/var/log/secure | syslog | System access
/var/log/audit/audit.log | syslog | SELinux activity
/var/www/openshift/broker/log/development.log | Rails | Logs from development mode
/var/www/openshift/broker/log/production.log | Rails | Logs from production mode
/var/www/openshift/broker/httpd/logs/access_log | Apache 2 | Broker access
/var/www/openshift/broker/httpd/logs/error_log | Apache 2 | Broker errors

Tuesday, November 27, 2012

OpenShift Back End Services: DNS - Dynamic Updates

In the previous post I created an authoritative DNS server for the zone app.example.com running on ns1.example.com. The server will answer external queries and authoritative ones, but it will not yet accept update queries.  I need to add that facility before an OpenShift Origin broker service can use it.

OpenShift and DNS Updates

I don't know if I've actually posted this over and over but it feels like it since I've re-written this post at least 4 times since I started on it a week ago.  I'm going to go over it again on the off chance that someone reading this hasn't read the rest.

OpenShift depends on Dynamic DNS for one of its primary functions: Publishing developer applications.  If you can't make the applications public, OpenShift isn't very useful.

OpenShift publishes applications by creating new DNS records which bind the fully qualified domain name (or FQDN) of the application to a specific host which contains and runs the application service.  The FQDN can then be used in a URL to allow users to access the application with a web browser.

The OpenShift broker provides a plug-in interface for a Dynamic DNS module. Currently (Nov 27 2012) there is only one published plugin.  This uses the DNS Update protocol defined in RFC 2136 along with a signed transaction defined in RFC 2845 to allow authentication of the updates.

The table below lists the information that the OpenShift Broker needs to communicate with a properly configured Bind DNS server to request updates.

OpenShift Origin Bind DNS Plugin information
Variable | Value | Comment
Dynamic DNS Host IP address | 192.168.5.2 | ns1.example.com
Dynamic DNS Service port | 53/TCP | Default, but configurable
Zone to update (domain_suffix) | app.example.com | apps will go here.
DNSSEC Key Type | HMAC-MD5 |
DNSSEC Key Name | app.example.com | Arbitrary name
DNSSEC Key Value | A really long string | TSIG size range: 1-512 bits

Any DNS update service would require similar information but it would be tailored to the service update protocol.

I need to generate a DNSSEC signing key for the update queries and then insert the  information into the DNS service configuration so that it will accept updates.

Generating an Update Key


The last value in the table above is missing.  I like to have all the ingredients before I start a recipe.  I'm going to generate that key value before moving on.  I'll need it for the OpenShift broker plugin configuration and for the back end DNS service setup.

I'll use dnssec-keygen to create a pair of DNSSEC key files (public and private).

dnssec-keygen -a HMAC-MD5 -b 256 -n USER app.example.com
Kapp.example.com.+157+45890

This command translates as "Create a 256 bit USER key with the HMAC-MD5 algorithm, named app.example.com". A USER key is one that is used to authenticate access. The HMAC-MD5 keys can be from 1 to 512 bits. A 256 bit key fits on a single line making it easier to manage in this post.

The output indicates the filename of the resulting key files. This command produces two files:

  • Kapp.example.com.+157+45890.key
  • Kapp.example.com.+157+45890.private

The HMAC-MD5 key is a symmetric key, so the two files contain the same key value. The ".key" file is a single line and represents a DNS Resource Record. The ".private" file contains the key string and some metadata.

The key isn't actually used as a password.  Rather it is a "signing key" for a digital signature which will be attached to the update query.  The signature is generated using the signing key and the contents of the query. It is mathematically unlikely that someone would be able to generate the correct signature without a copy of the signing key. When the update arrives, the DNS server generates its own signature string from its copy of the key and the received query.  If the two signatures match then the DNS server can be confident that the message did come from an authorized update client.
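To make the signing idea concrete, here is a minimal Python sketch of sign-then-verify with a shared HMAC-MD5 key. It is only an illustration: the random key and the text "query" are stand-ins I made up, and real TSIG signs the DNS wire format along with extra fields such as the key name and a timestamp.

```python
import hashlib
import hmac
import secrets

# Assumption: a random 256-bit value stands in for the shared key.
shared_key = secrets.token_bytes(32)


def sign(key: bytes, message: bytes) -> bytes:
    """Return the HMAC-MD5 digest of the message under the shared key."""
    return hmac.new(key, message, hashlib.md5).digest()


def verify(key: bytes, message: bytes, signature: bytes) -> bool:
    """Recompute the digest the way the server would and compare the two."""
    return hmac.compare_digest(sign(key, message), signature)


# Hypothetical stand-in for an update query, not the real wire format.
query = b"update add testrecord.app.example.com 1 TXT test"
signature = sign(shared_key, query)

print(verify(shared_key, query, signature))                 # same key, same message
print(verify(shared_key, query + b" tampered", signature))  # altered message
print(verify(secrets.token_bytes(32), query, signature))    # different key
```

Only the first check should succeed. Changing either the message or the key changes the digest, which is why the server can trust a matching signature.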

The key files look like this:

cat Kapp.example.com.+157+45890.key
app.example.com. IN KEY 0 3 157 LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do=

cat Kapp.example.com.+157+45890.private
Private-key-format: v1.3
Algorithm: 157 (HMAC_MD5)
Key: LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do=
Bits: AAA=
Created: 20121122011903
Publish: 20121122011903
Activate: 20121122011903

The only part I care about at the moment is the Key: line of the .private file:

LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do=

I'll copy that string and save it for later.
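If I wanted to pull the key string out with a script instead of copy and paste, something like this Python sketch would do it. The file text is the example .private file from this post. The function name is my own, and the sketch assumes the file always has exactly one "Key:" line.

```python
import base64

# Contents of the example .private file shown in this post.
PRIVATE_FILE_TEXT = """\
Private-key-format: v1.3
Algorithm: 157 (HMAC_MD5)
Key: LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do=
Bits: AAA=
Created: 20121122011903
Publish: 20121122011903
Activate: 20121122011903
"""


def extract_key(private_text: str) -> str:
    """Return the Base64 string from the 'Key:' line of a .private file."""
    for line in private_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "Key":
            return value.strip()
    raise ValueError("no Key: line found")


key_string = extract_key(PRIVATE_FILE_TEXT)
# Decoding the Base64 string gives the raw key bytes; 8 bits per byte.
key_bits = len(base64.b64decode(key_string)) * 8
print(key_string, key_bits)
```

The decoded length is a handy sanity check that the string was copied whole: it should match the -b value given to dnssec-keygen.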

Enable DNS Updates


Now that I have an update key and a running authoritative server I can use them to enable DNS updates. I need to provide a copy of the key string to the named and inform it that it should accept updates signed with that key.

I'm going to add some contents to the /var/named/app.example.com.conf file to provide this information.  I'm going to put the key string in its own file and use another include directive to bring it into the configuration.  This way if I need to change the key, I only need to edit a single line file rather than trying to find the key string in a larger configuration.

I can append the key information to the file fairly simply.  The key clause looks like this:

key app.example.com {
  algorithm HMAC-MD5;
  // secret "some long key string in base64 encoding";
  include "app.example.com.secret";
};

Append that to the /var/named/app.example.com.conf file.  The include line indicates that the secret key will be placed in /var/named/app.example.com.secret.

The secret line format is indicated in the key clause comment.  It's a single line containing the word "secret" and the key string in double quotes, terminated by a semicolon.

echo 'secret "LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do=" ;' \
> /var/named/app.example.com.secret

Finally I need to inform the named that the app.example.com zone can be updated using that key. I have to add an allow-update option to the zone section. When that's done the config file will look like this:

/var/named/app.example.com.conf

zone "app.example.com" IN {
    type master;
    file "dynamic/app.example.com.db";
    allow-update { key app.example.com ; } ;
};

key app.example.com {
  algorithm HMAC-MD5;
  // secret "some long key string in base64 encoding";
  include "app.example.com.secret";
};
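If I had several dynamic zones to set up I would probably generate these two files rather than type them. This is a rough Python sketch of that idea. The template mirrors the configuration above (minus the commented-out secret line); the function names are hypothetical, and writing the results to /var/named is left out.

```python
# Template for the zone configuration file shown above.
# Doubled braces are literal braces for str.format.
ZONE_CONF_TEMPLATE = """\
zone "{zone}" IN {{
    type master;
    file "dynamic/{zone}.db";
    allow-update {{ key {zone} ; }} ;
}};

key {zone} {{
  algorithm HMAC-MD5;
  include "{zone}.secret";
}};
"""


def render_zone_conf(zone: str) -> str:
    """Return the text of <zone>.conf, with the key named after the zone."""
    return ZONE_CONF_TEMPLATE.format(zone=zone)


def render_secret(key_b64: str) -> str:
    """Return the single line that goes in <zone>.secret."""
    return 'secret "{}" ;\n'.format(key_b64)


print(render_zone_conf("app.example.com"))
print(render_secret("LHKu/QNeSikkf1kob7irn816/9shxtD++mMTPYc4/do="), end="")
```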

Recheck the files once more, then try restarting the named.

Verify the Named Operation

The first thing to do as always when making a configuration change is check that I haven't broken anything.  I'll do several things to check:
  1. Observe the restart results on the CLI
  2. Search for named lines at the end of /var/log/messages and scan for errors
  3. Run queries for the SOA, NS and A glue records for the app.example.com zone (localhost)
  4. Re-run the queries from another host pointing dig at ns1.example.com
These are the same checks that I did when establishing the authoritative server in the previous post.

Verify DNS Update Operations


I'm almost finished now. The final step is to verify that I can add and remove resource records in the app.example.com zone.

The bind-utils package includes the nsupdate utility.  This is a command line tool that can send DNS update queries.

I have the server hostname and IP address.  I have the TSIG key. I know the dynamic zone.  To test dynamic updates I'm going to add a TXT record named "testrecord.app.example.com" with a value "this is a test record". The nsupdate command below expresses that.

nsupdate -k Kapp.example.com.+157+45890.private
server 192.168.5.2
update add testrecord.app.example.com 1 TXT "this is a test record"
send
quit
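nsupdate reads its commands from standard input, so the same session can be scripted. Here is a small Python sketch that builds the command text and, optionally, pipes it to nsupdate. The function names are mine, and the run step assumes bind-utils is installed and the key file is at hand.

```python
import subprocess


def txt_add_script(server: str, name: str, ttl: int, text: str) -> str:
    """Return nsupdate input that adds one TXT record."""
    update = 'update add {} {} TXT "{}"'.format(name, ttl, text)
    return "\n".join(["server " + server, update, "send", "quit"]) + "\n"


def run_nsupdate(keyfile: str, script: str) -> None:
    """Feed the script to nsupdate on stdin; raises if nsupdate fails."""
    subprocess.run(["nsupdate", "-k", keyfile], input=script, text=True, check=True)


add_script = txt_add_script(
    "192.168.5.2", "testrecord.app.example.com", 1, "this is a test record"
)
print(add_script, end="")
# On a host with nsupdate and the key file present:
# run_nsupdate("/path/to/the/key.private", add_script)
```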

When this command completes we should be able to send a query for that name and get an answer:

dig @127.0.0.1 testrecord.app.example.com txt

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5 <<>> @127.0.0.1 testrecord.app.example.com txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18488
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;testrecord.app.example.com. IN TXT

;; ANSWER SECTION:
testrecord.app.example.com. 1 IN TXT "this is a test record"

;; AUTHORITY SECTION:
app.example.com. 30 IN NS ns2.example.com.
app.example.com. 30 IN NS ns1.example.com.

;; ADDITIONAL SECTION:
ns1.example.com. 600 IN A 192.168.5.2
ns2.example.com. 600 IN A 192.168.5.3

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 28 02:20:33 2012
;; MSG SIZE  rcvd: 146


I can also check the named logs in /var/log/messages:

grep named /var/log/messages
...
Nov 28 02:12:46 ns1 named[30675]: client 127.0.0.1#60808: signer "app.example.com" approved
Nov 28 02:12:46 ns1 named[30675]: client 127.0.0.1#60808: updating zone 'app.example.com/IN': adding an RR at 'testrecord.app.example.com' TXT
Nov 28 02:12:46 ns1 named[30675]: zone app.example.com/IN: sending notifies (serial 2011112911)

I can see that an update request arrived and the signature was verified.  A record was added, and update notifications were sent (or would have been sent) to any secondary servers.

Removing the record again looks very similar:

nsupdate -k Kapp.example.com.+157+45890.private
server 127.0.0.1
update delete testrecord.app.example.com TXT
send
quit

When I query for the record again I get a negative response now:

dig @127.0.0.1 testrecord.app.example.com txt

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5 <<>> @127.0.0.1 testrecord.app.example.com txt
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24061
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;testrecord.app.example.com. IN TXT

;; AUTHORITY SECTION:
app.example.com. 10 IN SOA ns1.example.com. hostmaster.example.com. 2011112912 60 15 1800 10

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 28 02:56:01 2012
;; MSG SIZE  rcvd: 95


Note that there is no answer section.  No section.  No answer.
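When I script a check like this I only care about two things in the dig output: the status and the answer count. This is a minimal Python sketch that pulls both out of the header lines. The regular expressions are written against the output format shown in this post, and the function names are my own.

```python
import re

HEADER_RE = re.compile(r"->>HEADER<<- opcode: (\w+), status: (\w+), id: (\d+)")
COUNTS_RE = re.compile(
    r"QUERY: (\d+), ANSWER: (\d+), AUTHORITY: (\d+), ADDITIONAL: (\d+)"
)


def summarize(dig_output: str) -> dict:
    """Extract the response status and answer count from dig output."""
    header = HEADER_RE.search(dig_output)
    counts = COUNTS_RE.search(dig_output)
    if header is None or counts is None:
        raise ValueError("this does not look like dig output")
    return {"status": header.group(2), "answers": int(counts.group(2))}


def record_exists(dig_output: str) -> bool:
    """True when the server answered NOERROR with at least one record."""
    summary = summarize(dig_output)
    return summary["status"] == "NOERROR" and summary["answers"] > 0


# Header lines copied from the two dig runs in this post.
after_add = (
    ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18488\n"
    ";; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2\n"
)
after_delete = (
    ";; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24061\n"
    ";; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0\n"
)
print(record_exists(after_add), record_exists(after_delete))
```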

The logs will also show that the record was removed:

grep named /var/log/messages
...
Nov 28 02:50:42 ns1 named[30675]: client 127.0.0.1#59920: signer "app.example.com" approved
Nov 28 02:50:42 ns1 named[30675]: client 127.0.0.1#59920: updating zone 'app.example.com/IN': deleting rrset at 'testrecord.app.example.com' TXT
Nov 28 02:50:42 ns1 named[30675]: zone app.example.com/IN: sending notifies (serial 2011112912)

It took four blog posts to get here, but I now have a working DNS server capable of DNS Updates.

The final steps are to create a secondary server on ns2.example.com and to submit the nameserver information to my IT department for delegation.

DNS is ready for OpenShift.

References


  • RFC 2136 defines the DNS Update protocol.
  • RFC 2845 defines signed transactions which allow DNS updates to be authenticated.
  • dnssec-keygen generates TSIG keys suitable for signing DNS updates.
  • nsupdate is a command line tool for making DNS updates

OpenShift Back End Services: DNS - An Authoritative Server

OpenShift Origin dynamic DNS requires an authoritative zone for the application records.  So the first step to creating the DNS back end service is to establish that authoritative zone.  When I'm sure it will serve records then I'll add the dynamic update feature to it.

There are a number of good resources on setting up DNS on Linux.  I'll post a few links to them in the References.  I've even written part of one, but each time I do this I see it a little differently and (I think) improve my understanding.  So I'm going to do it again.

Ingredients

I'm going to try to create a proper (ready to be delegated) zone.

Here's the list of information I'll need to set it up:

DNS Service Configuration Variables
Variable | Value | Comments
Primary Nameserver
  IP Address | 192.168.5.2 |
  Hostname | ns1.example.com | Not in the app domain
Secondary Nameserver
  IP Address | 192.168.5.3 |
  Hostname | ns2.example.com | Not in the app domain
Update Configuration
  Application Zone | app.example.com | Not the top level
  Application Zone Key Name | app.example.com | Arbitrary name
  Application Zone Key Type | HMAC-MD5 |
  Application Zone Key Size | 64 bits | 512 bits in real life
  Application Zone Key Value | A Base64 encoded string | Generated by dnssec-keygen

I'll be setting up both a primary and secondary server for the zone.  The update configuration information won't be needed here except to establish a static zone that will be made dynamic later.

The operations listed below will all be performed on ns1.example.com.

Bind on Linux


Installing ISC Bind 9 on most modern Linux distributions is pretty easy. It's an old and well worn tool.  For RPM based distributions you can just use yum. I generally install the bind-utils package as well so that dig, host and nsupdate are handy.

yum install bind bind-utils

The DNS service binary is called named.  When I refer to Bind, I'll be talking about the software in general. When I refer to named I'll be referring to the daemon and its configuration files.

Once the package is installed there are a number of small configuration changes that are needed to enable and verify the caching service.

  • Enable listener ports
  • Enable rndc
  • Enable start on boot
  • Start the named service
  • Confirm caching service
  • Confirm rndc controls

Named Configuration and Management


The named service follows a traditional layout model on Linux (It's actually probably one of the canonical services).  Before I start making changes I want to go over where its configuration files live and how the service is managed.

Named Configuration Files


The primary service configuration file is /etc/named.conf.  Additional configuration information and service data reside in the /var/named directory.  The named daemon logs to /var/log/messages through syslog.
  • /etc/named.conf - primary configuration file
  • /var/named - additional configuration and data
  • /var/log/messages - service logging

You can filter for named related log messages in /var/log/messages by grepping for (wait for it!) "named".

ISC provides a complete reference of the named configuration options on their site.

The named service adds one non-traditional feature.  It has a tool called rndc which might stand for "remote name daemon controller".  The rndc tool is used locally by the service command to provide status for the named service.  It can also be used to adjust the logging level, to dump the current zones to a file, or to force a zone reload without restarting the daemon.

rndc does require some additional setup.  It is not enabled by default as it requires an authentication key.  A default key would present a security risk.  The bind package provides rndc-confgen to help set up the rndc access to the local named. This command produces a required key file which will reside at /etc/rndc.key.

  • /etc/rndc.key - Remote name daemon control access key

rndc also requires an addition to the /etc/named.conf file. I'll do that just before I try starting the service.

Before making any change to a configuration file I copy the original and name the new file <filename>.orig. For example, /etc/named.conf would become /etc/named.conf.orig. This way I can track my changes and revert them to the initial values if I need to.

Enable Listeners


The initial configuration of /etc/named.conf restricts queries and updates to the local IP interfaces (IPv4 127.0.0.1 and IPv6 ::1).  Since I want to allow basically anyone to find out about my zone, I have to open this up.  There are three configuration lines that control listeners and query access:

  • listen-on
  • listen-on-v6
  • allow-query

Each of these options takes a semicolon (;) delimited list of listeners.  The list is enclosed in a pair of curly braces ( {} ). Since this will be a public service, I just have to replace the appropriate localhost address entries with the keyword "any".

/etc/named.conf

...
options {
        listen-on port 53 { any; };
        listen-on-v6 port 53 { any; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        allow-query     { any; };
        recursion yes;
...

If you want to be tricksy a sed one-liner (and a safety copy) will do the trick if you're starting from the default:

cp /etc/named.conf /etc/named.conf.orig
sed -i -e 's/127.0.0.1\|::1\|localhost/any/' /etc/named.conf
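For anyone who finds sed a bit opaque, here is roughly the same substitution as a Python sketch. The three-line fragment is my assumption about what the default options block contains, and the function name is made up. Unlike the sed expression, the dots in the IPv4 address are escaped here so they only match literal dots.

```python
import re

# The three localhost forms the substitution looks for.
LOCAL_ONLY = re.compile(r"127\.0\.0\.1|::1|localhost")


def open_listeners(conf_text: str) -> str:
    """Replace the first localhost-style address on each line with 'any'."""
    lines = conf_text.split("\n")
    # count=1 mirrors sed, which without the g flag replaces one match per line.
    return "\n".join(LOCAL_ONLY.sub("any", line, count=1) for line in lines)


# Assumption: these three lines are what the default options block contains.
default_fragment = (
    "listen-on port 53 { 127.0.0.1; };\n"
    "listen-on-v6 port 53 { ::1; };\n"
    "allow-query     { localhost; };\n"
)
print(open_listeners(default_fragment), end="")
```

Because the pattern matches anywhere in the file, it is only safe on a configuration that is still close to the default, which is the same caveat that applies to the sed version.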


Enable rndc

rndc is the Remote Name Daemon Control program. It's now used for all communications and control to the named on a bind server host. rndc does allow you to securely control a name server daemon remotely, but we won't be using it that way. rndc is also the program that gets and reports the status information when you run service named status so it's nice to have it configured.

rndc comes with a nice little configuration tool to help: rndc-confgen. With the -a option, rndc-confgen creates a unique rndc access key and places the rndc configuration in /etc/rndc.key for you.

rndc-confgen -a

This creates a new file named /etc/rndc.key which contains the access key for the named process. Both the named and the rndc command must have access to this key. rndc uses the /etc/rndc.key file by default. named must be configured for it.

If the rndc-confgen command hangs it is because there is not enough entropy (randomness) on the system. I could log in on another terminal and type random commands for a bit and it would complete. If I'm impatient I could run it with -r /dev/urandom which will always complete, but which may be less secure because it will not block waiting for enough randomness to generate a good key. I often do that for lab systems.

rndc-confgen does not set the SELinux context for the key file.  It also does not set the ownership and permissions so that the named can read it. restorecon fixes the context, and chown and chmod take care of the rest.

restorecon -v /etc/rndc.key
chown root:named /etc/rndc.key
chmod 640 /etc/rndc.key

Now that we have a key, we have to tell the named to use it. Append the section below to the bottom of the /etc/named.conf file.

// enable service controls via rndc
// use the default rndc key
include "/etc/rndc.key";

controls {
        inet 127.0.0.1 port 953
        allow { 127.0.0.1; } keys { "rndc-key"; };
};

Verifying a caching DNS server


With this configuration I have a caching DNS server.  I need to check the operation now before going ahead to add a new zone.

I use the expected tools to start the named service and get status:

service named start
Starting named:                                            [  OK  ]

service named status
version: 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5
CPUs found: 16
worker threads: 16
number of zones: 19
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 0/0/1000
tcp clients: 0/100
server is up and running
named (pid  31010) is running...

This indicates that the server started and that it is responding to rndc queries. If the daemon does not start or if I got errors from the status query, I'd check /var/log/messages for error messages.

Next we want to verify that the server is actually answering queries. I use dig or host to check. host has a simpler interface and output but for this I want the verbosity of dig to help diagnose if there are any problems. I have to try from two different locations: localhost and "somewhere else". I want first to verify that the service is running and answering, and second that it is accessible from outside itself. I'll only show the localhost queries, but the remote ones are identical.

dig @127.0.0.1 www.example.com

; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5 <<>> @127.0.0.1 www.example.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29408
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 4

;; QUESTION SECTION:
;www.example.com.  IN A

;; ANSWER SECTION:
www.example.com. 172789 IN A 192.0.43.10

;; AUTHORITY SECTION:
example.com.  172788 IN NS a.iana-servers.net.
example.com.  172788 IN NS b.iana-servers.net.

;; ADDITIONAL SECTION:
a.iana-servers.net. 172788 IN A 199.43.132.53
a.iana-servers.net. 172788 IN AAAA 2001:500:8c::53
b.iana-servers.net. 172788 IN A 199.43.133.53
b.iana-servers.net. 172788 IN AAAA 2001:500:8d::53

;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Nov 21 20:42:21 2012
;; MSG SIZE  rcvd: 185

This is actually the real right answer.  Since RFC 2606 reserves the example.com domain, the IANA also  serves that domain.  When I build my test server I'll override that.  If I were building a real OpenShift Origin service I'd get a properly delegated sub-domain of my organization or get my own domain from a registrar.

Take a moment to look at that output. It's meaningful. The ANSWER SECTION contains the only response, an A record. The AUTHORITY SECTION lists the servers which are the designated sources for the content in the example.com zone. It contains two NS records. NS record values are fully qualified domain names (FQDN). That means those names don't have some implied suffix. They end with a dot (.) which anchors them to the root of the DNS. The ADDITIONAL SECTION provides IP address resolution for the NS record FQDNs. The last section indicates that the answer came from the IPv4 localhost address and gives the date/time stamp.

So this answer tells you not only that the IP address for www.example.com is 192.0.43.10 but where the answer came from.
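Reading the sections by eye is fine for one query. For more than that, a few lines of Python can group the records by section. This is only a sketch against the dig layout shown here (a ";; NAME SECTION:" line, then records, then a blank line); the function name is mine and the sample is the output above with the header and footer trimmed.

```python
def split_sections(dig_output: str) -> dict:
    """Group the record lines of dig output under their section names."""
    sections = {}
    current = None
    for line in dig_output.splitlines():
        if line.startswith(";; ") and line.endswith(" SECTION:"):
            current = line[3:-len(" SECTION:")]
            sections[current] = []
        elif not line.strip():
            current = None  # a blank line ends the section
        elif current is not None:
            sections[current].append(line)
    return sections


# Trimmed copy of the dig output shown in this post.
SAMPLE = """\
;; QUESTION SECTION:
;www.example.com.  IN A

;; ANSWER SECTION:
www.example.com. 172789 IN A 192.0.43.10

;; AUTHORITY SECTION:
example.com.  172788 IN NS a.iana-servers.net.
example.com.  172788 IN NS b.iana-servers.net.

;; ADDITIONAL SECTION:
a.iana-servers.net. 172788 IN A 199.43.132.53
a.iana-servers.net. 172788 IN AAAA 2001:500:8c::53
b.iana-servers.net. 172788 IN A 199.43.133.53
b.iana-servers.net. 172788 IN AAAA 2001:500:8d::53

;; Query time: 0 msec
"""

sections = split_sections(SAMPLE)
for name, records in sections.items():
    print(name, len(records))
```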

I'll do that again from some other host and set the server address to the public IP address of my nameserver host.


Zone Configuration


Now that I have a caching server running, it's time to start adding some content. The /etc/named.conf file syntax has a directive to include another file in line. Rather than putting the entire configuration section in the master configuration file, I'll put my configuration information in another file and include it. That just requires appending one line to /etc/named.conf:

echo 'include "app.example.com.conf" ;' >> /etc/named.conf

Other than the /etc/named.conf file, all of the named configuration files reside in /var/named/. That's the default location for relative path names in the /etc/named.conf as well. I'll create a configuration file fragment there. Since I'm creating the app.example.com domain I'll call the file /var/named/app.example.com.conf.

zone "app.example.com" IN {
    type master;
    file "dynamic/app.example.com.db";
};

The directory option in the /etc/named.conf file determines the location of any files listed with relative path names (see the fragment in the Enable Listeners section).  The file directive above means that the absolute path to the zone file will be /var/named/dynamic/app.example.com.db.
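The path arithmetic is simple enough to show in a couple of lines of Python. This is just an illustration of the join; the function name is made up and named itself is not involved.

```python
import posixpath


def resolve_zone_file(directory: str, file_option: str) -> str:
    """Join the directory option and a zone's file option into one path."""
    # posixpath.join leaves an absolute file_option untouched.
    return posixpath.join(directory, file_option)


print(resolve_zone_file("/var/named", "dynamic/app.example.com.db"))
```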

This will eventually be a dynamic zone.  That's why I'm putting it in "dynamic/app.example.com.db".  If it were to be a static zone I'd probably put it right at the top of the /var/named tree.

The Zone File

The last file I need to create is the zone file. This file defines the initial contents of the application zone. It also defines the NS (nameserver) records for the zone and the default TTL (time to live).

/var/named/dynamic/app.example.com.db

$ORIGIN .
$TTL 1800 ; Default TTL: 30 Minutes
app.example.com. IN SOA ns1.example.com. hostmaster.example.com. (
                         2011112904 ; serial
                         60         ; refresh (1 minute)
                         15         ; retry (15 seconds)
                         1800       ; expire (30 minutes)
                         10         ; minimum (10 seconds)
                          )
                     NS ns1.example.com.
                     NS ns2.example.com.
;; prime the nameserver IP addresses for the app zone.
ns1.example.com.               A        192.168.5.2
ns2.example.com.               A        192.168.5.3

Verifying The App Zone

Once the configuration file and the zone database file are in place it's time to try restarting the named service. I use service named restart and observe the results.

service named restart
Stopping named: .                     [ OK ]
Starting named:                       [ OK ]

Then I use grep named /var/log/messages to observe the typical start up messages. I look for a line indicating that the app.example.com zone has been loaded.

grep named /var/log/messages | grep loaded
...
Nov 22 17:40:10 ns1 named[3888]: zone 0.in-addr.arpa/IN: loaded serial 0
Nov 22 17:40:10 ns1 named[3888]: zone 1.0.0.127.in-addr.arpa/IN: loaded serial 0
Nov 22 17:40:10 ns1 named[3888]: zone 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN: loaded serial 0
Nov 22 17:40:10 ns1 named[3888]: zone example.com/IN: loaded serial 2011112904
Nov 22 17:40:10 ns1 named[3888]: zone app.example.com/IN: loaded serial 2011112906
Nov 22 17:40:10 ns1 named[3888]: zone localhost.localdomain/IN: loaded serial 0
Nov 22 17:40:10 ns1 named[3888]: zone localhost/IN: loaded serial 0
Nov 22 17:40:10 ns1 named[3888]: managed-keys-zone ./IN: loaded serial 53

Now that I know that the zone has been loaded successfully, I'll check that it's served properly. I first request the SOA (Start of Authority) record, and then the full zone dump.

dig @localhost app.example.com soa
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5 <<>> @localhost app.example.com soa
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3502
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;app.example.com.  IN SOA

;; ANSWER SECTION:
app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. 2011112906 60 15 1800 10

;; AUTHORITY SECTION:
app.example.com. 30 IN NS ns2.example.com.
app.example.com. 30 IN NS ns1.example.com.

;; ADDITIONAL SECTION:
ns1.example.com. 600 IN A 10.16.137.243
ns2.example.com. 600 IN A 10.16.137.244

;; Query time: 0 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Nov 22 17:29:32 2012
;; MSG SIZE  rcvd: 148
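The SOA answer packs seven values onto one line, in the same order as the zone file: primary nameserver, contact mailbox, serial, refresh, retry, expire and minimum. This Python sketch gives them names so they are easier to compare. The field names mname and rname are the conventional ones; the helper names are my own, and the mailbox conversion ignores escaped dots.

```python
from collections import namedtuple

SOA = namedtuple("SOA", "name ttl mname rname serial refresh retry expire minimum")


def parse_soa(line: str) -> SOA:
    """Parse one SOA record line as printed in dig's answer section."""
    fields = line.split()
    if len(fields) != 11 or fields[2:4] != ["IN", "SOA"]:
        raise ValueError("not an SOA record line")
    numbers = [int(value) for value in fields[6:]]
    return SOA(fields[0], int(fields[1]), fields[4], fields[5], *numbers)


def contact_address(rname: str) -> str:
    """Turn the SOA contact name into a mail address (first dot becomes @)."""
    return rname.rstrip(".").replace(".", "@", 1)


soa = parse_soa(
    "app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. "
    "2011112906 60 15 1800 10"
)
print(soa.serial, soa.mname, contact_address(soa.rname))
```

The serial is the field to watch later on: every dynamic update should bump it.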


Now test a complete zone dump, (and save it for comparison)

dig @localhost app.example.com axfr
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.10.rc1.el6_3.5 <<>> @localhost app.example.com axfr
; (2 servers found)
;; global options: +cmd
app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. 2011112906 60 15 1800 10
app.example.com. 30 IN NS ns1.example.com.
app.example.com. 30 IN NS ns2.example.com.
app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. 2011112906 60 15 1800 10
;; Query time: 1 msec
;; SERVER: ::1#53(::1)
;; WHEN: Thu Nov 22 17:18:28 2012
;; XFR size: 4 records (messages 1, bytes 152)
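A complete zone transfer starts and ends with the same SOA record, which is why that record shows up twice in the dump. Here is a minimal Python sketch of a completeness check built on that fact. The function name is made up, and the sample is the record portion of the output above.

```python
def axfr_is_complete(dig_output: str) -> bool:
    """True when the transfer begins and ends with the same SOA record."""
    records = [
        line.split()
        for line in dig_output.splitlines()
        if line.strip() and not line.startswith(";")
    ]
    if len(records) < 2:
        return False
    first, last = records[0], records[-1]
    return len(first) > 3 and first[3] == "SOA" and first == last


# Record lines copied from the zone dump in this post.
DUMP = """\
; (2 servers found)
;; global options: +cmd
app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. 2011112906 60 15 1800 10
app.example.com. 30 IN NS ns1.example.com.
app.example.com. 30 IN NS ns2.example.com.
app.example.com. 30 IN SOA ns1.example.com. hostmaster.example.com. 2011112906 60 15 1800 10
;; Query time: 1 msec
"""

print(axfr_is_complete(DUMP))
```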


Summary


At this point I have a working authoritative server for the app.example.com zone.  To get it properly delegated I need to create a secondary server and configure zone transfers.  Then I can provide the nameserver NS and A records and a contact to my IT department and they can complete the delegation.

For now I'm going to skip delegation. The next post will describe the configuration of dynamic updates.

References


Monday, November 26, 2012

OpenShift Back End Services: The DNS - Concepts

Most of the time, DNS just works. I have heard some people (hardware lab techs mostly) talk about how they use IP addresses for their work because DNS is so unreliable.  Then I explain to them that (at least outside the lab) DNS is probably the most reliable and fundamental service on the modern internet.  Next to the TCP/IP and the routing protocols, DNS is the most critical service.  Without DNS the rest of the net doesn't matter because no one can find anything.  But very seldom does DNS actually fail on a large scale.

For a system like OpenShift, DNS is life's blood.  The whole purpose of a PaaS is to make applications available to users.  OpenShift does that by adding a new DNS record each time an application is created.  The name portion of the record is crafted from the developer's namespace and application name.  The value portion directs a user to the node host which offers the application service.  But before the browser can find the application, the DNS resolver must find the DNS record.

If you're going to run your own DNS it's important to understand how DNS services interact.

What's so hard about DNS?


Given the ubiquity of DNS and its critical function, I've been surprised at the amount of difficulty it has caused configuring it into OpenShift Origin.  I think part of the issue is that ordinarily it works so well that few sysadmins and developers have to work with it in any depth.  Most people's experiences with DNS consist of checking and setting the /etc/resolv.conf nameserver and search lists, and the occasional dig command to check if a zone is responding.

In most companies there's one or two of the IT folks who are "The DNS guys" (or girls?).  They manage the external DNS (which shouldn't change fast) and the internal DNS (which uses lots of DHCP for desktops, laptops, wireless).  They own the DNS domains and getting new IP name/address assignments from them has a well defined process.  Getting a delegated sub-domain is generally a more involved process.  The DNS Guys don't like to do it (because they get the calls when your DNS  breaks) so people who need DNS (like lab spaces) will make do with their own or go without.

A number of geeks like me have set up split DNS in their houses.  This requires DNS forwarding features, but not delegation.  There are even tools now like Dnsmasq which implement simple split DNS and combine DNS, DHCP and Dynamic DNS all in one nice relatively simple service. These are meant for small labs or home networks where they can control the entire DNS namespace.  They provide the hostname to IP address mapping of DNS and combine that with the host resolver configuration offered by DHCP. They only work at the bottom of the namespace hierarchy and they do not require delegation, as nothing in the local database is ever published outside that bottom layer zone. Again, this removes the need for the average system administrator to think much about what's happening behind the scenes.

And there's sometimes rather a lot going on behind the scenes.

The Domain Name Service Behind the Scenes


If you're familiar with DNS operations, you can skip this part.

If you've never managed a DNS hierarchy, you might think that to install a DNS service you just install the bind package, edit the configuration file, add some zone data and start the daemon. Done.  Right?  Not quite.  Unlike nearly all other typical database services, DNS requires the participation of other servers to work properly.

The DNS is a specialized distributed database with a hierarchical namespace.  Note something really important here.  I didn't say "Bind is..".  I said "The DNS is.." The DNS is ONE DATABASE. The data is distributed across the entire internet and the authority for portions is delegated to hundreds of thousands (or more) organizations and individuals, but if it weren't a single unified entity it wouldn't work.  The magic of the DNS is the way in which the namespace and data are "glued" together. To see how that works I'm going to walk through an example.

DNS Queries and the /etc/resolv.conf file


When I try to access a web site from my laptop, the first thing that happens is a DNS query to resolve the URL host name to the IP address of the destination host. My computer is going to send a DNS request to some server. A computer that answers DNS requests is called a nameserver.  I have to know the address of the nameserver.  If I only knew the name of the name server, I'd have to do a look-up query for that, and since I don't have the address of someone to ask I'd be stuck in a loop.

Fortunately, the pump is primed by the /etc/resolv.conf file.

The /etc/resolv.conf file contains a list of nameserver entries. This is a list of IP addresses.  Each of the addresses corresponds to a DNS nameserver host.

; generated by /usr/sbin/dhclient-script
search westford.example.com example.com
nameserver 192.168.4.2
nameserver 192.168.4.3
nameserver 172.30.41.12
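To show how little there is to this file, here is a Python sketch that reads it the way I read it: comment lines are skipped, each nameserver line adds one address in order, and the search line sets the suffix list. The function name is mine, and this is not how the real resolver library is implemented.

```python
def parse_resolv_conf(text: str):
    """Return (nameservers, search_domains) from resolv.conf style text."""
    nameservers = []
    search = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in ";#":
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "search":
            search = fields[1:]
    return nameservers, search


# The example file from this post.
RESOLV_CONF = """\
; generated by /usr/sbin/dhclient-script
search westford.example.com example.com
nameserver 192.168.4.2
nameserver 192.168.4.3
nameserver 172.30.41.12
"""

nameservers, search = parse_resolv_conf(RESOLV_CONF)
print(nameservers[0], search)
```

The first entry in the nameserver list is the one that gets the first query.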

When I make a request with a hostname, the resolver library on my laptop issues a query to the first nameserver in the list.  Say I wanted to visit openshift.example.com.  The resolver would send a query which asks essentially "tell me what you can about anything named 'openshift.example.com'"  You can simulate this with either the dig or host commands.

The dig and host commands


The dig and host commands are programs whose only purpose is to issue DNS queries and report the responses.  I tend to use host when all I need is the answer.  host has much simpler output and looks to me like it is designed for use in command-line scripts.  It responds with a single line and each field in the output is space separated.

host www.example.com
www.example.com has address 192.0.43.10
www.example.com has IPv6 address 2001:500:88:200::10

I use dig when I am verifying or diagnosing DNS operation.  By default dig prints a (mostly) human readable  report of the entire DNS response from the nameserver.  This includes not only the requested records but the authority records which indicate where the answer ultimately came from.  The format is not horribly human friendly or even string-parser-friendly, but once you learn to read it it is very concise and informative.

dig www.example.com

; <<>> DiG 9.9.2-RedHat-9.9.2-2.fc17 <<>> www.example.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30626
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.example.com.  IN A

;; ANSWER SECTION:
www.example.com. 172791 IN A 192.0.43.10

;; AUTHORITY SECTION:
example.com.  163951 IN NS a.iana-servers.net.
example.com.  163951 IN NS b.iana-servers.net.

;; ADDITIONAL SECTION:
b.iana-servers.net. 1044 IN A 199.43.133.53
b.iana-servers.net. 1044 IN AAAA 2001:500:8d::53
a.iana-servers.net. 1044 IN A 199.43.132.53
a.iana-servers.net. 1044 IN AAAA 2001:500:8c::53

;; Query time: 43 msec
;; SERVER: 10.11.255.156#53(10.11.255.156)
;; WHEN: Mon Nov 26 18:14:47 2012
;; MSG SIZE  rcvd: 196


In the examples that follow I'm mostly going to use dig though I may also trim them some to highlight the important parts.


A Name Lookup Example


This example is not at all contrived.  This kind of thing happens to me every day.. Really..  Sure it does.

Say I'm sitting at my desk, working on my laptop.  I'm a mid level sysadmin at Example.Com in the office in Boston, MA.   My desktop has a DNS name something like "llamadesk.boston.example.com".  My co-worker (the lucky SOB) is working from the beach outside the company office on Maui.  He posts a document file on a web server in the office there.  The web server hostname is "www.maui.example.com". I want to see the document so he sends me a URL for it and I dutifully paste it into my web browser address bar and hit the enter key.

My web browser is linked with the local resolver library (usually called libresolv). It has a function called getaddrinfo (used to be gethostbyname) which takes a hostname string as input and returns (among other things) an IP address associated with that name.  What happens between the call and the return is the interesting part.

The first thing the resolver library does is read /etc/resolv.conf. Then it crafts a query packet and sends it off to the first IP address in the nameserver list and waits for a response.
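
If you're curious what "crafts a query packet" means, here is a rough Python sketch of the wire format described in RFC 1035: a 12 byte header followed by one question.  The function name build_query and the fixed query ID are my own choices for illustration.  This is not what libresolv literally does internally (it also handles the search list, timeouts and retries), and the sketch only builds the bytes; it doesn't send them anywhere.

```python
import struct


def build_query(hostname, query_id=0x1234, qtype=1, qclass=1):
    """Build a minimal DNS query packet (qtype 1 = A, qclass 1 = IN)."""
    flags = 0x0100  # only the RD (recursion desired) bit is set
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)

    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        qname += struct.pack("!B", len(encoded)) + encoded
    qname += b"\x00"

    question = qname + struct.pack("!HH", qtype, qclass)
    return header + question


packet = build_query("www.maui.example.com")
print(len(packet), "bytes:", packet.hex())
```

To actually use it you would send the bytes in a UDP datagram to port 53 on the first nameserver from /etc/resolv.conf and wait for the reply.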

The nameserver is listening for query packets. It receives the packet and tries to find the best answer.

A nameserver, when looking at a query, can know one of three things:

  • I know the answer.
  • I don't know the answer, but I know the nameserver for a domain that contains the answer.
  • I don't know the answer or the domain, but I know where the root domain is.

A server which knows the answer is called the authoritative nameserver for the domain.  It will answer all queries for the contents of the domains it serves.

If the nameserver is not authoritative it then has a choice.  It can merely return a response which means "I don't know" or it can perform a recursive query.  Most of the nameservers which are at the edge of the DNS where desktops will be making queries will be configured to recurse.

So, my nearest (recursive) nameserver is in boston.example.com. It doesn't know about servers in maui.example.com. However, it does know that it's in example.com and it knows how to find the example.com nameservers.  These are servers in the NS records for example.com. So the nearby nameserver issues a query to one of the NS servers for example.com and asks for the NS records from the maui.example.com domain.  The reply will contain the names of the authoritative nameservers.  The nearby nameserver then requests the A records for the maui.example.com nameservers.  Now it knows someone who does know the answer.  It sends one final query to the maui nameserver for www.maui.example.com.  The maui nameserver returns the answer (or an error response) and the local nameserver returns the answer to my browser which can finally make a connection to the actual target host.

Did you follow all that?  See if this helps:

  1. llamadesk -> ns1.boston.example.com
    "tell me the IP address for www.maui.example.com"
  2. ns1.boston.example.com -> ns1.example.com
    "tell me who serves maui.example.com"
  3. ns1.boston.example.com -> ns1.example.com
    "tell me the IP address for ns1.maui.example.com"
  4. ns1.boston.example.com -> ns1.maui.example.com
    "tell me the IP address for www.maui.example.com"
  5. llamadesk <- ns1.boston.example.com
    "here's the IP address you asked for"
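
The same walk can be played out in a toy Python simulation.  Everything here is made up for illustration: the SERVERS table, the addresses and the function names are hypothetical, and no real DNS traffic is involved.  The toy also folds step 2 into a "referral" reply, which is a simplification of the NS query described above.

```python
# Toy data: what each nameserver "knows".  Names and addresses are invented.
SERVERS = {
    "ns1.example.com": {
        "answers": {"ns1.maui.example.com": "10.2.0.53"},
        "delegations": {"maui.example.com": "ns1.maui.example.com"},
    },
    "ns1.maui.example.com": {
        "answers": {"www.maui.example.com": "10.2.0.80"},
        "delegations": {},
    },
}


def query(server, name, log):
    """Ask one server about one name and record the question in log."""
    log.append((server, name))
    zone = SERVERS[server]
    if name in zone["answers"]:
        return "answer", zone["answers"][name]        # I know the answer
    for domain, ns_name in zone["delegations"].items():
        if name.endswith("." + domain):
            return "referral", ns_name                # I know who to ask
    return "unknown", None


def recurse_for_client(name, start_server, log, max_steps=10):
    """What the nearby recursive nameserver does on the client's behalf."""
    server = start_server
    for _ in range(max_steps):
        kind, value = query(server, name, log)
        if kind == "answer":
            return value
        if kind != "referral":
            return None
        # Find the address of the referred nameserver, then ask it directly.
        kind, _address = query(server, value, log)
        if kind != "answer":
            return None
        server = value
    return None


log = []
address = recurse_for_client("www.maui.example.com", "ns1.example.com", log)
for server, name in log:
    print(f"asked {server} about {name}")
print("answer:", address)
```

The three questions in the log line up with steps 2, 3 and 4 of the list.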

Glue Records: Binding the Internet Together

The links that make the DNS work are known as glue records. The process of establishing a link between one layer in the hierarchy and the next layer down is called delegation.

The nameserver at the example.com level has to know about all of the sub-domains below example.com.  It must have two types of records for each sub-domain.  It must have a set of NS records which contain the DNS name of the authoritative servers for the sub-domain.  Since NS records return the hostname of the nameservers, the parent must also provide an A record for each nameserver.
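
That rule is simple enough to check mechanically.  Here is a hedged Python sketch that takes a list of the parent zone's records and reports any delegated nameserver which has an NS record but no matching A record.  The record tuples, the addresses and the missing_glue name are all hypothetical; a real check would read the actual zone data.

```python
def missing_glue(parent_records, subdomain):
    """Return nameservers delegated for subdomain that have no A record."""
    ns_names = [
        data for (name, rtype, data) in parent_records
        if name == subdomain and rtype == "NS"
    ]
    names_with_a = {
        name for (name, rtype, data) in parent_records if rtype == "A"
    }
    return [ns for ns in ns_names if ns not in names_with_a]


# Invented records as they might appear in the example.com zone.
PARENT_RECORDS = [
    ("maui.example.com", "NS", "ns1.maui.example.com"),
    ("maui.example.com", "NS", "ns2.maui.example.com"),
    ("ns1.maui.example.com", "A", "10.2.0.53"),
]

print(missing_glue(PARENT_RECORDS, "maui.example.com"))
```

In this made-up data ns2 is delegated but has no address record, so it is the one reported.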

As noted in other places, the technical aspects of delegation are much less significant than the political or organizational aspects.  Delegation requires the establishment of a relationship of communication and of trust between groups that may normally be somewhat territorial. Once a sub-domain is delegated it is the responsibility of the receiving administrator to ensure that the domain is always available, so that the records it contains remain accessible, and to be able to accept and respond to problem reports.

Development and Test Environments: Rogue DNS

"Rogue DNS" is the term I use for an undelegated DNS zone.  Some people try to soften the term but I think "rogue" carries just the right connotations.  A rogue is an independent, slightly unsavory character who nonetheless is capable and possibly even attractive in a "bad boy" sort of way.  A rogue is not always a bad guy and sometimes it takes a rogue to save the day.

Every NAT network which includes split DNS would be considered "rogue" under this definition.  That's pretty much every home network and most commercial business networks today.  Rogues aren't all bad.

Rogue zones are also common in testing and small or personal development environments.  They don't require any negotiation. They're pretty much required for demos or livecd try-it-out implementations.

The establishment of a rogue zone is really easy: Just create the servers and start adding resource records to the zone. The problem is that without delegation, the rogue zone is invisible to everyone else. 

The real problem with rogue DNS is that every client that wants to participate in the zone must be manually re-configured to see the rogue.  The first nameserver in the client /etc/resolv.conf must be one of the rogue name servers.  In a typical NAT environment the owner of the DNS also owns the DHCP services. Since the DHCP server also provides dynamic nameserver information, all of the DHCP clients automatically participate in the DNS as well.  In a lab setting, the lab administrators may control the DHCP as well.

Rogues and DNS Forwarding


The other problem with a rogue DNS service is that the rogue, because it is not delegated, does not know about anything outside itself.  Bind does have a facility for "forwarding" requests.  This is commonly used in NAT environments.

When a forwarding server gets a query for which it is not authoritative, rather than trying to recurse, it will forward the request directly to one of a list of "upstream" servers.  These are usually the servers that would normally have been in the nameserver host's /etc/resolv.conf.

Dynamic DNS


The final concept to cover is Dynamic DNS.  As noted in the previous post, OpenShift depends on the ability to add and remove resource records from a DNS zone.

In most corporate DNS services the zone files are fairly static.  They are often mechanically generated from some other database on a regular basis.  It is common for DNS updates to take anywhere from 1 or 2 hours to as much as 24 hours.   OpenShift requires the updates to be applied instantly, and propagation times of more than a few seconds are considered unacceptable performance.

The one exception is DNS assigned from DHCP.  Microsoft Active Directory is especially good at this.  Dnsmasq is a combined DNS/DHCP/TFTP service designed for home and small business NAT networks.  When it assigns an IP address it can also bind the address to a hostname requested by the client.  It is also possible to connect ISC Bind and ISC DHCP to do Dynamic DNS.

OpenShift does Dynamic DNS through the DNS plugin.  I want to say "plugins", but right now there is only one DNS plugin.  I've written a few posts on writing a new DNS plugin, but that series still needs its last few installments, and the sample I picked will only be useful for labs.  Personally I think we need plugins for the greatest possible variety of external DNS services, from Microsoft Active Directory DNS to commercial services.

DNS update services will all have similar communications requirements: server and access information, the zone to update, and the new resource record content.
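
To show how those pieces fit together, here's a Python sketch that gathers them into one object and renders the command script that the nsupdate tool (shipped with ISC Bind) reads.  This is not the OpenShift plugin's code.  The DnsUpdate class, the host names, the address and the key file path are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class DnsUpdate:
    server: str     # nameserver that accepts updates
    key_file: str   # access information: path to a TSIG key file
    zone: str       # the zone to update
    name: str       # resource record content...
    ttl: int
    rtype: str
    data: str

    def command(self):
        """Argument list for running nsupdate with the key file."""
        return ["nsupdate", "-k", self.key_file]

    def script(self, delete=False):
        """Text to feed to nsupdate on standard input."""
        if delete:
            change = f"update delete {self.name} {self.rtype}"
        else:
            change = f"update add {self.name} {self.ttl} {self.rtype} {self.data}"
        lines = [f"server {self.server}", f"zone {self.zone}", change, "send"]
        return "\n".join(lines) + "\n"


# Hypothetical values; none of these come from a real installation.
update = DnsUpdate(
    server="ns1.example.com",
    key_file="/path/to/update.key",
    zone="apps.example.com",
    name="myapp.apps.example.com",
    ttl=60,
    rtype="A",
    data="10.1.2.3",
)
print(update.script())
```

Passing delete=True renders the matching removal, which covers the "add and remove" pair that OpenShift needs.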

Closing


I'm sure you'll agree I've lectured enough on the relevant capabilities and behaviors of DNS.  I mostly went through this exercise to be sure I hadn't missed anything myself.

You'll notice that I refer a lot to RFCs (Request for Comments).  These are the official specifications for the behavior of parts of the internet.  A lot of people find the idea of the RFCs intimidating. They're dense and bland.  They're also your friend. Don't be scared to go looking for information you need.  You don't have to read them like a good novel, but it's good to scan the relevant documents and at least know where to find answers.  I think a lot of people also skip the RFCs because people are looking for "how do I do it".  The RFCs only tell you "How does it work".  I think often the latter helps illuminate the former.

When I go looking for an RFC, I usually don't know the right one to look for.  Use the search engines: Google for your topic and add "RFC" to the beginning of the query and you'll very likely get a good reference.

Scan the RFCs. You'll be glad you did.

The next post will describe the creation of an authoritative DNS server using ISC Bind 9.  As I go along I mean to include not only the configuration steps but to demonstrate some tools and resources for checking the status of the service and for diagnosing any problems that might arise.

References

RFCs specifically significant to OpenShift:
  • RFC 1033 - Domain Administrators Operations Guide
  • RFC 1034 - Domain Names - Concepts and Facilities
  • RFC 1035 - Domain Names - Implementation and Specification
  • RFC 2136 - DNS Update
  • RFC 2845 - Secret Key Transaction Authentication for DNS (TSIG)