Introduction to the fractals application architecture
This tutorial works with a scalable cloud application that generates fractals - beautiful images made using only mathematics, like the following image.
This section introduces the application architecture and explains how it was designed to take advantage of cloud features in general and OpenStack in particular. It also explains some of the commands used in the previous section.
dotnet
Warning
This section has not yet been completed for the .NET SDK.
fog
Warning
This section has not yet been completed for the fog SDK.
jclouds
Warning
This section has not yet been completed for the jclouds SDK.
node
Warning
This section has not yet been completed for the pkgcloud SDK.
openstacksdk
Warning
This section has not yet been completed for the OpenStack SDK.
phpopencloud
Warning
This section has not yet been completed for the PHP-OpenCloud SDK.
Cloud application architecture principles
Cloud applications typically share several design principles. These principles influenced many Fractals application design decisions.
Modularity and micro-services
Micro-services are an important design pattern that helps achieve application modularity. Separating logical application functions into independent services simplifies maintenance and re-use. Decoupling components also makes it easier to selectively scale individual components, as required. Further, application modularity is a required feature of applications that scale out well and are fault tolerant.
Scalability
Cloud applications often use many small instances rather than a few large instances. Provided that an application is sufficiently modular, you can easily distribute micro-services across as many instances as required. This architecture enables an application to grow past the limit imposed by the maximum size of an instance. It's like trying to move a large number of people from one place to another: there are only so many people you can put on the largest bus, but you can use an unlimited number of buses or small cars, which provide just the capacity you need and no more.
Fault tolerance
In cloud programming, there's a well-known analogy called "cattle vs pets". If you haven't heard it before, it goes like this:
When you're dealing with pets, you name them and care for them and if they get sick, you nurse them back to health. Nursing pets back to health can be difficult and very time consuming. When you're dealing with cattle, you attach a numbered tag to their ear and if they get sick you put them down and move on.
That, as it happens, is the new reality of programming. Applications and systems used to be created on large, expensive servers, cared for by operations staff dedicated to keeping them healthy. If something went wrong with one of those servers, the staff's job was to do whatever it took to make it right again and save the server and the application.
In cloud programming, it's very different. Rather than large, expensive servers, you're dealing with virtual machines that are literally disposable; if something goes wrong with one, you shut it down and spin up a new one. There are still operations staff, but rather than nursing individual servers back to health, their job is to monitor the health of the overall system.
There are definite advantages to this architecture. It's easy to get a "new" server, without any of the issues that inevitably arise when a server has been up and running for months, or even years.
As with classical infrastructure, failures of the underpinning cloud infrastructure (hardware, networks, and software) are unavoidable. When you're designing for the cloud, it's crucial that your application is designed for an environment where failures can happen at any moment. This may sound like a liability, but it's not; by designing your application with a high degree of fault tolerance, you're also making it resilient in the face of change, and therefore more adaptable.
Fault tolerance is essential to the cloud-based application.
Automation
If an application is meant to automatically scale up and down to meet demand, it is not feasible to have any manual steps in the process of deploying any component of the application. Automation also decreases the time to recovery for your application in the event of component failures, increasing fault tolerance and resilience.
Programmatic interfaces (APIs)
Like many cloud applications, the Fractals application has a RESTful API. You can connect to it directly and generate fractals, or you can integrate it as a component of a larger application. Any time a standard interface such as an API is available, automated testing becomes much more feasible, increasing software quality.
Fractals application architecture
The Fractals application was designed with the principles of the previous subsection in mind. You'll note that in section1, we deployed the application in an all-in-one style, on a single virtual machine. This isn't good practice, but because the application uses micro-services to decouple logical application functions, we can change this easily.
images/architecture.dot
Message queues are used to facilitate communication between the Fractals application services. The Fractals application uses a so-called work queue (or task queue) to distribute tasks to the worker services.
Message queues work in a way similar to a queue (or a line, for those of us on the other side of the ocean) in a bank being served by multiple clerks. The message queue in our application provides a feed of work requests that can be taken one-at-a-time by worker services, whether there is a single worker service or hundreds of them.
This is a useful pattern for many cloud applications that have long lists of requests coming in and a pool of resources from which to service them. This also means that a worker may crash and the tasks will be processed by other workers.
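The work queue pattern described above can be sketched in a few lines. This is only an in-process illustration that uses Python's standard `queue` and `threading` modules as a stand-in for the message queue service; the function name `run_work_queue` and the task names are made up for this example and are not part of the Fractals application.

```python
import queue
import threading


def run_work_queue(tasks, worker_count):
    """Distribute tasks to workers through one shared queue.

    Returns a list of (worker_name, task) pairs showing who handled what.
    """
    work_queue = queue.Queue()
    for task in tasks:
        work_queue.put(task)

    results = []
    results_lock = threading.Lock()

    def worker(name):
        while True:
            try:
                # Each worker takes exactly one task at a time.
                task = work_queue.get_nowait()
            except queue.Empty:
                return  # Nothing left to do.
            with results_lock:
                results.append((name, task))
            work_queue.task_done()

    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(1, worker_count + 1)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


# Six hypothetical fractal requests shared by three workers.
results = run_work_queue(
    tasks=[f"fractal-{n}" for n in range(6)], worker_count=3
)
processed = sorted(task for _, task in results)
print(processed)
```

Which worker handles which task varies from run to run, but every task is handled exactly once. The same code works unchanged with one worker or with many, which is the property that lets the application scale out.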
Note
The RabbitMQ getting started tutorial provides a great introduction to message queues.
images/work_queue.dot
The worker service consumes messages from the work queue and then processes them to create the corresponding fractal image file.
There is also a web interface, which offers a more human-friendly way of accessing the API to view the created fractal images, and a simple command-line interface.
There are also multiple storage back ends (to store the generated fractal images) and a database component (to store the state of tasks), but we'll talk about those in /section4 and /section5 respectively.
How the Fractals application interacts with OpenStack
Description of the components of OpenStack and how they relate to the Fractals application and how it runs on the cloud. TF notes this is already covered in the guide, just split across each section. Adding it here forces the introduction of block storage, object storage, orchestration and neutron networking too early, which could seriously confuse users who don't have these services in their cloud. Therefore, this should not be done here.
The magic revisited
So what exactly was that request doing at the end of the previous section? Let's look at it again. (Note that in this subsection, we're just explaining what you've already done in the previous section; you don't need to execute these commands again.)
libcloud
../../samples/libcloud/section2.py
We explained image and flavor in section1, so in the following sections, we will explain the other parameters in detail, including ex_userdata (cloud-init) and ex_keyname (key pairs).
Introduction to cloud-init
cloud-init is a tool that performs instance configuration tasks during the boot of a cloud instance, and comes installed on most cloud images. ex_userdata, which was passed to create_node, is the configuration data passed to cloud-init.
In this case, we are presenting a shell script as the userdata. When create_node creates the instance, cloud-init executes the shell script in the userdata variable.
When an SSH public key is provided during instance creation, cloud-init installs this key on a user account. (The user name varies between cloud images.) See the Obtaining Images section of the image guide for guidance about which user name you should use when SSHing. If you still have problems logging in, ask your cloud provider to confirm the user name.
libcloud
../../samples/libcloud/section2.py
After the instance is created, cloud-init downloads and runs a script called install.sh. This script installs the Fractals application. Cloud-init can consume bash scripts and a number of different types of data. You can even provide multiple types of data.
You can find more information about cloud-init in the official documentation.
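As an illustration of providing multiple types of data, cloud-init accepts a MIME multi-part document in which each part carries its own content type. The following sketch builds such a document with Python's standard `email` package. The script body, the cloud-config body, the file names, and the helper name `build_multipart_userdata` are all invented for this example; the tutorial itself passes only a single shell script.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical example payloads, not taken from the Fractals samples.
SHELL_SCRIPT = """#!/usr/bin/env bash
echo "hello from cloud-init" > /tmp/hello.txt
"""

CLOUD_CONFIG = """#cloud-config
packages:
  - curl
"""


def build_multipart_userdata(parts):
    """Combine (content, subtype, filename) tuples into one MIME document."""
    message = MIMEMultipart()
    for content, subtype, filename in parts:
        # The subtype becomes the part's content type, e.g. text/x-shellscript.
        part = MIMEText(content, subtype)
        part.add_header("Content-Disposition", "attachment", filename=filename)
        message.attach(part)
    return message.as_string()


userdata = build_multipart_userdata([
    (SHELL_SCRIPT, "x-shellscript", "hello.sh"),
    (CLOUD_CONFIG, "cloud-config", "packages.cfg"),
])
print(userdata)
```

The resulting string could then be passed as ex_userdata in the same way as the plain shell script, assuming the image's cloud-init version supports multi-part input.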
Introduction to key pairs
Security is important when it comes to your instances; you can't have just anyone accessing them. To enable logging into an instance, you must provide the public key of an SSH key pair during instance creation. In section one, you created and uploaded a key pair to OpenStack, and cloud-init installed it for the user account.
Even with a key in place, however, you must have the appropriate security group rules in place to access your instance.
Introduction to security groups
Security groups are sets of network access rules that are applied to an instance's networking. By default, only egress (outbound) traffic is allowed. You must explicitly enable ingress (inbound) network access by creating a security group rule.
Warning
Removing the egress rule created by OpenStack will cause your instance networking to break.
Start by creating a security group for the all-in-one instance and adding the appropriate rules, such as HTTP (TCP port 80) and SSH (TCP port 22):
libcloud
../../samples/libcloud/section2.py
Note
ex_create_security_group_rule() takes ranges of ports as input. This is why ports 80 and 22 are passed twice.
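To see why a single port is written as a range, here is a small model of how a port-range rule is evaluated. It is a simplified stand-in written for this explanation: `IngressRule` and `is_allowed` are hypothetical names, not Libcloud classes, and real security groups also match on the source address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """Simplified stand-in for a security group rule (not a Libcloud class)."""

    ip_protocol: str
    from_port: int
    to_port: int

    def allows(self, protocol, port):
        # A rule matches when the protocol agrees and the port lies in range.
        return (
            protocol.upper() == self.ip_protocol.upper()
            and self.from_port <= port <= self.to_port
        )


rules = [
    IngressRule("TCP", 80, 80),  # HTTP: a range that contains only port 80
    IngressRule("TCP", 22, 22),  # SSH: a range that contains only port 22
]


def is_allowed(protocol, port):
    """Ingress is denied unless at least one rule allows it."""
    return any(rule.allows(protocol, port) for rule in rules)


print(is_allowed("tcp", 80), is_allowed("tcp", 443))
```

Passing the same number as both ends of the range opens exactly one port. Everything not covered by a rule, such as TCP port 443 here, stays closed.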
You can list available security groups with:
libcloud
../../samples/libcloud/section2.py
Once you've created a rule or group, you can also delete it:
libcloud
../../samples/libcloud/section2.py
To see which security groups apply to an instance, list them:
libcloud
../../samples/libcloud/section2.py
Once you've configured permissions, you'll need to know where to access the application.
Introduction to Floating IPs
As in traditional IT, cloud instances are accessed through IP addresses that OpenStack assigns. How this is actually done depends on the networking setup for your cloud. In some cases, you will simply get an Internet-routable IP address assigned directly to your instance.
The most common way for OpenStack clouds to allocate Internet-routable IP addresses to instances, however, is through the use of floating IPs. A floating IP is an address that exists as an entity unto itself, and can be associated with a specific instance network interface. When a floating IP address is associated with an instance network interface, OpenStack redirects traffic bound for that address to the address of the instance's internal network interface. Your cloud provider will generally offer pools of floating IPs for your use.
To use a floating IP, you must first allocate an IP to your project, then associate it with your instance's network interface.
Note
Allocating a floating IP address to an instance does not change the IP address of the instance. Instead, it causes OpenStack to establish the network translation rules that allow an additional IP address to reach the instance.
libcloud
../../samples/libcloud/section2.py
If you have no free floating IPs that have been previously allocated for your project, first select a floating IP pool offered by your provider. In this example, we have selected the first one and assume that it has available IP addresses.
libcloud
../../samples/libcloud/section2.py
Now request that an address from this pool be allocated to your project.
libcloud
../../samples/libcloud/section2.py
Now that you have an unused floating IP address allocated to your project, attach it to an instance.
libcloud
../../samples/libcloud/section2.py
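The steps above (reuse a free floating IP if one exists, otherwise allocate one from the first pool, then attach it) can be sketched as one flow. The method names follow the Libcloud OpenStack driver as we understand it, so check them against the Libcloud version you use. `FakeConnection` is a dry-run stand-in written only so that the sketch runs without a cloud; the node name and the addresses, taken from a documentation-only range, are invented.

```python
from types import SimpleNamespace


def find_or_allocate_floating_ip(conn):
    """Return an unassociated floating IP, allocating a new one if needed."""
    for floating_ip in conn.ex_list_floating_ips():
        if not floating_ip.node_id:  # Not attached to any instance yet.
            return floating_ip
    # Assumption: the first pool has addresses available.
    pool = conn.ex_list_floating_ip_pools()[0]
    return pool.create_floating_ip()


def attach_floating_ip(conn, node):
    """Attach a free floating IP to the node and return it."""
    floating_ip = find_or_allocate_floating_ip(conn)
    conn.ex_attach_floating_ip_to_node(node, floating_ip)
    return floating_ip


class FakeConnection:
    """Dry-run stand-in for a cloud connection; not part of Libcloud."""

    def __init__(self):
        self.attached = {}
        # One address already exists but is in use by another instance.
        self._ips = [
            SimpleNamespace(ip_address="203.0.113.10", node_id="other-node")
        ]

    def ex_list_floating_ips(self):
        return self._ips

    def ex_list_floating_ip_pools(self):
        def create_floating_ip():
            ip = SimpleNamespace(ip_address="203.0.113.11", node_id=None)
            self._ips.append(ip)
            return ip

        return [SimpleNamespace(name="public",
                                create_floating_ip=create_floating_ip)]

    def ex_attach_floating_ip_to_node(self, node, ip):
        ip.node_id = node.id
        self.attached[node.id] = ip.ip_address
        return True


conn = FakeConnection()
node = SimpleNamespace(id="app-node", name="all-in-one")
floating_ip = attach_floating_ip(conn, node)
print(floating_ip.ip_address)
```

Against a real cloud, you would pass the connection and instance objects created earlier in the tutorial instead of the stand-ins.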
That brings us to where we ended up at the end of /section1. But where do we go from here?
Splitting services across multiple instances
We've talked about separating functions into different micro-services, and how that enables us to make use of the cloud architecture. Now let's see that in action.
The rest of this tutorial won't reference the all-in-one instance you created in section one. Take a moment to delete this instance.
It's easy to split out services into multiple instances. We will create a controller instance called app-controller, which hosts the API, database, and messaging services. We'll also create a worker instance called app-worker-1, which just generates fractals.
The first step is to start the controller instance. The instance has the API service, the database, and the messaging service, as you can see from the parameters passed to the installation script.
Parameter | Description | Values |
---|---|---|
-i | Install a service | messaging (install RabbitMQ) and faafo (install the Faafo app). |
-r | Enable/start something | api (enable and start the API service), worker (enable and start the worker service), and demo (run the demo mode to request random fractals). |
libcloud
../../samples/libcloud/section2.py
Note that this time, when you create a security group, you're including a rule that applies only to instances that are part of the worker_group.
Next, start a second instance, which will be the worker instance:
libcloud
../../samples/libcloud/section2.py
Notice that you've added this instance to the worker_group, so it can access the controller.
As you can see from the parameters passed to the installation script, you are specifying that this is the worker instance, but you're also passing the address of the API instance and the message queue so the worker can pick up requests. The Fractals application installation script can take several parameters.
Parameter | Description | Example |
---|---|---|
-e | The endpoint URL of the API service. | http://localhost/ |
-m | The transport URL of the messaging service. | amqp://guest:guest@localhost:5672/ |
-d | The connection URL for the database (not used here). | sqlite:////tmp/sqlite.db |
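To make the parameters in the tables above concrete, the following sketch assembles userdata scripts for the controller and the worker. Several things here are assumptions rather than facts from this guide: the helper name `build_userdata`, the placeholder `INSTALL_SCRIPT_URL` (the real location of install.sh is not given here), the use of `curl` piped into `bash` to run the script, and the controller address `10.0.0.5`. Only the flags and their values come from the tables.

```python
import shlex

# Placeholder: this guide does not state where install.sh is hosted.
INSTALL_SCRIPT_URL = "https://example.com/install.sh"


def build_userdata(install=(), run=(), endpoint_url=None, transport_url=None):
    """Render a shell script that fetches install.sh and passes it flags."""
    args = []
    for service in install:
        args += ["-i", service]  # Install a service.
    for service in run:
        args += ["-r", service]  # Enable and start a service.
    if endpoint_url:
        args += ["-e", endpoint_url]  # Endpoint URL of the API service.
    if transport_url:
        args += ["-m", transport_url]  # Transport URL of the message queue.
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return (
        "#!/usr/bin/env bash\n"
        f"curl -L -s {shlex.quote(INSTALL_SCRIPT_URL)} | bash -s -- {quoted}\n"
    )


controller_ip = "10.0.0.5"  # Hypothetical address of app-controller.

# The controller hosts the messaging service and the API.
controller_userdata = build_userdata(install=["messaging", "faafo"],
                                     run=["api"])

# The worker only runs the worker service and points at the controller.
worker_userdata = build_userdata(
    install=["faafo"],
    run=["worker"],
    endpoint_url=f"http://{controller_ip}",
    transport_url=f"amqp://guest:guest@{controller_ip}:5672/",
)
print(worker_userdata)
```

Because the worker receives the controller's address through -e and -m, you can start additional workers with the same userdata and they will all pick up requests from the same queue.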
Now if you make a request for a new fractal, you connect to the controller instance, app-controller, but the work will actually be performed by a separate worker instance, app-worker-1.
Log in with SSH and use the Fractals app
Log in to the worker instance, app-worker-1, with SSH, using the previously added SSH key pair "demokey". Start by getting the IP address of the worker:
libcloud
../../samples/libcloud/section2.py
Now you can SSH into the instance:
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_WORKER_1
Note
Replace IP_WORKER_1 with the IP address of the worker instance and USERNAME with the appropriate user name.
Once you've logged in, check to see whether the worker service process is running as expected. You can find the logs of the worker service in the directory /var/log/supervisor/.
worker # ps ax | grep faafo-worker
17210 ? R 7:09 /usr/bin/python /usr/local/bin/faafo-worker
Open top to monitor the CPU usage of the faafo-worker process.
Now log into the controller instance, app-controller, also with SSH, using the previously added SSH key pair "demokey".
$ ssh -i ~/.ssh/id_rsa USERNAME@IP_CONTROLLER
Note
Replace IP_CONTROLLER with the IP address of the controller instance and USERNAME with the appropriate user name.
Check to see whether the API service process is running as expected. You can find the logs for the API service in the directory /var/log/supervisor/.
controller # ps ax | grep faafo-api
17209 ? Sl 0:19 /usr/bin/python /usr/local/bin/faafo-api
Now call the Fractals application's command line interface (faafo) to request a few new fractals. The following command requests a few fractals with random parameters:
controller # faafo --endpoint-url http://localhost --verbose create
2015-04-02 03:55:02.708 19029 INFO faafo.client [-] generating 6 task(s)
Watch top on the worker instance. Right after calling faafo, the faafo-worker process should start consuming a lot of CPU cycles.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17210 root 20 0 157216 39312 5716 R 98.8 3.9 12:02.15 faafo-worker
To show the details of a specific fractal, use the show subcommand of the Faafo CLI.
controller # faafo show 154c7b41-108e-4696-a059-1bde9bf03d0a
+------------+------------------------------------------------------------------+
| Parameter | Value |
+------------+------------------------------------------------------------------+
| uuid | 154c7b41-108e-4696-a059-1bde9bf03d0a |
| duration | 4.163147 seconds |
| dimensions | 649 x 869 pixels |
| iterations | 362 |
| xa | -1.77488588389 |
| xb | 3.08249829401 |
| ya | -1.31213919301 |
| yb | 1.95281690897 |
| size | 71585 bytes |
| checksum | 103c056f709b86f5487a24dd977d3ab88fe093791f4f6b6d1c8924d122031902 |
+------------+------------------------------------------------------------------+
There are more commands available; find out more details about them with faafo get --help, faafo list --help, and faafo delete --help.
Note
The application stores the generated fractal images directly in the database used by the API service instance. Storing image files in a database is not good practice. We're doing it here only as an example, because it is an easy way to allow multiple instances to have access to the data. For best practice, we recommend storing objects in Object Storage, which is covered in section4.
Next steps
You should now have a basic understanding of the architecture of cloud-based applications. In addition, you now have had practice starting new instances, automatically configuring them at boot, and even modularizing an application so that you may use multiple instances to run it. These are the basic steps for requesting and using compute resources in order to run your application on an OpenStack cloud.
From here, you should go to /section3 to learn how to scale the application further. Alternatively, you may jump to any of these sections:
/section4: to learn how to make your application more durable using Object Storage
/section5: to migrate the database to block storage, or use the database-as-a-service component
/section6: to automatically orchestrate the application
/section7: to learn about more complex networking
/section8: for advice for developers new to operations
Full example code
Here's every code snippet combined into a single file, in case you want to run it all at once, or you are so experienced that you don't need instruction ;) If you are going to use this, don't forget to set your authentication information and the flavor and image ID.
libcloud
../../samples/libcloud/section2.py