Please explain the answer in as much detail as possible. If a picture can help explain, please draw the picture as well.
1. Demonstrate Containerization knowledge and experience
2. Demonstrate usage of DockerHub
3. Demonstrate an understanding of the virtual hosts configuration in Apache2
4. Demonstrate the implementation of the https protocol in Apache2
Containerization knowledge and experience
There are many ways for operations teams to approach hardware management. Some build dedicated hardware for different applications within their stack. Some use virtualization to simplify the task or manage dozens of applications. Both of these approaches come with serious drawbacks. Dedicated hardware is costly, both for the hardware and the power needed to run it. What’s more, space is at a premium in many data centers. Adding a new server to an environment which is already squeezed for space is like the world’s worst Tetris game.
Likewise, virtualization comes along with some serious problems. Virtual machines come with a whole host of issues in terms of dedicated resource management. They simplify the task of working in ever-changing data centers, but can be difficult to scale. Being able to run four virtual machines on a single piece of dedicated hardware is a boon for space and energy-starved operations teams. But it still means that you’re devoting time and energy to maintaining redundant operating systems.
In the last five or so years, teams have been solving these problems by adopting containerization. If you’re curious about containerization and what it can do for your organization, read on. We’re going to dive into just how it works, and some best practices that’ll unlock the true power of containers for your ops organization.
What Is Containerization?
In a lot of ways, containerization is best thought of as the natural evolution of virtualization. Virtualization treats each virtual machine as its own logically (but not physically) distinct server. Containerization treats each application as its own logically distinct server. Multiple applications will share one underlying operating system. Those containers don’t know that any other containers are running on their dedicated hardware. If they want to communicate with another server, they need to communicate via a network interface, just like if they were on different physical devices.
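To make the networking point concrete, here is a minimal sketch using Docker (the network and container names demo-net and web are made up for illustration): two containers attached to the same user-defined network can only reach each other through that network interface, by name.

# create a user-defined network and attach two containers to it
docker network create demo-net
docker run -d --name web --network demo-net nginx
# the second container fetches the first container's page over the network,
# exactly as it would from a separate physical server
docker run --rm --network demo-net busybox wget -qO- http://web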
(Figure: layered file system)
The benefit is that you don’t need to devote hardware resources to redundant operations. Instead of needing to dedicate CPU cores and memory to an operating system for each virtual machine, you build a server with one underlying operating system, and only dedicate cores and memory to the single application that runs in each container.
How Does Containerization Work?
We’ve hinted at this, but application containers work by virtualizing the operating system – specifically the Linux kernel. Each application container thinks that it’s the only application running on a machine. Each container is defined as a single running application and a set of supporting libraries. One significant way that containers differ from virtual machines is that container images are immutable. That is, no action you take in a running container modifies the underlying image. This is on purpose—it means that every time you create a container from an image, it’ll be the same.
No matter where you create it, on what hardware or underlying operating system, the container will work exactly the same. That kind of consistency eliminates an entire class of bugs. If you’re used to virtual machine snapshots, it’s easy to think of a container in that way. Each time you start a container, it’s restored to the initial snapshot no matter what actions you took the last time it ran.
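A quick way to see this snapshot-like behavior, assuming Docker is installed, is to write a file in one container and then look for it in a fresh one:

# write a file inside a throwaway container
docker run --rm busybox sh -c 'echo hello > /tmp/state'
# a new container starts from the pristine image again, so the file is gone
docker run --rm busybox cat /tmp/state   # fails: No such file or directory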
That approach comes with some drawbacks, but it also packs a real punch for an ops team. It means that each container will run exactly the same, no matter where you set it up. A container running on a developer’s laptop will run exactly the same as that container in your data center. This consistency eliminates an entire class of development and deployment bugs. Your team never has to worry about some developer writing code against a different version of PostgreSQL or Java than your application uses in production. Your environment is identical everywhere it runs.
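For example, pinning exact versions in the image definition is what delivers that consistency. A minimal sketch (the base image tag and app.jar are illustrative, not from any particular project):

# every environment that builds this image gets the same pinned JDK
FROM eclipse-temurin:17-jdk
COPY app.jar /app/app.jar
CMD ["java", "-jar", "/app/app.jar"]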
usage of DockerHub
Docker Hub is a service provided by Docker for finding and sharing container images with your team. Its major features include repositories for pushing and pulling container images, teams and organizations for managing access to private repositories, official images curated by Docker, automated builds from source code repositories, and webhooks that trigger actions after a successful push.
Step 1: Sign up for Docker Hub
Start by creating an account.
Step 2: Create your first repository
To create a repo:
Sign in to Docker Hub.
Click on Create a Repository on the Docker Hub welcome page.
Name it <your-username>/my-first-repo and select Private.
You’ve created your first repo.
Step 3: Download and install Docker Desktop
We’ll need to download Docker Desktop to build and push a container image to Docker Hub.
Download and install Docker Desktop. If on Linux, download Docker Engine - Community.
Open the terminal and sign in to Docker Hub on your computer by running docker login.
Step 4: Build and push a container image to Docker Hub from your computer
Create a Dockerfile with a minimal image definition:
cat > Dockerfile <<EOF
FROM busybox
CMD echo "Hello world! This is my first Docker image."
EOF
Run docker build -t <your_username>/my-first-repo . to build your Docker image (the trailing dot tells Docker to use the current directory as the build context).
Test your Docker image locally by running docker run <your_username>/my-first-repo.
Run docker push <your_username>/my-first-repo to push your Docker image to Docker Hub.
You should see output in your terminal confirming that the image layers were pushed. In Docker Hub, your repository should now have a new latest tag available under Tags.
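As an optional check (the repository is private, so this requires being logged in with docker login), you can pull the image back and run it from any machine:

docker pull <your_username>/my-first-repo:latest
docker run <your_username>/my-first-repo:latest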
Congratulations! You’ve successfully created your first repository, built a Docker image, and pushed it to Docker Hub.
virtual hosts configuration in Apache2
What are Apache Virtual Hosts?
The term Virtual Host refers to the method of running more than one website, such as host1.domain.com and host2.domain.com, or www.domain1.com and www.domain2.com, on a single system. There are two types of virtual hosting in Apache, namely IP-based virtual hosting and name-based virtual hosting. With IP-based virtual hosting, you can host multiple websites or domains on the same system, but each website/domain has a different IP address. With name-based virtual hosting, you can host multiple websites/domains on the same IP address. Virtual hosting is useful if you want to host multiple websites and domains from a single physical server or VPS. Now that you have the basic idea of Apache virtual hosts, let us see how to configure them in Ubuntu 18.04 LTS.
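To illustrate the difference, here is a minimal sketch of both styles (the IP addresses, domain names and document roots are placeholders):

# IP-based virtual hosting: each site answers on its own address
<VirtualHost 192.168.225.10:80>
    ServerName www.domain1.com
    DocumentRoot /var/www/html/domain1
</VirtualHost>
<VirtualHost 192.168.225.11:80>
    ServerName www.domain2.com
    DocumentRoot /var/www/html/domain2
</VirtualHost>

# Name-based virtual hosting: both sites share one address and Apache
# picks the site by matching the Host: header of the request
<VirtualHost *:80>
    ServerName host1.domain.com
    DocumentRoot /var/www/html/host1
</VirtualHost>
<VirtualHost *:80>
    ServerName host2.domain.com
    DocumentRoot /var/www/html/host2
</VirtualHost>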
Configure Apache Virtual Hosts in Ubuntu 18.04 LTS
My test box IP address is 192.168.225.22 and host name is ubuntuserver.
First, we will see how to configure name-based virtual hosts in Apache webserver.
Configure name-based virtual hosts
1. Install Apache webserver
Make sure you have installed Apache webserver. To install it on Ubuntu, run:
$ sudo apt-get install apache2
Once Apache is installed, test whether it is working by browsing to the Apache test page.
Open your web browser and point it to http://IP_Address or http://localhost. You should see the default Apache2 Ubuntu test page.
Good! Apache webserver is up and working!!
2. Create web directory for each host
I am going to create two virtual hosts, namely ostechnix1.lan and ostechnix2.lan.
Let us create a directory for first virtual host ostechnix1.lan. This directory is required for storing the data of our virtual hosts.
To do so, enter:
$ sudo mkdir -p /var/www/html/ostechnix1.lan/public_html
Likewise, create a directory for second virtual host ostechnix2.lan as shown below.
$ sudo mkdir -p /var/www/html/ostechnix2.lan/public_html
The above two directories are owned by root user. We need to change the ownership to the regular user.
To do so, run:
$ sudo chown -R $USER:$USER /var/www/html/ostechnix1.lan/public_html
$ sudo chown -R $USER:$USER /var/www/html/ostechnix2.lan/public_html
Here, $USER refers to the currently logged-in user.
Next, set read and execute permissions on the Apache root directory, i.e. /var/www/html/, using the command:
$ sudo chmod -R 755 /var/www/html/
We do this because we already created a separate directory for each virtual host to store its data. So we made the Apache root directory read-only for all users except the owner.
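As a quick sanity check (the exact listing will vary with your username), you can verify the ownership and permissions:

$ ls -ld /var/www/html/ostechnix1.lan/public_html

It should report permissions of drwxr-xr-x with your user as the owner.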
We have created the required directories for storing each virtual host's data and set up the proper permissions. Now it is time to create some sample pages that will be served from each virtual host.
3. Create demo web pages for each host
Let us create a sample page for ostechnix1.lan site. To do so, run:
$ sudo vi /var/www/html/ostechnix1.lan/public_html/index.html
Add the following lines in it:
<html>
  <head>
    <title>www.ostechnix1.lan</title>
  </head>
  <body>
    <h1>Hello, This is a test page for ostechnix1.lan website</h1>
  </body>
</html>
Save and close the file.
Likewise, create a sample page for ostechnix2.lan site:
$ sudo vi /var/www/html/ostechnix2.lan/public_html/index.html
Add the following lines in it:
<html>
  <head>
    <title>www.ostechnix2.lan</title>
  </head>
  <body>
    <h1>Hello, This is a test page for ostechnix2.lan website</h1>
  </body>
</html>
Save and close the file.
4. Create configuration file for each host
Next, we need to create configuration files for each virtual host. First, let us do this for ostechnix1.lan site.
Copy the default virtual host file called 000-default.conf contents to the new virtual host files like below.
$ sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/ostechnix1.lan.conf
$ sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/ostechnix2.lan.conf
Please be mindful that you must save all configuration files with the .conf extension; otherwise, Apache will not load them.
Now, modify the configuration files to match with our virtual hosts.
Edit the ostechnix1.lan.conf file:
$ sudo vi /etc/apache2/sites-available/ostechnix1.lan.conf
Edit/modify the ServerAdmin, ServerName, ServerAlias and DocumentRoot values to match this virtual host.
<VirtualHost *:80>
    # The ServerName directive sets the request scheme, hostname and port that
    # the server uses to identify itself. This is used when creating
    # redirection URLs. In the context of virtual hosts, the ServerName
    # specifies what hostname must appear in the request's Host: header to
    # match this virtual host. For the default virtual host (this file) this
    # value is not decisive as it is used as a last resort host regardless.
    # However, you must set it for any further virtual host explicitly.
    #ServerName www.example.com

    ServerAdmin [email protected]
    ServerName ostechnix1.lan
    ServerAlias www.ostechnix1.lan
    DocumentRoot /var/www/html/ostechnix1.lan/public_html

    # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
    # error, crit, alert, emerg.
    # It is also possible to configure the loglevel for particular
    # modules, e.g.
    #LogLevel info ssl:warn

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined

    # For most configuration files from conf-available/, which are
    # enabled or disabled at a global level, it is possible to
    # include a line for only one particular virtual host. For example the
    # following line enables the CGI configuration for this host only
    # after it has been globally disabled with "a2disconf".
    #Include conf-available/serve-cgi-bin.conf
</VirtualHost>
Save and close the file.
Next, edit ostechnix2.lan.conf file:
$ sudo vi /etc/apache2/sites-available/ostechnix2.lan.conf
Make the necessary changes.
<VirtualHost *:80>
    # The ServerName directive sets the request scheme, hostname and port that
    # the server uses to identify itself. This is used when creating
    # redirection URLs. In the context of virtual hosts, the ServerName
    # specifies what hostname must appear in the request's Host: header to
    # match this virtual host. For the default virtual host (this file) this
    # value is not decisive as it is used as a last resort host regardless.
    # However, you must set it for any further virtual host explicitly.
    #ServerName www.example.com

    ServerAdmin [email protected]
    ServerName ostechnix2.lan
    ServerAlias www.ostechnix2.lan
    DocumentRoot /var/www/html/ostechnix2.lan/public_html

    # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
    # error, crit, alert, emerg.
    # It is also possible to configure the loglevel for particular
    # modules, e.g.
    #LogLevel info ssl:warn

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined

    # For most configuration files from conf-available/, which are
    # enabled or disabled at a global level, it is possible to
    # include a line for only one particular virtual host. For example the
    # following line enables the CGI configuration for this host only
    # after it has been globally disabled with "a2disconf".
    #Include conf-available/serve-cgi-bin.conf
</VirtualHost>
Save/close the file.
5. Enable virtual host configuration files
After making the necessary changes, disable the default virtual host config file, i.e. 000-default.conf, and enable all of the newly created virtual host config files as shown below.
$ sudo a2dissite 000-default.conf
$ sudo a2ensite ostechnix1.lan.conf
$ sudo a2ensite ostechnix2.lan.conf
Restart the Apache web server for the changes to take effect.
$ sudo systemctl restart apache2
That’s it. We have successfully configured virtual hosts in Apache. Let us go ahead and check whether they are working or not.
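Before testing in a browser, it is worth letting Apache validate the setup. apache2ctl configtest checks the syntax of all configuration files, and apache2ctl -S prints the virtual hosts Apache actually parsed:

$ sudo apache2ctl configtest
$ sudo apache2ctl -S

configtest should report Syntax OK, and the -S output should list both ostechnix1.lan and ostechnix2.lan on port 80.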
6. Test Virtual hosts
Open /etc/hosts file in any editor:
$ sudo vi /etc/hosts
Add all your virtual websites/domains one by one like below.
[...]
192.168.225.22 ostechnix1.lan
192.168.225.22 ostechnix2.lan
[...]
Please note that if you want to access the virtual hosts from any remote systems, you must add the above lines in each remote system’s /etc/hosts file.
Save and close the file.
Open up your web browser and point it to http://ostechnix1.lan or http://ostechnix2.lan.
You should see the test pages you created for ostechnix1.lan and ostechnix2.lan.
Congratulations! You can now access all of your websites. From now on, you can upload content and serve it from the different websites.
As you noticed, we have used the same IP address (i.e. 192.168.225.22) to host two different websites (http://ostechnix1.lan and http://ostechnix2.lan). This is what we call name-based virtual hosting; IP-based virtual hosting works the same way, except that each <VirtualHost> is bound to its own IP address.
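You can also confirm name-based selection from the command line without touching /etc/hosts, by setting the Host header explicitly with curl (assuming curl is installed):

$ curl -H "Host: ostechnix1.lan" http://192.168.225.22/
$ curl -H "Host: ostechnix2.lan" http://192.168.225.22/

Each command should return the corresponding test page, proving that Apache chooses the virtual host from the Host: header rather than from the IP address.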
https protocol in Apache2
The HTTP/2 protocol
HTTP/2 is the evolution of the world's most successful application layer protocol, HTTP. It focuses on making more efficient use of network resources. It does not change the fundamentals of HTTP, the semantics: there are still requests and responses and headers and all that. So, if you already know HTTP/1, you also know 95% of HTTP/2.
There has been a lot written about HTTP/2 and how it works. The most normative source is, of course, its RFC 7540 (also available in more readable formatting, YMMV). That is where you'll find the nuts and bolts.
But, as with any RFC, it's not really a good thing to read first. It's better to first understand what a thing wants to do and then read the RFC about how it is done. A much better document to start with is http2 explained by Daniel Stenberg, the author of curl. It is available in an ever-growing list of languages, too!
Too Long, Didn't Read: the main new terms to keep in mind while reading this document are that HTTP/2 multiplexes many requests over a single connection, and that it comes in two variants, h2 (over TLS) and h2c (over cleartext), both covered below.
HTTP/2 in Apache httpd
The HTTP/2 protocol is implemented by its own httpd module, aptly named mod_http2. It implements the complete set of features described by RFC 7540 and supports HTTP/2 over cleartext (http:), as well as secure (https:) connections. The cleartext variant is named 'h2c', the secure one 'h2'. For h2c it allows the direct mode and the Upgrade: handshake via an initial HTTP/1 request.
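As a sketch of how these variants look from a client's point of view (the hostname is a placeholder, and a curl built with HTTP/2 support is assumed):

# 'h2': HTTP/2 negotiated via ALPN on a TLS connection
$ curl -v --http2 https://test.example.org/

# 'h2c' via the Upgrade: mechanism on an initial HTTP/1 request
$ curl -v --http2 http://test.example.org/

# 'h2c' direct mode, skipping the HTTP/1 upgrade entirely
$ curl -v --http2-prior-knowledge http://test.example.org/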
One feature of HTTP/2 that offers new capabilities for web developers is Server Push. See that section on how your web application can make use of it.
Build httpd with HTTP/2 support
mod_http2 uses the nghttp2 library as its implementation base. In order to build mod_http2 you need at least version 1.2.1 of libnghttp2 installed on your system.
When you ./configure your Apache httpd source tree, you need to give it '--enable-http2' as an additional argument to trigger the build of the module. Should your libnghttp2 reside in an unusual place (whatever that is on your operating system), you may announce its location with '--with-nghttp2=<path>' to configure.
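Putting that together, a typical build might look like this (the nghttp2 install path is just an example):

$ ./configure --enable-http2
# or, if libnghttp2 lives in a non-standard prefix:
$ ./configure --enable-http2 --with-nghttp2=/opt/nghttp2
$ make && sudo make install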
While that should do the trick for most, there are people who might prefer a statically linked nghttp2 in this module. For those, the option --enable-nghttp2-staticlib-deps exists. It works quite similarly to how one statically links OpenSSL to mod_ssl.
Speaking of SSL, you need to be aware that most browsers will speak HTTP/2 only on https: URLs, so you need a server with SSL support. And not only that: you will need an SSL library that supports the ALPN extension. If OpenSSL is the library you use, you need at least version 1.0.2.
Basic Configuration
When you have an httpd built with mod_http2, the module needs some basic configuration to become active. The first thing, as with every Apache module, is that you need to load it:

LoadModule http2_module modules/mod_http2.so

The second directive you need to add to your server configuration is:

Protocols h2 http/1.1
This allows h2, the secure variant, to be the preferred protocol on your server connections. When you want to enable all HTTP/2 variants, you simply write:
Protocols h2 h2c http/1.1
Depending on where you put this directive, it affects all connections or just the ones to a certain virtual host. You can nest it, as in:
Protocols http/1.1
<VirtualHost ...>
    ServerName test.example.org
    Protocols h2 http/1.1
</VirtualHost>

This allows only HTTP/1 on all connections, except SSL connections to test.example.org, which offer HTTP/2.
Choose a strong SSLCipherSuite
SSLCipherSuite needs to be configured with a strong TLS cipher suite. The current version of mod_http2 does not enforce any cipher, but most clients do. Pointing a browser to an h2-enabled server with an inappropriate cipher suite will force it to simply refuse and fall back to HTTP/1.1. This is a common mistake made when configuring httpd for HTTP/2 the first time, so keep it in mind to avoid long debugging sessions! If you want to be sure about which cipher suites to choose, avoid the ones listed in the HTTP/2 TLS reject list.
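As one possible starting point (a sketch, not an authoritative recommendation; the cipher names below are AEAD suites that are generally acceptable for HTTP/2):

SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
SSLHonorCipherOrder on
SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1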
The order of protocols mentioned is also relevant. By default, the first one is the most preferred protocol. When a client offers multiple choices, the one most to the left is selected. In
Protocols http/1.1 h2
the most preferred protocol is HTTP/1 and it will always be selected unless a client only supports h2. Since we want to talk HTTP/2 to clients that support it, the better order is
Protocols h2 h2c http/1.1
There is one more thing to ordering: the client has its own preferences, too. If you want, you can configure your server to select the protocol most preferred by the client:
ProtocolsHonorOrder Off
makes the order in which you wrote the Protocols irrelevant; only the client's ordering will decide.
A last thing: the protocols you configure are not checked for correctness or spelling. You can mention protocols that do not exist, so there is no need to guard Protocols with any <IfModule> checks.
For more advanced tips on configuration, see the modules section about dimensioning and how to manage multiple hosts with the same certificate.
MPM Configuration
HTTP/2 is supported in all multi-processing modules that come with httpd. However, if you use the prefork MPM, there will be severe restrictions.
In prefork, mod_http2 will only process one request at a time per connection. But clients, such as browsers, send many requests at the same time. If one of these takes a long time to process (or is a long-polling one), the other requests will stall.
mod_http2 will not work around this limit by default. The reason is that prefork is nowadays only chosen if you run processing engines that are not prepared for multi-threading, e.g. that will crash with more than one request.
If your setup can handle it, the event MPM is nowadays the best choice (if supported on your platform).
If you are really stuck with prefork and want multiple requests, you can tweak H2MinWorkers to make that possible, as in the sketch below. If it breaks, however, you own both parts.
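For example, a sketch of such a tweak (the numbers are arbitrary and must be sized for your workload):

# allow more parallel HTTP/2 worker threads per child process
H2MinWorkers 16
H2MaxWorkers 32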
Clients
Almost all modern browsers support HTTP/2, but only over SSL connections: Firefox (since v43), Chrome (since v45), Safari (since v9), iOS Safari (since v9), Opera (since v35), Chrome for Android (since v49) and Internet Explorer (v11 on Windows 10).
Other clients, as well as servers, are listed on the Implementations wiki, among them implementations for c, c++, common lisp, dart, erlang, haskell, java, nodejs, php, python, perl, ruby, rust, scala and swift.
Several of the non-browser client implementations support HTTP/2 over cleartext, h2c; the most versatile among them is curl.
Useful tools to debug HTTP/2
The first tool to mention is of course curl. Please make sure that your version supports HTTP/2 by checking its Features list:

$ curl -V
curl 7.45.0 (x86_64-apple-darwin15.0.0) libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8 nghttp2/1.3.4
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 [...]
Features: IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP HTTP2
Mac OS homebrew notes
brew install curl --with-openssl --with-nghttp2
And for really deep inspection, there is wireshark.
The nghttp2 package also includes clients, such as nghttp (a command-line HTTP/2 client), nghttpd (a test server) and h2load (a benchmarking tool).
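For instance, assuming the nghttp2 tools are installed (the hostname is a placeholder):

# fetch a page with a verbose HTTP/2 frame log
$ nghttp -v https://test.example.org/
# simple HTTP/2 load test: 100 requests over 10 connections
$ h2load -n 100 -c 10 https://test.example.org/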
Chrome offers detailed HTTP/2 logs on its connections via the special net-internals page. There is also an interesting extension for Chrome and Firefox to visualize when your browser is using HTTP/2.
Server Push
The HTTP/2 protocol allows the server to PUSH responses to a client that it never asked for. The tone of the conversation is: "here is a request that you never sent and the response to it will arrive soon..."
But there are restrictions: the client can disable this feature and the server may only ever PUSH on a request that came from the client.
The intention is to allow the server to send resources to the client that it will most likely need: a css or javascript resource that belongs to an html page the client requested, a set of images referenced by a css file, etc.
The advantage for the client is that it saves the time to send the request, which may range from a few milliseconds to half a second, depending on where on the globe both are located. The disadvantage is that the client may get sent things it already has in its cache. Sure, HTTP/2 allows for the early cancellation of such requests, but resources are still wasted.
To summarize: there is no one good strategy on how to make best use of this feature of HTTP/2 and everyone is still experimenting. So, how do you experiment with it in Apache httpd?
mod_http2 inspects response headers for Link headers in a certain format:
Link </xxx.css>;rel=preload, </xxx.js>; rel=preload
If the connection supports PUSH, these two resources will be sent to the client. As a web developer, you may set these headers either directly in your application response or you configure the server via
<Location /xxx.html>
    Header add Link "</xxx.css>;rel=preload"
    Header add Link "</xxx.js>;rel=preload"
</Location>
If you want to use preload links without triggering a PUSH, you can use the nopush parameter, as in

Link </xxx.css>;rel=preload;nopush
or you may disable PUSHes for your server entirely with the directive
H2Push Off
And there is more:
The module will keep a diary of what has been PUSHed for each connection (hashes of URLs, basically) and will not PUSH the same resource twice. When the connection closes, this information is discarded.
There are people thinking about how a client can tell a server what it already has, so PUSHes for those things can be avoided, but this is all highly experimental right now.
Another experimental draft that has been implemented in mod_http2 is the Accept-Push-Policy header field, with which a client can, for each request, define what kind of PUSHes it accepts.
PUSH might not always bring the request/response performance that one expects or hopes for. There are various studies on this topic to be found on the web that explain benefits and weaknesses and how different features of client and network influence the outcome. For example: just because the server PUSHes a resource does not mean a browser will actually use the data.
The major thing that influences the response being PUSHed is the request that was simulated. The request URL for a PUSH is given by the application, but where do the request headers come from? For example, will the PUSH request carry an accept-language header and, if yes, with what value?
Apache will look at the original request (the one that triggered the PUSH) and copy the following headers over to PUSH requests: user-agent, accept, accept-encoding, accept-language and cache-control.
All other headers are ignored. Cookies will also not be copied over, so PUSHing resources that require a cookie to be present will not work. This can be a matter of debate. But unless this is more clearly discussed with browser vendors, let's err on the side of caution and not expose cookies where they might ordinarily not be visible.
Early Hints
An alternative to PUSHing resources is to send Link headers to the client before the response is even ready. This uses the HTTP feature called "Early Hints", described in RFC 8297.
In order to use this, you need to explicitly enable it on the server via
H2EarlyHints on
(It is not enabled by default since some older browsers tripped over such responses.)
If this feature is on, you can use the directive H2PushResource to trigger early hints and resource PUSHes:
<Location /xxx.html>
    H2PushResource /xxx.css
    H2PushResource /xxx.js
</Location>
This will send out a "103 Early Hints" response to the client as soon as the server starts processing the request. This may be much earlier than the moment the first response headers are determined, depending on your web application.
If H2Push is enabled, this will also start the PUSH right after the 103 response. If H2Push is disabled, however, the 103 response will nevertheless be sent to the client.
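To check the behavior, you can watch the verbose output of an HTTP/2-capable curl for an interim 103 response carrying the Link headers (hostname and path are placeholders from the examples above):

$ curl -v --http2 https://test.example.org/xxx.html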