= SARA in the CLOUD =

<i>by Pjotr Prins</i>, Wageningen University, Dept. of Nematology

The super computing center SARA in Amsterdam has a pilot in CLOUD computing in which I participate as a tester. Here I quickly document my experiences over November 2009.

The setup consists of five dual quad-core machines - i.e. 40 cores (in addition to the 24 Nematology/Bioinformatics cores in my 'CLOUD') available for some hard hitting on top of Ubuntu with KVM. SARA supplies a few images and documentation.

== SSH ==

The login to the control server (named ui) is via:

 ssh -XC user@ui.grid.sara.nl
 ssh -XC user@ui.claudia.sara.nl

It is easiest to provide a tunnel, say

 ssh -L 20201:ui.claudia.sara.nl:22 -f -N user@ui.grid.sara.nl

And now you can ssh and scp etc. directly through

 ssh -p 20201 user@localhost 
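
For example, copying a file through the same tunnel works as well (note that scp uses a capital -P for the port; somefile is just a placeholder):

 scp -P 20201 somefile user@localhost: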

It is easiest to set up a key too. On your local machine:

 ssh-keygen -t dsa

and add the content of ~/.ssh/id_dsa.pub to the ~/.ssh/authorized_keys file on UI (and later on any images). Using ssh keys allows you to log in without typing passwords every time.
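
One way to add the key through the tunnel set up earlier is a one-liner like this (a sketch; adjust the user name and port to your own setup):

 cat ~/.ssh/id_dsa.pub | ssh -p 20201 user@localhost \
   'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'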

== UI commands ==

A number of commands are listed on the CLAUDIA wiki mentioned above. Additionally:

 for x in 0 1 2 3 4 5 ;  do echo -n $x :; onehost show $x | grep USEDCPU ; done
 0 :USEDCPU=0.799999999999955
 1 :USEDCPU=0.799999999999955
 2 :USEDCPU=0.799999999999955
 3 :USEDCPU=0.799999999999955
 4 :USEDCPU=3.20000000000005
 5 :USEDCPU=1.60000000000002
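
To keep an eye on utilisation over time you can wrap the same loop in watch, refreshing every minute:

 watch -n 60 'for x in 0 1 2 3 4 5; do echo -n $x :; onehost show $x | grep USEDCPU; done'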

You can see the CLOUD is not busy. Currently there are two types of images running:

 user@ui:~$ onevm list
   ID      NAME STAT CPU     MEM        HOSTNAME        TIME
  193  vm_cloud runn   0 4194304          host05 05 21:02:00
  194  ubuntu-9 runn   0 2097152          host01 02 23:07:56
  195  ubuntu-9 runn   0 2097152          host02 02 02:38:10
  200    debian runn   0 2097152          host03 01 19:36:52
  201    debian runn   0 2097152          host06 01 19:14:56
  202  ubuntu-9 runn   0 2097152          host04 01 17:40:53
  203  ubuntu-9 runn   0 2097152          host06 01 17:40:52
  204  ubuntu-9 runn   0 2097152          host05 01 17:40:50

== Starting an image ==

Current images are:

 user@ui:~$ ls /data/images/
 debian-503-i386-CD-1.iso  scilin5.3-10G.img  ubuntu-9.10-10G.img
 debian-lenny-10G.img      ttylinux.img       ubuntu-9.10-server-i386.iso

These are 32-bit images. A 64-bit image I will have to make myself - I will do that a bit later. First and foremost I am a Debian guy, so the first bit is easy. Copy the image so you use your own copy - you don't want to update a shared image:

 cp /data/images/debian-lenny-10G.img ~

copy and edit the template to match:

 cp /data/templates/debian-lenny-10G.template ~
 vi ~/debian-lenny-10G.template 

First I rename my image

 cp debian-lenny-10G.img debian-lenny-default-32.img
 cp debian-lenny-10G.template debian-lenny-default-32.template

I am greedy (needless to say) so I grab all CPUs (8) and a little more RAM. The diff is

 < CPU    = 1
 < MEMORY = 2048
 ---
 > CPU    = 8
 > MEMORY = 4096
 <   source   = "/data/images/debian-lenny-10G.img",
 ---
 >   source   = "/home/cloud10/debian-lenny-default32.img",
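
For reference, a sketch of how the whole edited template might end up looking - the DISK/NIC layout below is an assumption based on standard OpenNebula templates, not a copy of SARA's template, so treat it as illustration only:

 NAME   = debian
 CPU    = 8
 MEMORY = 4096
 # source must match the name of the copied image exactly
 DISK   = [
   source   = "/home/cloud10/debian-lenny-default-32.img",
   target   = "hda",
   readonly = "no" ]
 NIC    = [ network = "public" ]  # network name is a guess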

Then create the VM and check the list:

 onevm create debian-lenny-default-32.template
 onevm list
   205  cloud10   debian fail   0       0          host03 00 00:00:09

not good. I had used the wrong name for source. The log can be found in

 cat /var/log/one/205.log
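
To spot the reason quickly you can grep the log, for example:

 grep -i error /var/log/one/205.log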

Try again and it reads:

  206  cloud10   debian runn   0       0          host03 00 00:00:42

and, yes! it is running on host03. If it is in a pending state, kick it to a node with

 onevm deploy 206 host03

Now that you know the ID you can get extra info with

 onevm show 206

which gives the IP etc. It may take a while to boot, but then

 ssh cloud@145.100.5.248
 Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
 permitted by applicable law.
 Last login: Tue Nov  3 06:17:28 2009
 cloud@debian:~$ 
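
A few quick checks inside the VM (these are the commands behind the observations below):

 free -m                           # total RAM
 df -h /                           # disk space on the root file system
 grep -c processor /proc/cpuinfo   # number of CPUs the VM actually uses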

That looks good. 'top' shows 4 GB of RAM, 'df' shows I have 8.5 GB of disk space. The only problem is that I have just one CPU, as cat /proc/cpuinfo shows! So delete the previous VM

 onevm delete 206

and recreate it, now with 4 CPUs. For this I read some of the OpenNebula documentation, as the number of CPUs did not seem to matter in the configuration. The configuration of onevm (part of the OpenNebula project, see link below) is in /etc/one/oned.conf, which shows that in particular vmm_kvm/vmm_kvm.conf and im_kvm/im_kvm.conf are relevant. The first contains the maximum number of (virtual) CPUs used. This has not been set, so maybe it defaults to one? Inside the Debian image:

 cloud@debian:~$ dmesg|grep -i cpu
 [    0.000000] Initializing cgroup subsys cpuset
 [    0.000000] Initializing cgroup subsys cpu
 [    0.000000] kvm-clock: cpu 0, msr 0:3baf81, boot clock
 [    0.000000] SMP: Allowing 8 CPUs, 7 hotplug CPUs
 [    0.000000] PERCPU: Allocating 37960 bytes of per cpu data
 [    0.000000] NR_CPUS: 8, nr_cpu_ids: 8

Ah! The image sees 8 CPUs, but only uses one. This is CPU hotplugging, a feature of the Linux kernel. You can switch them on as described at

 http://www.cyberciti.biz/faq/debian-rhel-centos-redhat-suse-hotplug-cpu/
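
In short, that boils down to bringing the extra CPUs online through sysfs, roughly (run as root inside the VM):

 for c in /sys/devices/system/cpu/cpu[1-7]/online; do echo 1 > $c; done
 grep -c processor /proc/cpuinfo   # should now report 8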

My help desk Floris Sluiter, of SARA, suggests:

 To enable multicore machines you can add "vCPU = n" to your template:
 NAME   = debian
 CPU    = 2 # NOTE:  CPU and vCPU should be the same!
 vCPU   = 2
 MEMORY = 2048
 NOTE: the number of cores after CPU and vCPU should be the same
 (otherwise you risk interfering with other users). This is because
 CPU is used by the scheduler and vCPU is used by the image.

Restarting the VM takes a while, so I tail it with

 onevm list
 tail -f /var/log/one/218.log 

and yes, now I have all my CPUs listed in 'top'!

== Securing the image ==

Time to start securing my image. Start by adding a user and setting safe passwords:

 sudo bash
 adduser yourname
 passwd root (something very safe)
 passwd cloud (something safe)

next I edit /etc/ssh/sshd_config to disable root logins by adding

 PermitRootLogin no

followed by

 /etc/init.d/ssh restart 

change the sudo entry from cloud to yourname with

 visudo
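
The relevant line inside visudo would look something like this (replacing the existing cloud entry; yourname is a placeholder):

 yourname ALL=(ALL) ALL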

that means that to gain root I have to log in as a different user. This is standard protocol for me. To really secure your system you may want to consider disabling password logins (use ssh keys instead), and perhaps limit the originating IP addresses. But anyway, that is beyond this document, and the security really stands or falls with the security of the KVM host system (which is controlled by the CLOUD facility at SARA), which is beyond my control. Anyone can take down my VM and mount the image on the local disk system, leaving it open to compromise. For now this prevents immediate strangers from entering your VM.
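
For reference, the extra sshd_config lines for that would look something like the following - only apply them once your key login works, or you lock yourself out; the AllowUsers line is an assumption about how you want to restrict access:

 PasswordAuthentication no
 AllowUsers yourname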

One nice aspect is that we can clone this image later. So all the actions we have taken on this 'master' image won't have to be repeated for the others.

== Backup the image ==

At this point you may want to clone your updated disk image (just copy it to something else). This can be done at any point in time. Do stop the VM first - otherwise there is a possibility the image file system is not consistent. Cloning a live VM is possible, but that is something else.

Unfortunately in the current setup at SARA OpenNebula does not cleanly copy back the image. The workaround is to *stop* the running VM and copy the image located at /home/oneadmin/ID. So

 onevm stop 218
 (wait a bit until onevm list shows 'stop')
 cp -vau /home/one/218/images/disk.0 save.img
 onevm resume 218

I tested this saved image. It works.
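
A cheaper sanity check than a full test boot is to compare checksums right after the copy (before resuming), for example:

 md5sum /home/one/218/images/disk.0 save.img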

== Installing software ==

The next step is to install software. On Debian I have a list of packages I normally install/update, for example:

 apt-get update
 apt-get install less openssh-server openssh-client \
    vim bzip2 ruby perl \
    screen unzip \
    sudo gnupg locales lynx \
    mc rsync python make astyle gcc ncurses-bin \
    tree git-core tzdata psmisc

With Debian you may want to

 dpkg-reconfigure locales
 dpkg-reconfigure tzdata
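
If you prefer to set the timezone non-interactively (assuming Amsterdam here; adjust to taste), something like this works on Debian:

 echo "Europe/Amsterdam" > /etc/timezone
 dpkg-reconfigure -f noninteractive tzdata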

Once all this is done, make a backup of your image. You don't want to repeat this work.

== Adding a drive ==

I am going to build Nix packages (see nixos.org), for several reasons, but mostly because I like reproducible systems. One tool I'll need is subversion, so

 apt-get install subversion

Now visit my nix_on_debian writeup (in preparation).

== Using NFS ==

 apt-get install nfs-kernel-server nfs-common
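
As a sketch of where this could go, exporting a directory from this VM to the other VMs might look something like the following - the exported path and the network range are examples, not SARA's actual setup:

 # on the serving VM: export a directory (path and network range are examples)
 mkdir -p /export
 echo '/export 145.100.5.0/24(rw,sync,no_subtree_check)' >> /etc/exports
 exportfs -ra
 # on a client VM (needs nfs-common installed; the IP is the example VM from earlier)
 mount -t nfs 145.100.5.248:/export /mnt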

== Notes ==

== MORE HELP ==