
SQL Server failover cluster, VSphere, & SCSI-3 reservation nightmares


When I install a virtualized SQL Server FCI at a customer site as an SQL Server consultant, the virtualized environment is usually already in place. I guess this is the case for most database consultants. Since we therefore lack hands-on practice with that layer, I have to admit that we do not always know the right configuration settings to apply to the virtualization layer in order to run our SQL Server FCI architecture correctly.

A couple of days ago, I had an interesting case where I had to help a customer correctly configure the storage layer on vSphere 5.1. First, I would like to thank this customer, because I seldom have the opportunity to work with VMware (other than in my personal lab).

The story begins with failover tests that failed randomly on a SQL Server FCI after the SQL Server volumes were switched from VMFS to RDM in physical compatibility mode. We had to switch because the first configuration was installed as a CIB (Cluster-in-a-Box) configuration. As you certainly know, this does not provide a truly additional layer of high availability on top of VMware, because all the virtual machines run on the same host. So we decided to move to a CAB (Cluster-across-Boxes) configuration, which is more resilient than the first one.

In the new configuration, a failover randomly triggered Windows error 170 with this brief description: “The resource is busy”. At this point, I suspected that the SCSI-3 reservation was not performed correctly, but the cluster validation report didn’t show any errors concerning SCSI-3 reservations. I then decided to generate the cluster log to see if I could find more information about the problem – and here is what I found:

 

00000fb8.000006e4::2014/10/07-17:50:34.116 INFO [RES] Physical Disk : HardDiskpQueryDiskFromStm: ClusterStmFindDisk returned device=’?scsi#disk&ven_dgc&prod_vraid#5&effe51&0&000c00#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}’
00000fb8.000016a8::2014/10/07-17:50:34.116 INFO [RES] Physical Disk : ResHardDiskArbitrateInternal request Not a Space: Uses FastPath
00000fb8.000016a8::2014/10/07-17:50:34.116 INFO [RES] Physical Disk : ResHardDiskArbitrateInternal: Clusdisk driver handle or event handle is NULL.
00000fb8.000016a8::2014/10/07-17:50:34.116 INFO [RES] Physical Disk : HardDiskpQueryDiskFromStm: ClusterStmFindDisk returned device=’?scsi#disk&ven_dgc&prod_vraid#5&effe51&0&000d00#{53f56307-b6bf-11d0-94f2-00a0c91efb8b}’
00000fb8.000006e4::2014/10/07-17:50:34.117 INFO [RES] Physical Disk : Arbitrate – Node using PR key a6c936d60001734d
00000fb8.000016a8::2014/10/07-17:50:34.118 INFO [RES] Physical Disk : Arbitrate – Node using PR key a6c936d60001734d
00000fb8.000006e4::2014/10/07-17:50:34.120 INFO [RES] Physical Disk : HardDiskpPRArbitrate: Fast Path arbitration…
00000fb8.000016a8::2014/10/07-17:50:34.121 INFO [RES] Physical Disk : HardDiskpPRArbitrate: Fast Path arbitration…
00000fb8.000016a8::2014/10/07-17:50:34.122 WARN [RES] Physical Disk : PR reserve failed, status 170
00000fb8.000006e4::2014/10/07-17:50:34.122 INFO [RES] Physical Disk : Successful reserve, key a6c936d60001734d
00000fb8.000016a8::2014/10/07-17:50:34.123 ERR   [RES] Physical Disk : HardDiskpPRArbitrate: Error exit, unregistering key…
00000fb8.000016a8::2014/10/07-17:50:34.123 ERR   [RES] Physical Disk : ResHardDiskArbitrateInternal: PR Arbitration for disk Error: 170.
00000fb8.000016a8::2014/10/07-17:50:34.123 ERR   [RES] Physical Disk : OnlineThread: Unable to arbitrate for the disk. Error: 170.
00000fb8.000016a8::2014/10/07-17:50:34.124 ERR   [RES] Physical Disk : OnlineThread: Error 170 bringing resource online.
00000fb8.000016a8::2014/10/07-17:50:34.124 ERR   [RHS] Online for resource AS_LOG failed.

 

An arbitration problem! It seems related to my first guess, doesn’t it? Unfortunately, I didn’t have access to the vmkernel.log to check for potential reservation conflicts. After that (and this is certainly the funny part of this story, and probably the trigger for this article), I took a look at the multipathing configuration of each RDM disk. The reason is that I remembered some conversations with one of my friends (he will surely recognize himself :-)) in which we talked about SCSI-3 reservation issues with VMware.

As a matter of fact, the path selection policy was configured to Round Robin here. According to VMware KB 1010041, PSP_RR is not supported on vSphere 5.1 for Windows failover clusters with shared disks. It is, however, the default policy when creating RDM disks on the EMC VNX storage used by my customer. After changing this setting for each shared disk, the problem no longer occurred!
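If you need to check or change the policy yourself, it can be done with PowerCLI along these lines (a sketch only: the vCenter, host and canonical names are placeholders, and the target policy should be validated against the VMware KB for your storage array):

# Connect to vCenter (placeholder server name)
Connect-VIServer -Server "vcenter.lab.local"

# List the multipath policy of the disks attached to a given host
Get-VMHost -Name "esx01.lab.local" |
    Get-ScsiLun -LunType disk |
    Select-Object CanonicalName, MultipathPolicy

# Switch a shared RDM LUN from RoundRobin to Fixed (placeholder canonical name)
Get-VMHost -Name "esx01.lab.local" |
    Get-ScsiLun -CanonicalName "naa.60060160455025001234567890abcdef" |
    Set-ScsiLun -MultipathPolicy "Fixed"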

My customer asked about the difference between VMFS and RDM disks. I don’t claim to be a VMware expert, because I’m not, but I know that database administrators and consultants can no longer afford to ignore how VMware (or Hyper-V) works.

Fortunately, most of the time there will be virtualization administrators with strong skills, but sometimes not, and in that case you may feel alone facing such a problem. So the brief answer I gave the customer was the following: if we did not use physical-mode RDMs and used VMDKs or virtual-mode RDMs instead, the SCSI reservation would be translated into a file lock. In a CIB configuration this is not a problem, but it is for a CAB configuration, as you can imagine. Furthermore, using PSP_RR with versions older than vSphere 5.5 can release the reservation and cause issues like the one described in this article.

Wishing you a happy and problem-free virtualization!

 

This article SQL Server failover cluster, VSphere, & SCSI-3 reservation nightmares appeared first on the dbi services blog.


Windows Server 2012 R2: solving .NET Framework 3.5 installation problems


I faced a problem at a customer site last week when I tried to install the .NET Framework 3.5, a prerequisite for installing SQL Server 2012 on Windows Server 2012 R2. I opened Server Manager and then navigated to the Manage, Add Roles and Features section:

b2ap3_thumbnail_pic01.jpg

I selected the .NET Framework 3.5 Features option:

b2ap3_thumbnail_pic02.jpg

I specified an alternate source path:

b2ap3_thumbnail_pic03.jpg

… and surprise! Even though an ISO of Windows Server 2012 R2 was mapped to my D: drive, the installation failed with this strange error: “The source file could not be found…”

After some investigation, I found that this problem is quite common and that Microsoft has published a fix… which unfortunately did not work for me!
I tried the same installation in different ways: command prompt, PowerShell… but with absolutely no result.
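For reference, those attempts looked roughly like the following (a sketch; D: is assumed to be the drive where the Windows Server 2012 R2 ISO is mounted):

# Attempt 1: DISM, pointing to the side-by-side store on the mounted ISO
Dism /Online /Enable-Feature /FeatureName:NetFx3 /All /Source:D:\sources\sxs /LimitAccess

# Attempt 2: the equivalent PowerShell cmdlet
Install-WindowsFeature -Name NET-Framework-Core -Source "D:\sources\sxs"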
I finally decided to open a PowerShell console to check the Windows Features available on my server with the cmdlet Get-WindowsFeature:

b2ap3_thumbnail_Pic1.jpg

Strangely, the status of the .NET Framework 3.5 is not showing “Available”, but “Removed”!

b2ap3_thumbnail_Pic2.jpg

So, how do I change this state from removed to available?

After some investigation, and after trying several fixes provided by people who had faced the same problem, I finally found the Standalone Offline Installer tool, which solved my problem by enabling the .NET Framework 3.5 (many thanks to Abbodi1406).

I downloaded this exe file and executed it on my server.
An installer screen appeared:

b2ap3_thumbnail_pic4.jpg

After clicking on the Next button, a command prompt screen appeared which showed the completion state of the process.

pic6.jpg

As soon as the process was finished, I went back to my PowerShell console to check whether the .NET Framework 3.5 was now available, by running the Get-WindowsFeature cmdlet again:

b2ap3_thumbnail_pic10.jpg

The .NET Framework 3.5 was now available and I was able to restart the installation process from the beginning by navigating to Server Manager, selecting the feature and providing the alternate source path.

pic8.jpg

I finally succeeded in installing the .NET Framework 3.5!
I hope that my blog post will help some of you to resolve this installation problem ;-)

 

This article Windows Server 2012 R2: solving .NET Framework 3.5 installation problems appeared first on the dbi services blog.

FIO (Flexible I/O) – a benchmark tool for any operating system


I have just attended an interesting session held by Martin Nash (@mpnsh) at UKOUG 14 in Liverpool: “The least an Oracle DBA Should Know about Linux Administration”. During this session I had the opportunity to discover some interesting commands and tools such as FIO (Flexible I/O). FIO is a workload generator that can be used both for benchmarking and for stress/hardware verification.

FIO has support for 19 different types of I/O engines (sync, mmap, libaio, posixaio, SG v3, splice, null, network, syslet, guasi, solarisaio, and more), I/O priorities (for newer Linux kernels), rate I/O, forked or threaded jobs, and much more. It can work on block devices as well as files. fio accepts job descriptions in a simple-to-understand text format.

This tool has the huge advantage of being available for almost all kinds of operating systems (POSIX, Linux, BSD, Solaris, HP-UX, AIX, OS X, Android, Windows). If you want to use this tool in the context of an Oracle database, I invite you to have a look at the following blog post from Yann Neuhaus: Simulating database-like I/O activity with Flexible I/O

In order to install it on Ubuntu, simply use the following command:

steulet@ThinkPad-T540p:~$ sudo apt-get install fio

After installing fio, you can run your first test. This first test will run 2 gigabytes of I/O (read/write) in the directory /u01/fio.

steulet@ThinkPad-T540p:~$ mkdir /u01/fio

Once the directory has been created, we can set up the job file as described below. However, it is perfectly possible to run the same test directly from the command line without a job file (fio --name=global --ioengine=posixaio --rw=readwrite --size=2g --directory=/u01/fio --threads=1 --name=myReadWriteTest-Thread1):

 

[global]
ioengine=posixaio
rw=readwrite
size=2g
directory=/u01/fio
threads=1
[myReadWriteTest-Thread1]

Now you can simply run your test with the command below:

steulet@ThinkPad-T540p:~$ fio testfio.fio

The output will look like the following:

myReadWriteTest-Tread1: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=posixaio, iodepth=1
fio-2.1.3
Starting 1 thread
Jobs: 1 (f=1): [M] [100.0% done] [112.9MB/113.1MB/0KB /s] [28.9K/29.2K/0 iops] [eta 00m:00s]
myReadWriteTest-Tread1: (groupid=0, jobs=1): err= 0: pid=7823: Mon Dec  8 12:45:27 2014
  read : io=1024.7MB, bw=98326KB/s, iops=24581, runt= 10671msec
    slat (usec): min=0, max=72, avg= 1.90, stdev= 0.53
    clat (usec): min=0, max=2314, avg=20.25, stdev=107.40
     lat (usec): min=5, max=2316, avg=22.16, stdev=107.41
    clat percentiles (usec):
     |  1.00th=[    4],  5.00th=[    6], 10.00th=[    7], 20.00th=[    7],
     | 30.00th=[    7], 40.00th=[    7], 50.00th=[    7], 60.00th=[    7],
     | 70.00th=[    8], 80.00th=[    8], 90.00th=[    8], 95.00th=[   10],
     | 99.00th=[  668], 99.50th=[ 1096], 99.90th=[ 1208], 99.95th=[ 1208],
     | 99.99th=[ 1256]
    bw (KB  /s): min=    2, max=124056, per=100.00%, avg=108792.37, stdev=26496.59
  write: io=1023.4MB, bw=98202KB/s, iops=24550, runt= 10671msec
    slat (usec): min=1, max=24, avg= 2.08, stdev= 0.51
    clat (usec): min=0, max=945, avg= 9.71, stdev=24.52
     lat (usec): min=5, max=947, avg=11.79, stdev=24.54
    clat percentiles (usec):
     |  1.00th=[    5],  5.00th=[    8], 10.00th=[    8], 20.00th=[    8],
     | 30.00th=[    8], 40.00th=[    8], 50.00th=[    9], 60.00th=[    9],
     | 70.00th=[    9], 80.00th=[    9], 90.00th=[   10], 95.00th=[   11],
     | 99.00th=[   15], 99.50th=[   20], 99.90th=[  612], 99.95th=[  628],
     | 99.99th=[  652]
    bw (KB  /s): min=108392, max=123536, per=100.00%, avg=114596.33, stdev=3108.03
    lat (usec) : 2=0.01%, 4=0.01%, 10=91.43%, 20=6.93%, 50=0.71%
    lat (usec) : 100=0.13%, 250=0.01%, 500=0.01%, 750=0.47%, 1000=0.01%
    lat (msec) : 2=0.31%, 4=0.01%
  cpu          : usr=10.46%, sys=21.17%, ctx=527343, majf=0, minf=12
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=262309/w=261979/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
   READ: io=1024.7MB, aggrb=98325KB/s, minb=98325KB/s, maxb=98325KB/s, mint=10671msec, maxt=10671msec
  WRITE: io=1023.4MB, aggrb=98202KB/s, minb=98202KB/s, maxb=98202KB/s, mint=10671msec, maxt=10671msec
Disk stats (read/write):
  sda: ios=6581/67944, merge=0/67, ticks=4908/196508, in_queue=201408, util=56.49%

You will find some really good examples and a detailed list of parameters on the following website: http://www.bluestop.org/fio/HOWTO.txt
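As a quick illustration, a job file that is a bit closer to typical database I/O (small random reads and writes with several concurrent jobs) might look like the sketch below; the block size, file size, runtime and directory are assumptions to adapt to your own environment:

[global]
ioengine=posixaio
directory=/u01/fio
size=2g
bs=8k
rw=randrw
rwmixread=70
iodepth=8
runtime=60
time_based

[db-like-job]
numjobs=4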

This tool is really powerful and has the huge advantage of being available for more or less any operating system. This allows you to make consistent comparisons across different kinds of architectures.

 

This article FIO (Flexible I/O) – a benchmark tool for any operating system appeared first on the dbi services blog.

Don’t forget to configure Power Management settings on Hyper-V


Recently I had the opportunity to audit a SQL Server database hosted on a Hyper-V 2012 cluster. I noticed that the guest operating system had its power plan configured to High performance. This is a great thing, but when I talked to the system administrator to verify whether the same option was turned on on the Hyper-V host operating system, this was unfortunately not the case.

As a reminder, in virtualized environments the power policy setting has no effect when it is configured only inside the guest operating system, and we always have to verify that this option is configured correctly at the right level.

I performed a quick demonstration for my customer using the SuperPI benchmark tool, which is pretty simple: it calculates pi to a specific number of digits using one thread, and for my purpose that is sufficient.

Let’s start with the situation where Power Saver is enabled on the Hyper-V side and High performance is turned on on the guest side. Then let’s run the SuperPI tool with 512K digits to compute:

blog_25_-_superpi_calculation_-_power_saving

Here is the time taken by the guest to calculate pi:

blog_25_-_superpi_calculation_-_power_saving_-_result

Now let’s change the story by reversing the power settings value: High performance on the Hyper-V side and Power Saver on the guest side. Then we can do the same benchmark test:

blog_25_-_superpi_calculation_-_high_perf_-_result

5.688 seconds for this test against 13.375 seconds for the first one, a 57% improvement… not bad :-) But let’s look at a more relevant situation. Indeed, in most configurations the power management setting is configured to Balanced by default, and my customer asked me whether there is a noticeable difference if we leave the default configuration. In order to justify my recommendation, we performed the same test, but this time I changed the number of digits to compute in order to simulate a more realistic OLTP transaction (short, and requiring all CPU resources for a short time). The table below compares both results:

 

Settings                         Duration (s)
Hyper-V: Balanced (default)      0.219
Hyper-V: High performance        0.141

 

In my customer’s context, the duration dropped from 0.219 to 0.141 seconds, i.e. the run took only about 64% of the original time (a gain of roughly 36%)! So after that, my customer was convinced to change this setting, and I hope it is the same for you! Of course, with long-running queries that consume a lot of CPU resources over a long time, the difference may be less noticeable, because the processor wake-up time is very small compared to the total worker time they consume.

Keep in mind that changing the power management setting from inside the guest has no effect in a virtualized environment. You must take care of this setting directly on the hypervisor.
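To check and switch the active power plan directly on the Hyper-V host, something like the following can be used (a sketch; the GUID is the well-known identifier of the built-in High performance plan):

# List the available power plans and show which one is active
powercfg /list

# Activate the High performance plan on the host
powercfg /setactive 8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c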

Happy virtualization !!

 

This article Don’t forget to configure Power Management settings on Hyper-V appeared first on the dbi services blog.

SQL Server 2014: FCIs, availability groups, and TCP port conflict issues


After giving my session about SQL Server AlwaysOn and availability groups at the last French event “Les journées SQL Server 2014”, I had several questions concerning the port conflict issues, particularly the differences that exist between FCIs and availability groups (AAGs) on this subject.

In fact, in both cases we may face port conflicts depending on which components are installed on each cluster node. Fundamentally, FCIs and AAGs are both cluster-based features, but each of them uses the WSFC differently: SQL Server FCIs are “cluster-aware” services, while AAGs use standalone instances by default (using clustered instances with AAGs is possible, but this scenario is relatively uncommon and it doesn’t change the story in any way).

First of all, my thinking is based on the following question: why does having an availability group listener on the same TCP port as a SQL Server instance (but in a different process) cause a conflict, whereas two SQL Server FCIs listening on the same port work fine?

Let’s begin with SQL Server FCIs. When you install two SQL Server FCIs (on the same WSFC), you can configure the same listen port for both instances and it works perfectly, right? Why? The main reason is that each SQL Server FCI has its own dedicated virtual IP address and, as you know, a process can open a socket on a particular IP address and a specific port. However, two or more processes that attempt to open a socket on the same port and the same IP address will end up in a conflict. For instance, in my case, I have two SQL Server FCIs – SQLCLUST-01\SQL01 and SQLCLUST-02\SQL02 – that both listen on the same TCP port: 1490. Here is a picture of the netstat -ano command output:

blog_26_-_netstat_ano_-_1

Notice that each SQL Server process listens on its own IP address and only on that one. We can confirm this by taking a look at each SQL Server error log.

blog_26_-_sqlclust01_sql01_error_log_-_2

blog_26_-_sqlclust02_sql02_error_log_-_3

Now let’s continue with the availability groups. The story is not the same because in most scenarios we use standalone instances, and by default they listen on all available IP addresses. In my case, this time I have two standalone instances – MSSQLSERVER (default) and APP – that listen on TCP ports 1433 and 1438 respectively. Looking at the netstat -ano output, we can notice that each process listens on all available IP addresses (LocalAddress = 0.0.0.0):

blog_26_-_netstat_ano_-_4
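An equivalent check can be done with PowerShell on Windows Server 2012 and later (a sketch; the port numbers are the ones used in this example):

# Show which process listens on the SQL Server ports and on which local address
Get-NetTCPConnection -LocalPort 1433,1438 -State Listen |
    Select-Object LocalAddress, LocalPort, OwningProcess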

We can also verify the SQL Server error log of each standalone instance (default and APP)

blog_26_-_sql141_error_log_-_5

blog_26_-_sql141_app_error_log_-_6

At this point, I am sure you are beginning to understand the issue you may have with availability groups and listeners. Let’s try to create a listener for an availability group hosted by the default instances (MSSQLSERVER). My default instances on each cluster node listen on port 1433, whereas the APP instances listen on port 1438, as shown in the picture above. If I attempt to create my listener LST-DUMMY on port 1433, it will succeed because my availability group and my default instance run in the same process.

blog_26_-_netstat_ano_-_7

Notice that the listener LST-DUMMY listens on the same port as the default instance and both are in the same process (PID = 1416). Of course, if I try to change the TCP port of my listener to 1438, SQL Server raises the well-known error message with id 19486.

USE [master]
GO
ALTER AVAILABILITY GROUP [dummy]
MODIFY LISTENER N'LST-DUMMY' (PORT = 1438);
GO

 

Msg 19486, Level 16, State 1, Line 3
The configuration changes to the availability group listener were completed, but the TCP provider of the instance of
SQL Server failed to listen on the specified port [LST-DUMMY:1438]. This TCP port is already in use.
Reconfigure the availability group listener, specifying an available TCP port. For information about altering an availability group listener,
see the “ALTER AVAILABILITY GROUP (Transact-SQL)” topic in SQL Server Books Online.

The explanation becomes obvious now. Indeed, the SQL Server instance APP listens on TCP port 1438 on all available IP addresses (including the IP address of the listener LST-DUMMY).

blog_26_-_netstat_ano_-_8

You don’t trust me? Well, I can prove it by connecting directly to the SQL Server named instance APP using the IP address of the listener LST-DUMMY (192.168.0.35) and the TCP port of the named instance (1438):

blog_26_-_sqlcmd_-_9
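The connection attempt looked roughly like this (a sketch using the IP address and port from this example; -E requests Windows authentication):

sqlcmd -S 192.168.0.35,1438 -E -Q "SELECT @@SERVERNAME AS server_name, @@SERVICENAME AS instance_name"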

To summarize:

  • Having several SQL Server FCIs that listen on the same port is not a problem, because each of them can open a socket on its own distinct IP address. However, you can face port conflicts if a standalone instance is also installed on one of the cluster nodes.
  • Having an availability group with a listener that listens on the same TCP port as the standalone instance hosting it (the same process) will not result in a TCP port conflict.
  • Having an availability group with a listener that listens on the same TCP port as a standalone instance in a different process will result in a TCP port conflict. In this case, each SQL Server process attempts to open a socket on the same TCP port and the same IP address.

Hope it helps!

 

This article SQL Server 2014: FCIs, availability groups, and TCP port conflict issues appeared first on the dbi services blog.

IOUG Collaborate #C15LV


The IOUG (Independent Oracle Users Group) holds a great event each year: COLLABORATE. This year it takes place on April 12-16, 2015 at the Mandalay Bay Resort & Casino in Las Vegas.

I’ll be a speaker and a RAC Attack Ninja as well.

IOUG COLLABORATE provides all the real-world technical training you need – not sales pitches. The IOUG Forum presents hundreds of educational sessions on Oracle technology, led by the most informed and accomplished Oracle users and experts in the world, bringing more than 5,500 Oracle technology and applications professionals to one venue for Oracle education, customer exchange and networking.

Registration for the event:

http://collaborate.ioug.org/page/register


Speaker

I’ll present “Interpreting AWR Reports – Straight to the Goal”: how to read an AWR or Statspack report, get straight to the root cause, and be able to estimate the gains.
Here is the session schedule:

http://coll15.mapyourshow.com/6_0/sessions/session-details.cfm?ScheduleID=3535

RAC attack

You want to learn and practice RAC on your laptop? One of the best ways to learn a new technology is with hands-on experience. During this workshop you will have an opportunity to set up an Oracle 12c Real Application Clusters environment on your laptop, go through advanced RAC-related setup scenarios, or work together with other technical geeks on solving RAC-related challenges.
I’ll participate with my friends as a RAC Attack ninja, to help you address any issues and guide you through the setup process. Come with your laptop and download the Oracle software beforehand (http://tinyurl.com/rac12c-dl – 4 files for database and Grid Infrastructure).

It’s not only technology, but also Networking, Beer + Pizza, and new T-SHIRTs.

Follow #C15LV for info about COLLABORATE15 and #RACAttack:

 

 

 

This article IOUG Collaborate #C15LV appeared first on the dbi services blog.

Windows Cluster vNext and cloud witness


The next version of Windows Server will provide some interesting features for WSFC architectures. One of them is the new quorum type “Node majority and cloud witness”, which addresses the many cases where a third datacenter would be needed, but is missing, to achieve a truly resilient quorum.
Let’s imagine the following scenario, which may concern the implementation of either a SQL Server availability group or a SQL Server FCI. Let’s say you have to implement a geo-cluster that includes 4 nodes across two datacenters, with 2 nodes in each. To maintain quorum in case of a broken network link between the two datacenters, adding a witness is mandatory, even if you use the dynamic node weight feature, but where do you put it? Having a third datacenter to host this witness seems to be the best solution, but as you may imagine it is costly and not affordable for many customers.
Using a cloud witness in this case might be a very interesting workaround. Indeed, a cloud witness consists of a blob stored inside a container of an Azure storage account. From a cost perspective, it is a very cheap solution because you only pay for the storage space you actually use (first 1 TB/month – CHF 0.0217/GB). Let’s take a look at the storage space consumed by my cloud witness in my storage account:

blog_36_-_cloud_witness_storage_space_

Interesting, isn’t it? To implement a cloud witness, you have to meet the following requirements:

  • Your storage account must be configured as locally redundant storage (LRS), because the created blob file is used as the arbitration point, which requires some consistency guarantees when reading the data. In this case, all data in the storage account is made durable by replicating transactions synchronously. LRS doesn’t protect against a complete regional disaster, but that may be acceptable here because the cloud witness also participates in the dynamic weight mechanism.
  • A special container, called msft-cloud-witness, is created for this purpose and contains the blob file tied to the cloud witness.

blog_36_-_storage_account_replication_type_

How to configure my cloud witness?

In the same way as before. Using the GUI, you have to select the quorum type you want and then provide the storage account information (storage account name and access key). You may also prefer to configure your cloud witness by using the PowerShell cmdlet Set-ClusterQuorum as follows:

blog_36_-_cloud_witness_configuration_powershel
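For reference, the call looks roughly like this (a sketch; the storage account name and access key are placeholders):

# Configure node majority with a cloud witness backed by an Azure storage account
Set-ClusterQuorum -CloudWitness -AccountName "mystorageaccount" -AccessKey "<storage-account-access-key>"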

After configuring the cloud witness, a corresponding core resource is created in an online state, as follows:

blog_36_-_cloud_witness_view_from_GUI_

By using PowerShell:

blog_36_-_cloud_witness_view_from_powershell_

Let’s take a deeper look at this core resource, especially the advanced policy parameters isAlive() and looksAlive():

blog_36_-_cloud_witness_isalive_looksalive

We may notice that the basic resource health check interval defaults to 15 minutes. Hmm, I guess this value will probably be customized according to the customer’s architecture.
Let’s go ahead and perform some basic tests with my lab architecture. Basically, I have configured a multi-subnet failover cluster that includes four nodes across two (simulated) datacenters. Then, I implemented a cloud witness hosted in my storage account “mikedavem”. You may find a simplified picture of my environment below:

blog_36_-_WFSC_core_resources_overview

blog_36_-_WFSC_nodes_overview

You may notice that, because I implemented a cloud witness, the system changed the overall node weight configuration (4 nodes + 1 witness = 5 votes). In addition, in case of a network failure between my 2 datacenters, I want to prioritize the first datacenter in terms of availability. In order to meet this requirement, I used the new cluster property LowerQuorumPriorityNodeID to change the priority of the WIN104 cluster node.

blog_36_-_WFSC_change_node_priority
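A sketch of how this property can be set with PowerShell (the node id used for WIN104 is an assumption; Get-ClusterNode shows the actual ids):

# Find the id of the node that should get the lower quorum priority
Get-ClusterNode | Format-Table Name, Id, State

# Give the lowest quorum priority to the node with id 4 (assumed here to be WIN104)
(Get-Cluster).LowerQuorumPriorityNodeID = 4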

At this point, we are now ready to perform our first test, simulating a failure of the cloud witness:

blog_36_-_cloud_witness_failed_statejpg

Then the system recalculates the overall node weight configuration to achieve maximum quorum resiliency. As expected, the node weight of the WIN104 cluster node is changed from 1 to 0 because it has the lowest priority.
The second test consists in simulating a network failure between the two datacenters. Once again, as expected, the first partition of the WSFC in datacenter 1 stays online, whereas the second partition goes offline according to the node weight priority configuration.

blog_36_-_WFSC_failed_state_partition_2jpg

Is the dynamic cloud witness behavior suitable for minimal configurations?

I wrote a blog post here about issues that exist with the dynamic witness behavior in minimal configurations with only 2 cluster nodes. I hoped to see an improvement on that side, but unfortunately no. Perhaps with the RTM release… wait and see.

Happy clustering!

 

This article Windows Cluster vNext and cloud witness appeared first on the dbi services blog.

RAC Attack! was another great success at C15LV


The RAC Attack – install RAC on your own laptop – was a great success at Las Vegas.
The idea is to help people follow the RAC Attack cookbook, which is available at:

http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home/RAC_Attack_12c/Hardware_Requirements

It is a complex configuration and there are always problems to troubleshoot:

  • getting VirtualBox to run a 64-bit guest, which might involve some BIOS settings
  • being able to install VirtualBox; some people have company laptops where security policies make things difficult
  • network configuration is not simple, and any misconfiguration will make things more difficult later

So it is a very good troubleshooting exercise.
The organisation was excellent: organisation by Ludovico Caldara, infrastructure by Erik Benner, food sponsored by OTN, and Oracle software made available on USB sticks thanks to Markus Michalewicz. Yes, the RAC Product Manager did the RAC Attack installation himself.
It’s also a very good networking event where people meet people around the technology, thanks to IOUG Collaborate.

When people manage to get a VM with the OS installed, they get the red t-shirt. Look at the timelapse of the full day and you will see more and more red t-shirts: https://www.youtube.com/watch?v=mqlhbR7dYm0
Do you wonder why we are so happy to see people with only the OS installed? Because it’s the most difficult part. Creating a cluster on a laptop is not easy: you have to create the VM, set up networking, DNS, etc.
Once this setup is good, installing Grid Infrastructure and the database is straightforward with the graphical installer.

 

This article RAC Attack! was another great success at C15LV appeared first on the dbi services blog.


Using Windows 2012 R2 & dynamic witness feature with minimal configurations – part II



I wrote a blog post some time ago about using a file share witness with a minimal Windows failover cluster configuration consisting of two cluster nodes. In that blog post, I said I was reluctant to use a witness in this case because it introduces a weakness in the availability process. Indeed, the system is not able to adjust node weights in this configuration, but that does not mean we don’t need a witness in this case, and this is what I want to clarify here. I admit I was wrong on this subject for some time.

Let’s set the scene with a pretty simple Windows failover cluster architecture that includes two nodes and with dynamic quorum but without a configured witness. The node vote configuration is as follows:

blog_38_-_1_-_cluster_nodes_state
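The same information can be retrieved with PowerShell (a sketch; the DynamicWeight column exists on Windows Server 2012 R2 and later):

# Show the assigned and dynamic vote of each cluster node
Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight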

At this point, the system randomly assigns the single vote to one of the currently available nodes. For instance, in my context the vote is assigned to the SQL143 node, but there is a weakness in this configuration. Let’s first say the node SQL141 goes down unexpectedly. In this case the cluster keeps running because the node SQL143 holds the vote (last man standing). Now let’s say that, instead, the node SQL143 goes down unexpectedly. In this case the cluster loses quorum, because the node SQL141 doesn’t hold the vote needed to survive. You will find related entries in the cluster event log, as shown in the next picture, with two specific event ids (1135 and 1177).

blog_38_-_2_-_event_viewer

However, if the node SQL143 is gracefully shut down, the cluster is able to remove the vote from SQL143 and give it to SQL141. But you know, I’m a follower of Murphy’s law: anything that can go wrong will go wrong, and this is particularly true in the IT world.

So we don’t have a choice here. To protect against unplanned failures with two nodes, we should add a witness, and at this point you may use either a disk or a file share witness. My preference is to promote the disk witness first, but it is often not suitable for customers, especially in geo-cluster configurations. In this case using a file share witness is very useful, but it introduces some important considerations about quorum resiliency. First of all, I want to exclude scenarios where the cluster resides in a single datacenter. There is not much to consider there, because the loss of the datacenter implies the unavailability of the entire cluster (and surely other components as well).

Let’s talk about geo-clusters, often used with SQL Server availability groups, where important decisions must be made about the location of the file share witness. Indeed, most of my customers deal with only two datacenters, and in this case the $100 question is: where do we place it? Most of the time, we will place the witness in the location of what we can call the primary datacenter. If connectivity between the two datacenters is lost, the service keeps running in the primary datacenter. However, a manual activation will be required in the event of a complete primary datacenter failure.

 

blog_38_-_3_-_geo_clust_primary_without_change blog_38_-_3_-_geo_clust_primary_

Another scenario consists in placing the witness in the secondary datacenter. Unlike the first scenario, a network failure between the two datacenters will trigger an automatic failover of the resources to the secondary datacenter, but in the event of a complete failure of the secondary datacenter, the cluster will lose quorum (as a reminder, the remaining node is not able to survive on its own).

blog_38_-_4_-_geo_clust_secondary_         blog_38_-_4_-_geo_clust_secondary_failover

As you can see, each of the aforementioned scenarios has its advantages and drawbacks. A better situation would be to have a third datacenter to host the witness. Indeed, in the event of a network failure between the two datacenters that host the cluster nodes, the vote will this time be assigned to the node that first successfully locks the file share witness.

Keep in mind that even in this third case, losing the witness, either because of a network failure between the two main datacenters and the third datacenter or because the file share used by the witness is accidentally deleted by an administrator, can compromise the availability of the entire cluster if the node holding the vote then fails. So make sure to monitor this critical resource correctly.

I would like to finish with a personal thought. I always wondered why, in the case of a minimal configuration (only 2 cluster nodes and a FSW), the cluster is not able to perform weight adjustment. Until now, I haven’t gotten an answer from Microsoft, but after some time I think this weird behavior is quite normal. Let’s imagine the scenario where your file share witness resource is in a failed state and the cluster were able to perform weight adjustment. Which of the nodes should it choose? The primary or the secondary? In fact it doesn’t matter, because in both cases the next failure of the node holding the vote would also shut down the cluster. In the end, it would just be delaying an inevitable situation…

Happy clustering !

 

This article Using Windows 2012 R2 & dynamic witness feature with minimal configurations – part II appeared first on the dbi services blog.

Windows 10 – I tried and came back


2015-09-07 18.41.48
I tried Windows 10 a month ago and had to roll back because VirtualBox failed to create host-only interfaces. Today, it forced me to upgrade again, so I went a little further… and finally rolled back again. Here is the list of issues I saw in one hour of Windows 10.

At boot I had the following:
CaptureWX001
OK, this is a DB2 database I had installed some months ago on my laptop. A missing DLL message seems to come from my past memories, as does DB2. No problem, I don’t need it. An upgrade is also a good occasion to clean up.

Then I tried to run my demo environment (several Cygwin windows multiplexed by tmux and controlled by EventGhost, which I can control with my Pebble…).
The only issue is that mintty opens several windows.
CaptureWX003
This is not blocking and can probably be fixed.

CaptureWX002
Then there is VirtualBox. I stayed with 4.3 because in 5.0 my interfaces disappeared after each resume.

But in Windows 10 I can’t create any host-only interfaces. Any attempt from the GUI ends with:
CaptureWX004
and from the command line:
CaptureWX007
I tried with the latest build and it’s the same.

There are probably some workarounds (and it works for others: https://twitter.com/joerg_whtvr/status/640972224393052161), but looking at the tickets about the issue shows that it’s not stable at all yet.
I need VirtualBox (labs to test features, demos for presentations, workshop environments, Docker, …), so this is a no-go for me.
2015-09-07 21.08.18

 

This article Windows 10 – I tried and came back appeared first on the dbi services blog.

Linux server administration made easy with Puppet


Managing Linux servers with Puppet

 

As soon as several Linux servers are in place, administering the countless configurations (ntp, chrony, dns, users, groups, services, etc.) immediately becomes a significant effort.

Every administrator would like all servers to be configured in the same way, with the same configurations.

Puppet is a tool to automate exactly these tasks. This is done with manifests (definitions). The functionality is comparable to policies in the Windows world.

These manifests describe the desired state of the system, of files or of services. In other words, if a service is missing, it gets installed. If a configuration file is needed, it is created or replaced by a clean copy.

Puppet is able to determine on the target system which commands must be used, for example, to install software packages: on Red Hat servers it knows yum, and on Debian apt-get. In contrast to scripts, Puppet defines the desired state in a declarative way.

To get started with Puppet, you need a Puppet master and a client.

The installation of the Puppet master is described at the following link: https://docs.puppetlabs.com/guides/install_puppet/post_install.html#configure-a-puppet-master-server

On the client, the Puppet agent must be installed.

Installing the Puppet client from the Puppet master:

[root@fed22v1 ~]# curl -k https://puppetmaster:8140/packages/current/install.bash | bash

Installation directly from the distribution’s repository (e.g. Fedora):

[root@fed22v1 ~]# dnf info puppet
Last metadata expiration check performed 0:18:20 ago on Thu Oct 29 16:32:43 2015.
Installed Packages
Name        : puppet
Arch        : noarch
Epoch       : 0
Version     : 4.1.0
Release     : 5.fc22
Size        : 4.2 M
Repo        : @System
From repo   : updates
Summary     : A network tool for managing many disparate systems
URL         : http://puppetlabs.com
License     : ASL 2.0
Description : Puppet lets you centrally manage every important aspect of your system using a
            : cross-platform specification language that manages all the separate elements
            : normally aggregated in different files, like users, cron jobs, and hosts,
            : along with obviously discrete elements like packages, services, and files.

After installing the client, the following must be entered in /etc/puppet/puppet.conf (the name puppetmaster must be resolvable):

[main]
server = puppetmaster
[agent]
certname = fed22v1.localdomain

On the Puppet master, the client still has to be accepted:

[root@puppetmaster]# puppet cert sign fed22v1.localdomain

A first contact between the client and the Puppet master can be triggered immediately with the following command:

[root@fed22v1 ~]# puppet agent -tv

And now let’s configure the first service:

The Puppet community provides a whole list of predefined configurations (modules).

[root@puppetmaster /etc/puppetlabs/code/environments/production/modules]# puppet module search ntp
Notice: Searching https://forgeapi.puppetlabs.com ...
NAME                      DESCRIPTION                                                           AUTHOR             KEYWORDS
thias-ntp                 Network Time Protocol module                                          @thias             ntp ntpd
ghoneycutt-ntp            Manage NTP                                                            @ghoneycutt        ntp time services sync
puppetlabs-ntp            Installs, configures, and manages the NTP service.                    @puppetlabs        ntp time rhel ntpd gentoo aix
dhoppe-ntp                This module installs, configures and manages the NTP service.         @dhoppe            debian ubuntu ntp
diskstats-ntp             Lean RedHat NTP module, with the most common settings.                @diskstats         redhat ntp time rhel ntpd hiera
saz-ntp                   UNKNOWN                                                               @saz               ntp time ntpd gentoo oel suse
example42-ntp             Puppet module for ntp                                                 @example42         ntp example42
erwbgy-ntp                configure and manage ntpd                                             @erwbgy            ntp time services rhel centos
mthibaut-ntp              NTP Module                                                            @mthibaut          ntp hiera
kickstandproject-ntp      UNKNOWN                                                               @kickstandproject  ntp
aageyev-ntp               Install ntp on ubuntu                                                 @aageyev           ubuntu ntp
a2tar-ntp                 Install ntp on ubuntu                                                 @a2tar             ubuntu ntp
csail-ntp                 Configures NTP servers and clients                                    @csail             debian ubuntu ntp ntpd freebsd
warriornew-ntp            ntp setup                                                             @warriornew        ntp
a2labs-ntp                Install ntp on ubuntu                                                 @a2labs
mmitchell-puppetlabs_ntp  UNKNOWN                                                               @mmitchell
tohuwabohu-openntp        Puppet module for OpenNTPD                                            @tohuwabohu        ntp time openntp
hacking-ntpclient         A module to enable easy configuration of an NTP client                @hacking           ntp
ringingliberty-chrony     Manages the chrony network time daemon                                @ringingliberty    debian ubuntu redhat ntp fedora
example42-openntpd        Puppet module for openntpd                                            @example42         ntp example42 openntpd
evenup-time               Manages the timezone and ntp.                                         @evenup            ntp
oppegaard-ntpd            OpenNTP module for OpenBSD                                            @oppegaard         ntp ntpd openbsd openntpd
erwbgy-system             Manage Linux system resources and services from hiera configuration   @erwbgy            ntp rhel cron sshd user host fact
mikegleasonjr-server      The Server module serves as a base configuration for all your mana... @mikegleasonjr     ntp rsyslog firewall timezone swa

 

[root@puppetmaster /etc/puppetlabs/code/environments/production/modules]# puppet module search chrony
Notice: Searching https://forgeapi.puppetlabs.com ...
NAME                   DESCRIPTION                                               AUTHOR           KEYWORDS
ringingliberty-chrony  Manages the chrony network time daemon                    @ringingliberty redhat ntp fedora centos chrony
aboe-chrony            Module to install chrony time daemon on Archlinux         @aboe

 

As a first example, I chose ntp and chrony.

On the Puppet master, the corresponding module must be installed:

[root@puppetmaster]# puppet module install puppetlabs-ntp
[root@puppetmaster]# puppet module install ringingliberty-chrony

 

After the installation, the modules are located under:

[root@puppetmaster /etc/puppetlabs/code/environments/production/modules]# ls -als
4 drwxr-xr-x 6 root root 4096 Oct 29 12:12 chrony
4 drwxr-xr-x 7 root root 4096 Jul 22 00:44 ntp

 

The module still has to be assigned to a client (via the CLI):

This assignment is done in the site.pp file:

[root@puppetmaster /etc/puppetlabs/code/environments/production/manifests]# ls -als
4 -rw-r--r-- 1 root root 2079 Oct 29 12:38 site.pp
node 'fed22v1.localdomain' {
class { 'ntp':
servers => [
'1.ch.pool.ntp.org',
'2.ch.pool.ntp.org',
'3.ch.pool.ntp.org'
]}}
 
or
node 'fed22v1.localdomain' {
class { 'chrony':
servers => [
'1.ch.pool.ntp.org',
'2.ch.pool.ntp.org',
'3.ch.pool.ntp.org'
]}}

 

The module can also be assigned to a client via the web interface:

Screenshot Node Management

The difference between these two ways of configuring:

  • site.pp

This is processed first and provides a central place for configuration. Defaults for all clients can also be defined here.

  • Web GUI

Here, servers can be grouped together. The classes (e.g. chrony) are then assigned to these groups.

 

Conclusion:

Puppet offers the possibility to configure servers centrally. Simple things like time synchronization are quickly installed and configured this way. Next, I will tackle users, services and configuration files, and thus explore the further powerful capabilities of Puppet!
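As a small preview of that next step, a node definition managing a user and a service could look roughly like the sketch below (the names are assumptions; the user and service resource types are built into Puppet):

node 'fed22v1.localdomain' {
  # Ensure an OS user exists with a home directory
  user { 'dbadmin':
    ensure     => present,
    shell      => '/bin/bash',
    managehome => true,
  }
  # Ensure the chronyd service is running and enabled at boot
  service { 'chronyd':
    ensure => running,
    enable => true,
  }
}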

 

This article Linux server administration made easy with Puppet appeared first on the dbi services blog.

Control-M Application Integrator manages Microsoft PowerShell code & credentials


What are the advantages of “Control-M Application Integrator”?

At a customer running Control-M, I came across a large number of Windows and PowerShell scripts on a Windows server. Among these scripts were several that searched a log file for a specific keyword. All of these scripts had a CIFS path hard-coded in the script. For this to work, a share was mapped on the Windows server, the file was searched for the keyword, and the share was unmapped again. Two scripts were used for each of these tasks. In one of the scripts, the credentials were stored directly! The first script (a Windows script) was started regularly by a Control-M job, and the second script (PowerShell) was called directly from the first one.

So what are the advantages of the “Control-M Application Integrator” module?

  • No more local scripts needed on the servers
  • Security: no credentials in the scripts
  • Significant simplification
  • Better maintainability
  • Reusability of the new “Control-M Application Integrator” job type “String finder”

 

What does a “Control-M Application Integrator” implementation look like?

A new job type is created directly in the “Control-M Application Integrator” of Control-M.

The new job type is called “String finder”, and the first step is to map the share to drive X: using “net use”.

2015-11-18_09h51_25

if exist x:\ (
	net use x: /delete /yes
)
net use x: "{{Path}}" {{ShareUserPW}} /USER:{{ShareUserName}}

 

In the main part, “Execution #1 – #3”, we then have the PowerShell code:

2015-11-18_09h51_46

powershell.exe -nologo -ExecutionPolicy Bypass -NoProfile -Command \
"& {$COUNTES=@(Get-ChildItem -Path {{Path}} -Include {{Filename}} {{Recourse}} \
| Select-String '{{Pattern}}').Count; echo "Hits:$COUNTES"; exit $COUNTES}" < NUL

In the code above, the occurrences of the pattern are counted ($COUNTES) and then printed (Hits:$COUNTES). The variables used by “Control-M Application Integrator” ({{Path}}, {{Filename}}, {{Recourse}} and {{Pattern}}) are substituted at runtime. The output (Hits:$COUNTES) is later used to decide whether to send a mail or not. The exit code is also checked: if it is non-zero, “Execution #2” and “Execution #3” are executed.

 

In the next two steps, a global variable and the text for the mail notification are created.

ctmvar -action set -var "%%%%\Text" -varexpr \
"The following pattern [{{Pattern}}] was found [{{HITS}}] times on the [{{Path}}\{{Filename}}]."

Here, the global Control-M variable “Text” is used to define the mail text from within the code.

IF {{HITS}} GTR 0 (set MSG1=Hits& set MSG2=found!)
IF {{HITS}} GTR 0 (echo %MSG1% %MSG2% [{{HITS}}]) ELSE (echo Nothing to do.)

Note that the text that Control-M will later search for was deliberately split into two variables! Otherwise, the text filter would already find the string “Hits found!” at its definition in the output it scans.

 

In the post-execution step, the mapping is removed again.

if exist x:\ (
	net use x: /delete /yes
)

 

When we now want to create the job in Control-M, we have to use the new job type “String finder”.

2015-11-18_09h52_57

In the new job, we then specify the new attributes for this job:

  • Job Name: 1 -> The name of the job
  • Connection Profile: 2 -> Defined in the Connection Manager; it holds the credentials (username and password for the mapping)
  • Filename: 3 -> Filter “*.log” (only these files should be searched)
  • Path: 4 -> The CIFS (Common Internet File System) path to be used
  • Pattern: 5 -> What we are searching for in the files

2015-11-18_09h53_46

So that we also receive a mail notification when the pattern is found, we configure the following:

2015-11-18_09h54_13

2015-11-18_09h54_31

Creating the connection profile in the “Configuration Manager”

So that we do not have to define the credentials in the script or in the Control-M job, we use a dedicated “Connection Profile”.

2015-11-18_14h04_00

Now we just need the credentials:

2015-11-18_13h59_38

 

Conclusion

We now have the ability to search for various patterns in different files on different CIFS drives and to reuse the code and the methodology. Furthermore, we no longer have any Windows or PowerShell scripts stored locally on the servers, which means that the scheduler, the code and the credentials are all managed by Control-M, which also improves security.

Control-M Application Integrator is just one of many modules in Control-M.

I hope this post sheds some light on this module :-).

 

This article Control-M Application Integrator manages Microsoft PowerShell code & credentials appeared first on the dbi services blog.

Sending formatted mail with priority from Linux


Initial situation

A customer recently asked me how the output of a script on a Linux platform can be sent by mail in a cleanly formatted way. As an example, let’s take the following:

  ____________________________________________________________________________________________________________________
                                                                                                                     
                                  File System free space on fed22v1.localdomain 
  ____________________________________________________________________________________________________________________
 |                                                                                                                    |
 |                                       Mb-Total   Mb-Free 0%      20%       40%       60%       80%      100%       |
 | /                                   :    17918     16263 #####----+---------+---------+---------+---------+  10%   |
 | /boot                               :      476       326 #############------+---------+---------+---------+  27%   |
 | /dev                                :      479       479 +--------+---------+---------+---------+---------+   0%   |
 | /dev/shm                            :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /run                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 | /run/user/0                         :       97        97 +--------+---------+---------+---------+---------+   0%   |
 | /sys/fs/cgroup                      :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /tmp                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 |____________________________________________________________________________________________________________________|

If I send the output directly by mail, the mail looks like this:

/root/fsdisc.ksh -s1 | mailx -s "Testmail normale Priorität Text-Format 8bit Encoding" mailaddr@gmail.com

2015-12-10_15h18_27

The mail is barely readable this way; that is definitely not what we want!

There must be a better way, right?

Yes, there is:

From: root@localhost
To: mailaddr@gmail.com
Subject: Testmail normale Priorität HTML-Format 8bit Encoding
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

<html>
<body>
<pre style="font: monospace">

  ____________________________________________________________________________________________________________________
                                                                                                                     
                                  File System free space on fed22v1.localdomain 
  ____________________________________________________________________________________________________________________
 |                                                                                                                    |
 |                                       Mb-Total   Mb-Free 0%      20%       40%       60%       80%      100%       |
 | /dev                                :      479       479 +--------+---------+---------+---------+---------+   0%   |
 | /dev/shm                            :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /run/user/0                         :       97        97 +--------+---------+---------+---------+---------+   0%   |
 | /sys/fs/cgroup                      :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /run                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 | /tmp                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 | /                                   :    17918     16254 #####----+---------+---------+---------+---------+  10%   |
 | /boot                               :      476       326 #############------+---------+---------+---------+  27%   |
 |____________________________________________________________________________________________________________________|

</pre>
</body>
</html>

The mail must be generated as an HTML mail with a non-proportional font (monospace). If the mail header is set up with the right attributes, the mail looks exactly like it does in the terminal window.

The result speaks for itself:

2015-12-10_15h44_05

Now we only need a way to also pass a priority along with the mail, so that a mail client like Outlook sets the priority flag.

From: root@localhost
To: mailaddr@gmail.com
Subject: Testmail hohe Priorität HTML-Format 8bit Encoding
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Disposition: inline
X-Priority: 1 (Highest); X-MSMail-Priority: High
Content-Transfer-Encoding: 8bit

<html>
<body>
<pre style="font: monospace">

  ____________________________________________________________________________________________________________________
                                                                                                                     
                                  File System free space on fed22v1.localdomain 
  ____________________________________________________________________________________________________________________
 |                                                                                                                    |
 |                                       Mb-Total   Mb-Free 0%      20%       40%       60%       80%      100%       |
 | /dev                                :      479       479 +--------+---------+---------+---------+---------+   0%   |
 | /dev/shm                            :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /run/user/0                         :       97        97 +--------+---------+---------+---------+---------+   0%   |
 | /sys/fs/cgroup                      :      488       488 +--------+---------+---------+---------+---------+   0%   |
 | /run                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 | /tmp                                :      488       487 +--------+---------+---------+---------+---------+   1%   |
 | /                                   :    17918     16254 #####----+---------+---------+---------+---------+  10%   |
 | /boot                               :      476       326 #############------+---------+---------+---------+  27%   |
 |____________________________________________________________________________________________________________________|

</pre>
</body>
</html>

In Outlook, the result then looks like this:

2015-12-10_15h59_14

With the right entries in the mail header it is possible to produce well-formatted mails from Linux as well.

To create an HTML mail with a monospace font, we need the following attributes:

MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

<html>
<body>
<pre style="font: monospace">
-> The mail content goes here
</pre>
</body>
</html>

To set the priority to high, we need these attributes:

MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Disposition: inline
X-Priority: 1 (Highest); X-MSMail-Priority: High
Content-Transfer-Encoding: 8bit

<html>
<body>
<pre style="font: monospace">

</pre>
</body>
</html>

So a script can be written that does the wrapping around these mail header attributes; a minimal sketch follows.
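A minimal sketch of such a wrapper, assuming a local MTA provides /usr/sbin/sendmail and reusing the fsdisc.ksh script from above (names and addresses are examples only):

#!/bin/bash
# Wrap the output of a command into an HTML mail with monospace formatting
# and, optionally, the high-priority headers shown above.
RECIPIENT="mailaddr@gmail.com"
SUBJECT="File System free space"
PRIORITY="high"                 # anything else skips the priority headers
CMD="/root/fsdisc.ksh -s1"      # command whose output should be mailed

{
  echo "From: root@localhost"
  echo "To: ${RECIPIENT}"
  echo "Subject: ${SUBJECT}"
  echo "MIME-Version: 1.0"
  echo "Content-Type: text/html; charset=UTF-8"
  echo "Content-Disposition: inline"
  if [ "${PRIORITY}" = "high" ]; then
    echo "X-Priority: 1 (Highest)"
    echo "X-MSMail-Priority: High"
  fi
  echo "Content-Transfer-Encoding: 8bit"
  echo ""
  echo "<html><body><pre style=\"font: monospace\">"
  ${CMD}
  echo "</pre></body></html>"
} | /usr/sbin/sendmail -t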

I hope this post sheds some light on the world of mail and mail header attributes.

 

Cet article Mail mit Format und Priorität aus Linux est apparu en premier sur Blog dbi services.

Windows Server 2016: Containers


One of the new features coming with Windows Server 2016 is Containers. A few days ago, Microsoft made available the fourth Technical Preview of its Windows Server 2016 platform. This new Technical Preview brings, for the first time, Hyper-V Containers, whereas Technical Preview 3 came with the first release of Windows Server Containers.
I will discuss the difference between those two types of Containers later in this blog.
But first, let's have a brief introduction to Containers.

Principle

What is a Container?

It is a new virtualization technology: Operating System virtualization. It makes an application believe it is running in a dedicated environment with its own libraries and all the features of the host Operating System, even if that is not the case. It is an isolated, independent and portable operating environment.
Such a container can easily be moved from one machine to another, from one cloud to another, or from a test server to a pre-production server…
It is a feature that is really oriented towards development and testing.

Container history

Containers were originally created to facilitate international shipping all over the world: their standardized size allows them to travel on any boat, train or truck… This standardization brought better industrialization and therefore lower transport costs.
Containers in IT follow the same approach to virtualize processes, and products have existed for more than a decade:
Parallels Virtuozzo (2001)
Solaris Containers (2005)
Linux LXC (2008)
Docker (2013)

Docker uses a lot of well-known tools but makes their use simpler: better packaging and new functionalities make it more efficient and easier to use.
Docker virtualizes the filesystem, the namespaces, the cgroups (limitation and prioritization of resources such as CPU, RAM, network…)… all the different components of the Operating System. The container uses the host OS, so it is not possible to instantiate a Linux container on a Windows host running a Docker Engine, and vice versa.

Virtual Machines versus Containers

Virtual machines run on top of a physical machine which includes a hypervisor and a host Operating System. In turn, each virtual machine runs its own Operating System, binaries and set of libraries. Applications installed on it may consume all these resources if requested.

Container_1

Unlike traditional virtual machines, containers run on top of a physical machine that includes a host Operating System, a Docker Engine or Container Engine, and a set of libraries and binaries usable by applications.
You can run more Containers than Virtual Machines on the same host server, as Containers are more lightweight.
Containers don't need their own dedicated CPU, RAM, set of binaries and libraries or disk space. This is a big difference compared to Virtual Machines, which each need their own OS, disks, libraries, … even though they share the host's physical resources.

Container_2

Docker

Docker is an open-source solution to manipulate LXC containers. It follows a client-server model with a Docker client and a Docker Engine. Docker offers more or less forty commands used via the command line to list images, start a container, publish an image to a repository (private or public)…

Docker Client and Engine

The Docker Client (command line) is available for Mac, Linux and Windows.
Some commands are:
docker run: to instantiate a new container
docker pull: to retrieve an image
docker build: to build a new image

The Docker Engine (Docker host) is only available for Linux distributions (Ubuntu, RedHat…). But this Linux machine can be a virtual machine on Azure, Hyper-V, VMware…

Image Docker & Dockerfile

A Docker image represents more or less the filesystem of the Operating System, but in an inert state. Each image is made up of a certain number of layers.
Images are created from a configuration file named Dockerfile, which describes exactly what needs to be installed on the system.
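To make this more concrete, here is a small end-to-end sketch on a Linux Docker host, exercising the docker build, images and run commands listed above (the base image, directory and image names are only examples):

# Build an image from a two-line Dockerfile and run a container from it
mkdir -p /tmp/hello-docker && cd /tmp/hello-docker

cat > Dockerfile <<'EOF'
# Start from a public base image available on the Docker Hub
FROM ubuntu:14.04
# Default command executed when a container is started from this image
CMD ["/bin/echo", "Hello from my first container"]
EOF

docker build -t hello-docker .      # build a new image from the Dockerfile
docker images                       # list the local images
docker run --rm hello-docker        # instantiate and run a container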

Docker Hub

Docker hub is a public repository of container images.
It offers:

  • a Registry: Storage system for container images
  • a public index: list of public images with evaluation system and sorting feature
  • automated builds: to link a code repository (GitHub or Bitbucket) and automatically create Docker images after each commit of the source code

Docker & Microsoft

Partnership

About a year and a half ago, Microsoft announced its partnership with Docker. Now, Docker is in the Azure Marketplace:

Container_3

and application in Container also:

Container_4

The partnership has brought:

  • Extension of the Docker API and Docker tools to be able to support Windows Containers
  • Docker Client CLI (Command Line) on Windows
  • Docker extension for Linux VM on Azure
  • Docker Compose and Docker Swarm supported on Azure
  • Visual Studio tools for Docker
  • Docker Container in the Azure MarketPlace

 Container on Windows Server 2016

As Containers are a new feature of Windows Server 2016, you will have to install it like any other feature.
To do so, go to Server Manager, Manage, and click on "Add Roles and Features". In the wizard, on the Features screen, just select the Containers feature:

Container_8

Once it is installed, you can open a PowerShell window and have a look at the available PowerShell cmdlets for Containers:

Container_10

I will play with those new cmdlets in a future blog, but for now let's continue and explain how Microsoft has integrated containers into Windows Server 2016.

The Docker Client, docker.exe, is provided as a command line tool; the goal is to have a single client which is able to manage both Linux and Windows Containers.
The docker.exe command line will be able to instantiate images of Linux or Windows Containers, but a container can only run on a host machine whose Docker Engine runs on the same OS.
A Linux container will use the kernel of the host machine which runs the Docker host on Linux.
This is OS virtualization, not machine virtualization.

Container_5

Type of Container

Windows Server Container

  • runs on Windows Server 2016
  • uses libraries and functionalities of the Windows kernel
  • container is managed via a Container Management Stack interfaced with
    • Docker
    • PowerShell & WMI objects

Container_6

Hyper-V Container

  • adds an isolation level between each Container and the Management Stack based on Hyper-V partitions
  • uses libraries and functionalities of the Windows kernel
  • container is managed via a Container Management Stack interfaced with
    • Docker
    • PowerShell & WMI objects

Container_7

 Container environment

Container Run-Time

  • Windows Server 2016 if you want to run Windows Containers
  • Linux if you need to run Linux Containers

Image Repository

  • Docker Hub: the public registry
  • DTR: Docker Trusted Registry for enterprises which want their own Private Image Repository

Container images

  • Containers are instantiated from a stack of images

Managing Container

To manage Containers, PowerShell is used. Here are some PowerShell cmdlets (a combined sketch follows this list):

  • Get-ContainerImage: to list all container images in a Repository
  • New-Container -Name 'Test' -ContainerImageName 'Windows': create a new container named 'Test', based on the image 'Windows'
  • Start-Container 'Test': to start the container created previously
  • cmd /c node.msi: run inside the container to install, for example, an MSI package; our container is running and open for writing
  • Stop-Container 'Test': to stop the container named 'Test'
  • New-ContainerImage -ContainerName 'Test' -Name 'NewVersionTest': create a new container image in my Repository, based on the container I just modified and stopped
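Putting these together, a minimal sketch could look like this; it simply chains the TP4 cmdlets listed above, and the exact parameter names and image names may differ between Technical Previews:

Get-ContainerImage                          # list the available container images

New-Container -Name 'Test' -ContainerImageName 'Windows'
Start-Container 'Test'                      # start the freshly created container

# ... install or configure something inside the running container ...

Stop-Container 'Test'                       # stop it before capturing a new image
New-ContainerImage -ContainerName 'Test' -Name 'NewVersionTest'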

Development Process

The development process for a Container is as follow:

  • Each developer has his own local repository
  • A container image is imported from the Central Repository of the enterprise; all dependent images are also imported with the selected one
  • The developer develops his application, compiles it… and builds a new container image
  • The developer pushes this new image to the Central Repository
  • This new image is now available for everybody

 

Containers will help developers build and deploy high-quality applications much faster. They will also help administrators quickly and easily create new architectures for test, development or production environments, and will simplify maintenance and updates.
Let's see how containers will modify this ecosystem.

 

Cet article Windows Server 2016: Containers est apparu en premier sur Blog dbi services.

Linux – How to check the exit status of several piped commands


As piping bash commands is common and very useful, controlling the exit status of each piped command in bash scripting can be vital, especially for backups.

At a customer site, I was checking the backups of a critical MySQL instance and was surprised, even stunned, to find that the return status of all of them was always successful when tested, but physically on disk, the dumps were all empty.
No valid backups for a long time, meaning no possible recovery.
Oops! How can that be?
I immediately opened the backup script with "vi" and had a look at the statement used, which was the following:
mysqldump -u $USERNAME -p $PASS $DBNAME | bzip2 -9c -c > Dump.sql.gz
Now, what if the backup fails but the bzip2 command succeeds?
In fact, the exit status returned is the one of the last command in the pipe.
echo $? 
0

And this will always be successful.

So, the solution to check the exit status of a particular command within a pipe is to use a built-in bash array variable called PIPESTATUS.
PIPESTATUS is an array variable which contains the exit status of every command in the last executed pipeline.
In our case,
echo ${PIPESTATUS[0]} will refer to the backup and will be greater than 0 if it fails
echo ${PIPESTATUS[1]} will refer to the compression
echo ${PIPESTATUS[*]} or echo ${PIPESTATUS[@]} will give you the status of both.
So, one solution in our example could be:
mysqldump -u $USERNAME -p $PASS $DBNAME | bzip2 -9c -c > Dump.sql.gz
# capture the status right away: PIPESTATUS is overwritten by the next command
RC_DUMP=${PIPESTATUS[0]}

if [ $RC_DUMP -ne 0 ]
then
    echo "the MySQL Database backup failed with Error: $RC_DUMP";
else
    echo "The MySQL Database backup was successful!";
fi
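As a complementary option (a sketch, not part of the original script): bash's pipefail option makes the whole pipeline return a non-zero exit status as soon as any command in it fails, so a single $? check is enough.

#!/bin/bash
# Sketch assuming bash: with pipefail the pipeline's exit status is the
# status of the rightmost command that failed (0 only if all succeeded).
set -o pipefail

# note: no space after -p, so the password is really taken from $PASS
mysqldump -u "$USERNAME" -p"$PASS" "$DBNAME" | bzip2 -9c > Dump.sql.bz2
rc=$?

if [ "$rc" -ne 0 ]; then
    echo "The backup pipeline failed with exit status $rc"
fi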
I hope it will help...
 

Cet article Linux – How to check the exit status of several piped commands est apparu en premier sur Blog dbi services.


Windows failover cluster 2016 : new site awareness feature


After my first two blogs about the cloud witness and domain-less features shipped with Windows Server 2016, it's time to talk about another pretty cool WSFC feature: site awareness, which will also benefit our SQL Server FCIs and availability groups.

But before talking further about the potential benefits of this new feature, let's go back to the previous and current version of Windows Server, 2012 R2. Let's assume we have a Windows failover cluster environment that includes 4 nodes spread evenly across two datacenters and a file share witness (FSW) in a third datacenter.

blog 75- 0 - win2016 cluster site awareness - scenario

Assuming we use the dynamic quorum feature, all the member nodes as well as the FSW get a vote in this specific context. If we lose the FSW, the system dynamically reevaluates the situation and has to drop the weight of one cluster node. We can influence this behaviour by configuring the LowerQuorumPriorityNodeID property (please refer to the end of my blog here). For example, if we want to maximize the availability of the Windows failover cluster in the first datacenter (on the left), we have to change the quorum priority of one of the cluster nodes in the second datacenter (on the right), as shown below:

blog 75- 1 - win2016 cluster site awareness - scenario 2

Now, with Windows Server 2016, this property is deprecated and the site awareness feature replaces it. We will be able to group cluster member nodes into sites. So let's assume we want to achieve the same design as previously. To meet the same requirements, we have to define two sites and give the priority to the first site (on the left).

blog 75- 2 - win2016 cluster site awareness - scenario 3

Currently, sites and site preference can only be configured by using PowerShell and the new Site property, using the following syntax:

(Get-ClusterNode –Name <node>).Site=<site number>

For example, in my test lab, I have configured two sites at the cluster node level to reflect the situation shown above.

blog 75- 3- win2016 cluster site awareness - get_cluster_node

Then, I have configured my preferred site at the cluster level in order to ensure my first datacenter survives the 50/50 failure scenario (by changing the PreferredSite property):

(Get-Cluster).PreferredSite=<site number>

 

blog 75- 4- win2016 cluster site awareness - get_cluster

Now let’s simulate we face a failure of the FSW and let’s take a look at the new node weight configuration

  • Cluster nodes

blog 75- 5- win2016 cluster site awareness - get_cluster 2

  • FSW

blog 75- 6- win2016 cluster site awareness - get_cluster 3

Ok, my configuration seems to work correctly: after the loss of the FSW, the cluster dropped the weight of the WIN20164 node, so the tie break went in favour of my preferred site. Let's go ahead and simulate a failure of the second datacenter (WIN20163 and WIN20164 down) to check whether the first one survives…

blog 75- 7- win2016 cluster site awareness - get_cluster 4

Great job! The system has automatically reevaluated the overall node weight configuration (it has dropped the weight of the failed nodes). We can also see the action taken by the Windows Failover Cluster in the cluster log, as shown below:

blog 75- 8 - win2016 cluster site awareness - log cluster

At first glance, I guess you are saying "ok, but this is just a replacement of the old LowerQuorumPriorityNodeID… nothing more…", but fortunately there is more. Keep reading the next part of this blog post :-)

The second interesting topic concerns the ability to better predict a role's next location during failover events. Indeed, with Windows 2012 and previous versions, you had to play manually with the preferred owners list to control the failover order of the corresponding role. Fortunately, Windows 2016 introduces another parameter, the preferred site at the role level, which makes this task easier.

Let's say I added two SQL Server FCIs (INST1 and INST2) to my previous environment and now I want to prioritize the INST1 instance in datacenter 1 and the INST2 instance in datacenter 2 (aka multi-master datacenters). In other words, I want the INST1 and INST2 instances to fail over first to their respective cluster nodes in the same site before any other node.

We have to define the preferred site at the role level as follows:

(Get-ClusterGroup –Name  <group name>).PreferredSite =<site number>

 

blog 75- 8- win2016 cluster site awareness - get role

My initial scenario consists in starting each instance in its respective preferred site, with INST1 on node 1 and INST2 on node 3.

blog 75- 9- win2016 cluster site awareness - test multi master datacenters arch

 

blog 75- 9- win2016 cluster site awareness - test multi master datacenters

Then I simulated a failure of the WIN20163 node and expected the INST2 instance to fail over to the WIN20164 node, and this is exactly what happened, and vice versa (failure of the WIN20164 cluster node). Of course, if no cluster node is available in the preferred site, the concerned role will fail over to the next available node.

blog 75- 10- win2016 cluster site awareness -test failover log 1

blog 75- 10- win2016 cluster site awareness -test failover log 2

Finally, the last topic concerns the new heartbeat settings that come with the site awareness feature. When we define sites between cluster nodes, we also have the ability to define new heartbeat properties such as CrossSiteDelay and CrossSiteThreshold, as shown below:

blog 75- 10- win2016 cluster site awareness - hearbeat threshold settings

blog 75- 10- win2016 cluster site awareness - hearbeat threshold settings 2

By default, the cross-site threshold is set to 20s, whereas the intra-site threshold (which corresponds to the same-subnet threshold) is set to 10s. According to the Microsoft blog here, CrossSiteDelay and CrossSiteThreshold may supersede the other existing parameters depending on the scenario.
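A quick sketch of how these could be inspected and adjusted, assuming they are exposed as cluster common properties like the existing SameSubnet* and CrossSubnet* settings (property names as shown in the screenshots above):

# Display the existing heartbeat settings together with the new cross-site ones
Get-Cluster | Format-List *SameSubnet*, *CrossSubnet*, *CrossSite*

# Relax the cross-site monitoring: delay in milliseconds between heartbeats,
# threshold in number of missed heartbeats before a node is considered down
(Get-Cluster).CrossSiteDelay = 2000
(Get-Cluster).CrossSiteThreshold = 20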

Thus, in my case, I have to deal with two different thresholds (same subnet: 10s, and cross-site: 20s). I can confirm these two values by taking a look at the relevant cluster log records after the failover of both the WIN20163 and the WIN20164 nodes.

  • WIN20161 (192.168.10.22) – WIN20164 (192.168.10.24) – same site

blog 75- 12- win2016 cluster site awareness - heartbeat threshod cross site

  • WIN20161 (192.168.10.20) – WIN20163 (192.168.10.24) – cross-site

blog 75- 11- win2016 cluster site awareness - heartbeat threshod same site

At this point, you may also wonder why these two new site heartbeat parameters are introduced when you can already deal with the existing SameSubnet* and CrossSubnet* parameters. Well, the new site settings give us more flexibility to define heartbeats by using the site concept. Let's demonstrate with the following (real) customer scenario that includes AlwaysOn availability groups running on top of a WSFC with 3 nodes spread across 3 different datacenters and a stretched VLAN.

blog 75- 13- win2016 cluster site awareness - heartbeat threshod old config

 

There is a high-bandwidth, low-latency network link between the first two datacenters and a less efficient network link between the third datacenter and the others (heterogeneous configuration). In fact, the third site is dedicated to DR in this context.

With Windows 2012 R2, I am limited to the only parameter available for my scenario, SameSubnetThreshold (all of the cluster nodes are on the same subnet here). This parameter influences the heartbeat settings for all the nodes, whereas I only want more relaxed monitoring for the third node, WIN20163.

With Windows 2016, the game will probably change: I would group WIN20161 and WIN20162 into the same "logical" site and leave WIN20163 alone in its own site. Thus, I could benefit from the cross-site parameters, which will supersede the same-subnet parameters between [WIN20161-WIN20162] and [WIN20163] without influencing the monitoring between WIN20161 and WIN20162 (which are in the same site). The new picture would be as follows:

blog 75- 13- win2016 cluster site awareness - heartbeat threshod new config

This new configuration will give me more flexibility to configure aggressive or relaxed monitoring between nodes on different sites but on the same subnet.

Happy clustering!

Cet article Windows failover cluster 2016 : new site awareness feature est apparu en premier sur Blog dbi services.

Getting started with Ansible – Preparations


When your infrastructure landscape becomes larger and larger, you definitely need a tool which helps you manage your servers, no matter if physical or virtual, running in the cloud or not. One tool (many others are around) which assists you in provisioning, configuration management and mass deployment is Ansible. If you look at the man page you can find this sentence: "Ansible is an extra-simple tool/framework/API for doing 'remote things' over SSH.". What is better than learning by doing? Let's start:

I have three CentOS 7.2 core installations:

hostname          description                                                                   ip
ansible-control   The Ansible control host. All Ansible commands will be executed from here     192.168.22.170
ansible-pg1       The first host that will be managed by Ansible                                192.168.22.171
ansible-pg2       The second host that will be managed by Ansible                               192.168.22.172

All of these hosts are on the same CentOS release (although that does not really matter):

[root@ansible-control~]$ cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)

Obviously the first step is to get Ansible installed on the control host. There is nothing we need to install on the hosts we want to manage: Ansible does not require any agents to be set up, which is one of the main advantages over other tools. The only requirement is ssh. There are various ways to get Ansible installed, but for the scope of this post we'll go with yum, as Ansible is available in the EPEL repository. Let's add the repository:

[root@ansible-control~]$ yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
[root@ansible-control~]$ rpm -qa | grep epel
epel-release-7-5.noarch

From there on installing Ansible is just a matter of using yum:

[root@ansible-control~]$ yum install -y ansible

At the time of writing this will install version 1.9.4 of Ansible:

[ansible@ansible-control ~]$ ansible --version
ansible 1.9.4
  configured module search path = None

Yum will take care of all the dependencies (e.g. python) and set up everything for us. Before starting to work with Ansible (because we do not want to work as root), a group and a user get created on all three hosts. So, for all hosts:

[root@ansible-control~]$ groupadd ansible
[root@ansible-control~]$ useradd -g ansible ansible
[root@ansible-control~]$ passwd ansible 
[root@ansible-control~]$ su - ansible
[ansible@ansible-control~]$ ssh-keygen -t rsa

The generation of the ssh keys is important as we’ll use password-less ssh authentication for talking from the control host to the managed hosts. To succeed with this we’ll need to copy the ssh ids from the control host to the managed hosts:

[ansible@ansible-control ~]$ ssh-copy-id -i .ssh/id_rsa.pub ansible@192.168.22.171
[ansible@ansible-control ~]$ ssh-copy-id -i .ssh/id_rsa.pub ansible@192.168.22.172

How does the control host know which hosts it shall manage if nothing gets installed on the clients? Quite easy: there is a configuration file in /etc/ansible. As we want to work with our dedicated user, we'll change the permissions:

[root@ansible-control~]$ chown -R ansible:ansible /etc/ansible/*

Using the Ansible user we can now create our first hosts file:

[ansible@ansible-control ~]$ cat /etc/ansible/hosts
[postgres-servers]
192.168.22.171
192.168.22.172

The name in brackets is a so-called group name. This means that by referencing "postgres-servers" in the Ansible commands, Ansible resolves the group name to the server names listed for that group. Let's do a basic test to check if we can talk to our clients:

[ansible@ansible-control ~]$ ansible postgres-servers -a "/bin/echo 11"
192.168.22.171 | success | rc=0 >>
11

192.168.22.172 | success | rc=0 >>
11

Cool, it works. We can run the same command on both clients. If there are more clients, you can tell Ansible to use N parallel forks to execute the commands by using the "-f" switch (5 is the default):

[ansible@ansible-control ~]$ ansible postgres-servers -a "/bin/echo 11" -f 5
192.168.22.171 | success | rc=0 >>
11

192.168.22.172 | success | rc=0 >>
11
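Ad-hoc commands are not limited to raw shell calls: Ansible ships with modules. A quick sketch using the ping and setup modules against the same group (host names as defined in the hosts file above):

# Verify connectivity and that Python is usable on the managed hosts
ansible postgres-servers -m ping

# Gather facts (OS release, memory, network interfaces, ...) from one host
ansible 192.168.22.171 -m setup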

You might have already guessed from the naming of the clients that there will be something about PostgreSQL in this post. And you're right :) To outline one use case for Ansible, we'll set up PostgreSQL on two hosts in exactly the same way using Ansible in the next post.

 

Cet article Getting started with Ansible – Preparations est apparu en premier sur Blog dbi services.

Windows Server 2016: Introducing Storage Spaces Direct


Have you heard about Storage Spaces, introduced by Windows Server 2012? Well, this is a very interesting storage feature that allows us to deal with functionalities like storage virtualization, RAID capabilities, thin or thick provisioning, Cluster Shared Volumes (CSV), efficient file repair and so on. In addition, Windows 2012 R2 also introduced some enhancements like storage tiering, write-back caching, data deduplication, parity space support for failover clusters, JBOD enclosure awareness… A bunch of enterprise storage features; we are far from the old basic Windows RAID capabilities for sure.

To be honest, I have never implemented Storage Spaces myself or seen it in action at any customer site so far, because in my area most of my customers already own a SAN. On the other side, we have to deal with some scaling and hardware limitations that have probably delayed the adoption of this feature. Indeed, Scale-Out File Servers (SOFS) are recommended as a replacement for a SAN in this context and all SOFS nodes must be physically connected to every JBOD chassis. Therefore, if one of the SOFS nodes fails for some reason, the IO traffic will be redirected so fast that the server won't notice, but at the price of being limited either by the number of SAS cables that can be wired to each SOFS node or by the maximum number of disks per enclosure. Even storage pool expansion comes with some limitations. Finally, its usage is limited to SAS disks; no other technologies work, even if that does not seem to be a big problem for my customers, to be honest. In short, suitable scenarios seem to be very limited for customers (at least in my context).

Picture from https://blogs.technet.microsoft.com/clausjor/2015/05/14/storage-spaces-direct/

But fortunately, Windows Server 2016 seems to change the game by introducing the new Storage Spaces Direct feature (aka S2D). In fact, this is simply an enhancement of the old Storage Spaces feature, and I have to say that some key points have caught my attention. Indeed, we may pool together the local storage of each node and use Storage Spaces for data protection. A big improvement that will simplify some scenarios for sure! Moreover, there is no longer a disk type limitation and we are now able to include SAS, SATA and NVMe technologies as well. The latter is probably the most interesting item because it is based on flash memory on a PCI Express card, and I can imagine scenarios such as SQL Server FCIs that may benefit from high-performance, low-latency storage in conjunction with RDMA (with Mellanox ConnectX cards, for example), either in a physical or a virtualized context. Another important point is that S2D runs on top of a newly designed storage bus from Microsoft that transfers data between nodes over the SMB or SMB Direct protocol.

In this blog post, let's just introduce S2D with a first simple implementation. Unfortunately, my lab environment is very limited because it includes only one computer (a Lenovo T530) with one SSD and one HDD (SATA/600, 7200 rpm) that are used as the storage layout of my Hyper-V server. I have installed 4 virtual machines running Windows 2016 TP4, each with 2 virtual disks (SSD + HDD) in addition to the operating system disk. Each set of virtual disks (SSD + HDD) will be enrolled in an S2D pool later in this blog post. So it is obvious that running performance tests doesn't make sense here, but I hope to get a properly sized environment for that later. For the moment, let's focus on how to implement S2D. Here is a picture of my lab architecture:

blog 86- 0 - lab architecture S2D

First, let's install a simple Windows Failover Cluster that will include my four nodes (WIN20161, WIN20162, WIN20163 and WIN20164). Before installing the cluster itself, let's validate my current environment. You may notice that I have included the Storage Spaces Direct capabilities in my test by using the well-known Test-Cluster cmdlet:

Test-Cluster –Node WIN20161, WIN20162, WIN20163, WIN20164 –Include “Storage Spaces Direct”,Inventory,Network,”System Configuration”

Perfect! My configuration seems to be suitable to implement S2D as stated below:

blog 86- 1 - storage space validation report
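The cluster creation itself is not detailed here; a minimal sketch could look like the following (the cluster name and static IP address are examples only):

# Create the cluster with no clustered storage, so that the local disks
# stay available to be pooled afterwards
New-Cluster -Name WIN2016-CLUST -Node WIN20161,WIN20162,WIN20163,WIN20164 `
    -NoStorage -StaticAddress 192.168.5.50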

The next step consists in finding which disks can be included in my storage pool. In my case, disks with a size greater than or equal to 10GB are the HDD disks (rotational disks), whereas those with a size less than or equal to 5GB are the SSD disks.

# HDD
Get-PhysicalDisk -CanPool $true | ? Size -ge 10GB | Sort Size | FT -AutoSize

 

blog 86- 2 - storage subsystem cluster info

# SSD
Get-PhysicalDisk -CanPool $true | ? Size -le 5GB | Sort Size | FT -AutoSize

 

blog 86- 4 - SSD eligible for storage space direct

Let’s create a storage pool by using the New-StoragePool cmdlet.

# Create storage pool
$s = Get-StorageSubSystem
New-StoragePool -StorageSubSystemId $s.UniqueId -FriendlyName Pool_SQL -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

Another capability of S2D is the use of multi-resilient virtual disks, as described in Claus Joergensen's blog post, by combining storage tiers and ReFS real-time tiering. How important is this feature? Well, let's say you want to dedicate a virtual disk to SQL Server data files with a 70/30 read/write IO pattern. In this case, it may be interesting to use multi-resilient virtual disks with both Mirror and Parity tiers. Indeed, on one side, ReFS will always perform IO writes on the Mirror tier before the Parity tier, improving write performance in this case. On the other side, using the Parity tier may improve the random read IO pattern. This is my understanding of the potential benefits of the multi-resiliency feature; please feel free to reach out to me if I'm wrong.

Well, let's implement storage tiers to achieve this goal. SSD disks will be used by the Mirror tier and HDD disks by the Parity tier. However, before implementing storage tiers, one important task remains: defining the disk media types (HDD or SSD). In my context, the virtualization layer prevents the operating system from detecting the media type of my physical disks. Fortunately, we can use the Set-PhysicalDisk cmdlet to achieve this, as shown below:

# Configure media type for virtual SAS disks 
Get-StoragePool Pool_SQL | Get-PhysicalDisk | ? Size -lt 8GB | Set-PhysicalDisk –MediaType SSD
Get-StoragePool Pool_SQL | Get-PhysicalDisk | ? Size -gt 8GB | Set-PhysicalDisk –MediaType HDD

After setting the media types, let's verify the total size of the Pool_SQL storage pool:

# Get storage pool info
Get-StoragePool -FriendlyName Pool_SQL | select FriendlyName, Size, AllocatedSize

 

blog 86- 6 - storage pool config

So we get a total capacity of 54GB, but notice that 2GB is already in use. In fact, 256MB is reserved on each disk for internal metadata management. Let's do the simple math: 8 disks x 256MB = 2GB, which corresponds to the AllocatedSize column value.

Let’s then have a look at the physical disks in the Pool_SQL storage pool:

# Get storage pool physical disks
Get-StoragePool Pool_SQL | Get-PhysicalDisk | Group MediaType, Size, BusType | Sort Name | select Count, Name

 

blog 86- 5 - storage pool disks config

As expected, the storage pool includes 4 SSD disks with a size of 4GB and 4 HDD disks with a size of 9GB, according to the above picture. Don't forget to take the reserved space into account, as mentioned earlier.

Now it's time to create our two storage tiers. As a reminder, my lab environment is very limited, with only 4 virtual disks per tier that can be pooled. This means I can only define a mirror-resilient configuration with a two-way mirror layout, which requires a minimum of 2 disks (one disk for the source data and one disk dedicated to the data copy, corresponding respectively to the NumberOfColumns = 1 and PhysicalDiskRedundancy = 1 parameters). A three-way mirror layout would require a minimum of 5 disks, so it is not affordable in my case. I have also defined a parity-resilient configuration (with LRC erasure coding) that requires a minimum of 3 disks and tolerates one drive failure (in my case, I enrolled 4 disks).

$s = New-StorageTier –StoragePoolFriendlyName Pool_SQL -FriendlyName SSDTier -MediaType SSD -ResiliencySettingName Mirror -NumberOfColumns 1 -PhysicalDiskRedundancy 1
$h = New-StorageTier –StoragePoolFriendlyName Pool_SQL -FriendlyName HDDTier -MediaType HDD -ResiliencySettingName Parity -NumberOfColumns 4 -PhysicalDiskRedundancy 1

After defining each storage tier, let’s have a look at the supported size in each case (mirror or parity):

  • SSD tier
Get-StorageTierSupportedSize SSDTier -ResiliencySettingName Mirror | FT -AutoSize
Get-StorageTierSupportedSize SSDTier -ResiliencySettingName Parity | FT -AutoSize

 

blog 86- 7 - tier ssd config

For either the mirror- or the parity-resilient configuration, the minimum tier size will be 2GB. For the mirror-resilient configuration the maximum size will be 6GB (3GB * 4 disks / 2), whereas for the parity-resilient configuration the maximum size will be 9GB (3GB * 3 disks); the capacity of the remaining disk is used to store the parity data.

  • HDD tier
Get-StorageTierSupportedSize HDDTier -ResiliencySettingName Mirror | FT -AutoSize
Get-StorageTierSupportedSize HDDTier -ResiliencySettingName Parity | FT -AutoSize

 

blog 86- 8 - tier hdd config

I'll let you do the math yourself :-)

So we have defined two tiers and we are now able to create a new virtual disk that uses the multi-resiliency capability. The new virtual disk will be created as a Cluster Shared Volume (CSV) and formatted with the ReFS filesystem (CSVFS_ReFS).

# Create new volume with multi-resiliency tiers
New-Volume -StoragePoolFriendlyName Pool_SQL -FriendlyName Volume_SQL -FileSystem CSVFS_ReFS -StorageTiers $s,$h -StorageTierSizes 4GB,6GB -ProvisioningType Fixed

 

blog 86- 9 - volume creation

Let's take a look at the newly created volume:

Get-Volume | ? FileSystemLabel -eq "Volume_SQL" | ft FileSystemLabel, FileSystemType, AllocationUnitSize, @{Name="Size_GB";Expression={$_.Size / 1GB}}

 

blog 86- 10 - get volume info

… and at the new virtual disk

Get-VirtualDisk Volume_SQL | ft FriendlyName, NumberOf*, ResiliencySettingName -AutoSize

 

blog 86- 11 - get virtual disk info

At this point, we could be surprised to see only the Mirror resiliency setting name (instead of both the Mirror and Parity resiliency setting names). My assumption is that ReFS always writes to the Mirror tier first, as stated by Claus in his blog post, which would explain why only Mirror is displayed here.

Ok, our storage and our virtual disks are ready for use. That’s all for this blog post. Of course, there are many other interesting items to study about this feature. I hope to have the opportunity to take a closer look at what we can really do with this storage feature in the future.  Stay tuned!

 

 

Cet article Windows Server 2016: Introducing Storage Spaces Direct est apparu en premier sur Blog dbi services.

Stay tuned with kernel parameters


Why does the sysctl.conf value for swappiness on Oracle Linux 7.x not survive a reboot

For a project, I applied the dbi services best practices for Oracle databases. One of these is to adjust the swappiness parameter: we recommend a very low swappiness value, like 10 or even lower, to reduce the risk of Linux starting to swap. Swapping on a database server is not a problem per se, but it generates disk activity, which can negatively impact the performance of a database.

At the end of this project, I did the handover to our service desk. The service desk has a lot of things to validate for every installation and has developed scripts to check the dbi services best practices against a new system before it is put under contract and/or monitoring. One of these scripts detected that the swappiness value on the system was set to 30. After a few hours of investigation, we identified the issue. More about this, with the focus on swappiness, later in this blog.

In previous versions of Linux, we apply or modify parameters in the /etc/sysctl.conf file to tune the Linux kernel, network, disks etc. One of these parameters is vm.swappiness.

[root]# cat /etc/sysctl.conf | grep -A1 "^# dbi"
# dbi services reduces the possibility for swapping, 0 = disable, 10 = reduce the paging possibility to 10%
vm.swappiness = 0

To activate this setting we use then:

[root]# sysctl -p | grep "^vm"
vm.swappiness = 0

To control this setting we can request the current value:

[root]# cat /proc/sys/vm/swappiness 
0

After a reboot of the system when we check the value again:

[root]# cat /proc/sys/vm/swappiness 
30

What a surprise! My value did not survive a reboot.

Why is the default value of 30 applied?

 

Explanation

There are some important changes when it comes to setting the system values for kernel, network, disks etc. in the recent versions of Red Hat Linux 7.

  • In a minimal installation, since version 7, the tuned.service is enabled by default (see the quick check after this list).
  • The tuned.service applies its values after the sysctl.conf values have been loaded.
  • The default tuned profile that gets applied is network-throughput on a physical machine and virtual-guest on a virtual machine.
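A quick way to verify these points on a freshly installed system (a small sketch; the output will of course depend on the machine):

systemctl is-enabled tuned.service   # "enabled" on a default minimal install
systemctl status tuned.service       # confirms the daemon is running
tuned-adm active                     # shows which profile is currently applied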

Once we got to know these facts, we looked at the values that are applied by default.

The tuned.service profiles are located under /usr/lib/tuned/

[root]# ls -als /usr/lib/tuned/
total 36
 4 drwxr-xr-x. 13 root root  4096 Apr  8 14:13 .
12 dr-xr-xr-x. 41 root root  8192 Mar 11 09:39 ..
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 balanced
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 desktop
16 -rw-r--r--.  1 root root 12294 Mar 31 18:46 functions
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 latency-performance
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 network-latency
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 network-throughput
 0 drwxr-xr-x.  2 root root    39 Apr  6 14:56 powersave
 4 -rw-r--r--.  1 root root  1288 Jul 31  2015 recommend.conf
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 throughput-performance
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 virtual-guest           <-default
 0 drwxr-xr-x.  2 root root    23 Apr  6 14:56 virtual-host

There is a list of predefined profiles.

[root]# tuned-adm list
Available profiles:
- balanced
- desktop
- latency-performance
- network-latency
- network-throughput
- powersave
- throughput-performance
- virtual-guest
- virtual-host
The currently active profile is: virtual-guest

To find out more about the tuned profiles:

[root]# man tuned-profiles
TUNED_PROFILES(7)                                                        tuned                                                       TUNED_PROFILES(7)

NAME
       tuned-profiles - description of basic tuned profiles

DESCRIPTION
       These are the base profiles which are mostly shipped in the base tuned package. They are targeted to various goals. Mostly they provide perfor‐
       mance optimizations but there are also profiles targeted to low power consumption, low latency and others. You can mostly deduce the purpose of
       the profile by its name or you can see full description bellow.

       The profiles are stored in subdirectories below /usr/lib/tuned. If you need to customize the profiles, you can copy them to /etc/tuned and mod‐
       ify them as you need. When loading profiles with the same name, the /etc/tuned takes precedence. In such case you will not lose your customized
       profiles  between tuned updates.

       The  power saving profiles contain settings that are typically not enabled by default as they will noticeably impact the latency/performance of
       your system as opposed to the power saving mechanisms that are enabled by default. On the other hand the performance profiles disable the addi‐
       tional power saving mechanisms of tuned as they would negatively impact throughput or latency.

PROFILES
       At the moment we're providing the following pre-defined profiles:

       balanced
              It  is  the  default profile. It provides balanced power saving and performance.  At the moment it enables CPU and disk plugins of tuned
              and it makes sure the ondemand governor is active (if supported by the current cpufreq driver).  It enables ALPM power saving  for  SATA
              host adapters and sets the link power management policy to medium_power. It also sets the CPU energy performance bias to normal. It also
              enables AC97 audio power saving or (it depends on your system) HDA-Intel power savings with 10 seconds timeout. In case your system con‐
              tains supported Radeon graphics card (with enabled KMS) it configures it to automatic power saving.

       powersave
              Maximal  power saving, at the moment it enables USB autosuspend (in case environment variable USB_AUTOSUSPEND is set to 1), enables ALPM
              power saving for SATA host adapters and sets the link power manamgent policy to min_power.  It also enables WiFi power  saving,  enables
              multi  core  power  savings scheduler for low wakeup systems and makes sure the ondemand governor is active (if supported by the current
              cpufreq driver). It sets the CPU energy performance bias to powersave. It also enables AC97 audio power saving or (it  depends  on  your
              system)  HDA-Intel  power  savings  (with 10 seconds timeout). In case your system contains supported Radeon graphics card (with enabled
              KMS) it configures it to automatic power saving. On Asus Eee PCs dynamic Super Hybrid Engine is enabled.

       throughput-performance
              Profile for typical throughput performance tuning. Disables power saving  mechanisms  and  enables  sysctl  settings  that  improve  the
              throughput performance of your disk and network IO. CPU governor is set to performance and CPU energy performance bias is set to perfor‐
              mance. Disk readahead values are increased.

       latency-performance
              Profile for low latency performance tuning. Disables power saving mechanisms.  CPU governor is set to performance andlocked to the low C
              states (by PM QoS).  CPU energy performance bias to performance.

       network-throughput
              Profile  for  throughput network tuning. It is based on the throughput-performance profile. It additionaly increases kernel network buf‐
              fers.

       network-latency
              Profile for low latency network tuning. It is based on the latency-performance profile. It additionaly disables  transparent  hugepages,
              NUMA balancing and tunes several other network related sysctl parameters.

       desktop
              Profile optimized for desktops based on balanced profile. It additionaly enables scheduler autogroups for better response of interactive
              applications.

       virtual-guest
              Profile optimized for virtual guests based on throughput-performance profile.  It additionally decreases virtual memory  swappiness  and
              increases dirty_ratio settings.

       virtual-host
              Profile  optimized for virtual hosts based on throughput-performance profile.  It additionally enables more aggresive writeback of dirty
              pages.

FILES
       /etc/tuned/* /usr/lib/tuned/*

SEE ALSO
       tuned(8) tuned-adm(8) tuned-profiles-atomic(7) tuned-profiles-sap(7) tuned-profiles-sap-hana(7)  tuned-profiles-oracle(7)  tuned-profiles-real‐
       time(7) tuned-profiles-nfv(7) tuned-profiles-compat(7)

AUTHOR
       Jaroslav Škarvada <jskarvad@redhat.com> Jan Kaluža <jkaluza@redhat.com> Jan Včelák <jvcelak@redhat.com> Marcela Mašláňová <mmaslano@redhat.com>
       Phil Knirsch <pknirsch@redhat.com>

Fedora Power Management SIG                                           23 Sep 2014                                                    TUNED_PROFILES(7)

Let's look at the values inside the default profile (virtual-guest), which includes the throughput-performance profile. We focus on the swappiness value, which is set to 30.

[root]# cat /usr/lib/tuned/virtual-guest/tuned.conf 
#
# tuned configuration
#

[main]
include=throughput-performance

[sysctl]
# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up.  Workloads
# that mostly use file mappings may be able to use even higher values.
#
# The generator of dirty data starts writeback at this percentage (system default
# is 20%)
vm.dirty_ratio = 30

# Filesystem I/O is usually much more efficient than swapping, so try to keep
# swapping low.  It's usually safe to go even lower than this on systems with
# server-grade storage.
vm.swappiness = 30

One important point: the tuned profile virtual-guest includes the settings from the tuned profile throughput-performance:

[root]# cat /usr/lib/tuned/throughput-performance/tuned.conf
#
# tuned configuration
#

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100

[disk]
readahead=>4096

[sysctl]
# ktune sysctl settings for rhel6 servers, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec#  (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns = 10000000

# SCHED_OTHER wake-up granularity.
# (default: 1 msec#  (1 + ilog(ncpus)), units: nanoseconds)
#
# This option delays the preemption effects of decoupled workloads
# and reduces their over-scheduling. Synchronous workloads will still
# have immediate wakeup/sleep latencies.
kernel.sched_wakeup_granularity_ns = 15000000

# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up.  Workloads
# that mostly use file mappings may be able to use even higher values.
#
# The generator of dirty data starts writeback at this percentage (system default
# is 20%)
vm.dirty_ratio = 40

# Start background writeback (via writeback threads) at this percentage (system
# default is 10%)
vm.dirty_background_ratio = 10

# PID allocation wrap value.  When the kernel's next PID value
# reaches this value, it wraps back to a minimum PID value.
# PIDs of value pid_max or larger are not allocated.
#
# A suggested value for pid_max is 1024 * <# of cpu cores/threads in system>
# e.g., a box with 32 cpus, the default of 32768 is reasonable, for 64 cpus,
# 65536, for 4096 cpus, 4194304 (which is the upper limit possible).
#kernel.pid_max = 65536

# The swappiness parameter controls the tendency of the kernel to move
# processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory
# for as long as possible
# 100 tells the kernel to aggressively swap processes out of physical memory
# and move them to swap cache
vm.swappiness=10

 

Solution

There are various approaches to solve this issue:

  1. disable the tuned.service to switch back to the /etc/sysctl.conf values
  2. adapt the values in the virtual-guest profile, but what if they get updated automatically by the OS vendor in future patches or releases?
  3. create a new tuned profile based on virtual-guest and adapt the values
  4. use the tuned profile which Oracle ships in the Oracle Linux repository

I prefer solution 4, which is also the most useful way.

Here is what we need to do to solve it this way:

First of all, check/install the corresponding package from the Oracle Linux 7 repository (in my case it was already installed, as yum info shows):

[root]# yum info *tuned-profile*
Loaded plugins: ulninfo
Available Packages
Name        : tuned-profiles-oracle
Arch        : noarch
Version     : 2.5.1
Release     : 4.el7_2.3
Size        : 1.5 k
Repo        : installed
From repo   : ol7_latest
Summary     : Additional tuned profile(s) targeted to Oracle loads
URL         : https://fedorahosted.org/tuned/
License     : GPLv2+
Description : Additional tuned profile(s) targeted to Oracle loads.

Let's look at the values inside this tuned profile:

[root]# cat /usr/lib/tuned/oracle/tuned.conf 
#
# tuned configuration
#

[main]
include=throughput-performance

[sysctl]
vm.swappiness = 1
vm.dirty_background_ratio = 3
vm.dirty_ratio = 80
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
kernel.shmmax = 4398046511104
kernel.shmall = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
kernel.panic_on_oops = 1

[vm]
transparent_hugepages=never

Activate the profile, check which is really active and then check the current configuration value of the swappiness parameter:

[root]# tuned-adm profile oracle
[root]# tuned-adm active
Current active profile: oracle
[root]# cat /proc/sys/vm/swappiness 
1

Now we have the oracle tuned profile applied, which overrides some values that also come from Oracle via the oracle-rdbms-server-11gR2-preinstall or oracle-rdbms-server-12cR1-preinstall packages. In my case this is the list of duplicated parameters:

/usr/lib/tuned/oracle/tuned.conf
vm.swappiness = 1
vm.dirty_background_ratio = 3
vm.dirty_ratio = 80
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
kernel.shmmax = 4398046511104
kernel.shmall = 1073741824
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
kernel.panic_on_oops = 1

/etc/sysctl.conf from oracle-rdbms-server-11gR2-preinstall
# oracle-rdbms-server-11gR2-preinstall setting for fs.file-max is 6815744
fs.file-max = 6815744

# oracle-rdbms-server-11gR2-preinstall setting for kernel.sem is '250 32000 100 128'
kernel.sem = 250 32000 100 128

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmni is 4096
kernel.shmmni = 4096

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 1073741824 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmall is 2097152 on i386
kernel.shmall = 1073741824

# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4398046511104 on x86_64
# oracle-rdbms-server-11gR2-preinstall setting for kernel.shmmax is 4294967295 on i386
kernel.shmmax = 4398046511104

# oracle-rdbms-server-11gR2-preinstall setting for kernel.panic_on_oops is 1 per Orabug 19212317
kernel.panic_on_oops = 1

# oracle-rdbms-server-11gR2-preinstall setting for net.core.rmem_default is 262144
net.core.rmem_default = 262144

# oracle-rdbms-server-11gR2-preinstall setting for net.core.rmem_max is 4194304
net.core.rmem_max = 4194304

# oracle-rdbms-server-11gR2-preinstall setting for net.core.wmem_default is 262144
net.core.wmem_default = 262144

# oracle-rdbms-server-11gR2-preinstall setting for net.core.wmem_max is 1048576
net.core.wmem_max = 1048576

# oracle-rdbms-server-11gR2-preinstall setting for net.ipv4.conf.all.rp_filter is 2
net.ipv4.conf.all.rp_filter = 2

# oracle-rdbms-server-11gR2-preinstall setting for net.ipv4.conf.default.rp_filter is 2
net.ipv4.conf.default.rp_filter = 2

# oracle-rdbms-server-11gR2-preinstall setting for fs.aio-max-nr is 1048576
fs.aio-max-nr = 1048576

# oracle-rdbms-server-11gR2-preinstall setting for net.ipv4.ip_local_port_range is 9000 65500
net.ipv4.ip_local_port_range = 9000 65500

Summary

When we need to modify some values, we have to watch exactly what we apply and also the way in which we apply the values. The most important point is to control the outcome of setting values in /etc/sysctl.conf, as described in this blog. Tuned profiles are a good solution with which manufacturers or suppliers can distribute optimized values within their Linux distributions.

 

Cet article Stay tuned with kernel parameters est apparu en premier sur Blog dbi services.

VM Linux – Device not found : Network unreachable


Recently I exported an Oracle VirtualBox VM as an ".ova" file. When the file was imported again, I faced issues with the network devices inside the VM. This post is about what I experienced and how I resolved it.

In my example, I have two network devices configured for the VM in VirtualBox:
Importation OVA

When I logged in, I found that eth0 and eth1 were not available:

[root@MYSQL ~]# ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:960 (960.0 b) TX bytes:960 (960.0 b)
[root@MYSQL ~]#

The issue is with the udev rules for the network devices. You can check these in “/etc/udev/rules.d”. Usually the file which defines the rules for the network devices is named “70-persistent-net.rules”:

[root@MYSQL ~]# cd /etc/udev/rules.d/
[root@MYSQL rules.d]# ll
total 44
-rw-r--r--. 1 root root 1652 Nov 12 2010 60-fprint-autosuspend.rules
-rw-r--r--. 1 root root 1060 Nov 11 2010 60-pcmcia.rules
-rw-r--r--. 1 root root 316 Oct 15 2014 60-raw.rules
-rw-r--r--. 1 root root 134 Aug 18 2015 60-vboxadd.rules
-rw-r--r--. 1 root root 530 Apr 29 2015 70-persistent-cd.rules
-rw-r--r--. 1 root root 1245 Mar 19 19:23 70-persistent-net.rules
-rw-r--r--. 1 root root 320 Jan 12 2015 90-alsa.rules
-rw-r--r--. 1 root root 83 Oct 15 2014 90-hal.rules
-rw-r--r--. 1 root root 2486 Nov 11 2010 97-bluetooth-serial.rules
-rw-r--r--. 1 root root 308 Apr 15 2015 98-kexec.rules
-rw-r--r--. 1 root root 54 Dec 8 2011 99-fuse.rules

To resolve this, check the MAC addresses of the VM in VirtualBox and adapt the file with the same values that VirtualBox shows for the network devices:
MAC Address MAC Address v2

Why? Cloning or re-importing a VM causes VirtualBox to generate new MAC addresses for the network devices. The new devices are auto-detected when the VM boots and are added as new entries in "/etc/udev/rules.d/70-persistent-net.rules":

[root@MYSQL rules.d]# vi 70-persistent-net.rules
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.
#
# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:4a:32:87", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:87:cc:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:05:19:40", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:ba:d7:60", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:54:57:a8", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:09:6c:96", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"
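A quick way to see the mismatch (just a sketch, using the same tools as in this post) is to compare the MAC addresses the guest actually detects with the interface names udev has already recorded:

# MAC addresses of all adapters detected in the guest, including interfaces that are down
ifconfig -a | grep HWaddr

# MAC address to interface name mapping recorded by udev
grep "ATTR{address}" /etc/udev/rules.d/70-persistent-net.rules

In this example the two adapters that are actually present were recorded as eth4 and eth5, while eth0 and eth1 still point to MAC addresses that no longer exist.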

Edit the file "70-persistent-net.rules": comment out the stale entries and rename the entries that match the current MAC addresses to eth0 and eth1 (an alternative shortcut is sketched after the listing).

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.
#
# PCI device 0x8086:0x100e (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:4a:32:87", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x100e (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:87:cc:59", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x100e (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:05:19:40", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x100e (e1000)
#SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:ba:d7:60", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:54:57:a8", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x100e (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="08:00:27:09:6c:96", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
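As an alternative to editing the entries by hand (a commonly used shortcut rather than part of the procedure above, so make sure a reboot is acceptable), you can delete the rules file and let udev regenerate it at the next boot with only the adapters that are actually present:

# Remove the stale udev rules; the file is recreated automatically at the next boot
rm /etc/udev/rules.d/70-persistent-net.rules
reboot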

Execute the command "udevadm trigger --type=devices --action=add" to re-trigger the device events without a reboot, and then restart the network service:

[root@MYSQL network-scripts]# udevadm trigger --type=devices --action=add
[root@MYSQL network-scripts]# service network restart
Shutting down interface eth0:  [ OK ]
Shutting down interface eth1:  [ OK ]
Shutting down loopback interface:  [ OK ]
Bringing up loopback interface:  [ OK ]
Bringing up interface eth0:
Determining IP information for eth0... done.  [ OK ]
Bringing up interface eth1:
Determining if ip address 192.168.56.11 is already in use for device eth1...  [ OK ]
[root@MYSQL network-scripts]#

After this change, the network devices in the virtual machine are up and running again:

[root@MYSQL ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 08:00:27:09:6C:96
inet addr:10.0.0.11 Bcast:10.0.0.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe09:6c96/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1180 (1.1 KiB) TX bytes:1346 (1.3 KiB)
eth1 Link encap:Ethernet HWaddr 08:00:27:54:57:A8
inet addr:192.168.56.11 Bcast:192.168.56.255 Mask:255.255.255.0
inet6 addr: fe80::a00:27ff:fe54:57a8/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:506 (506.0 b) TX bytes:726 (726.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:40 errors:0 dropped:0 overruns:0 frame:0
TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3004 (2.9 KiB) TX bytes:3004 (2.9 KiB)
[root@MYSQL ~]#
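One more point worth checking, as an assumption about this kind of setup rather than something shown in the output above: if the ifcfg files under /etc/sysconfig/network-scripts contain HWADDR lines, they also have to match the new MAC addresses, otherwise the network service refuses to bring the interfaces up.

# Show any MAC addresses hard-coded in the interface configuration files
grep -i HWADDR /etc/sysconfig/network-scripts/ifcfg-eth*
# If old values show up here, replace them with the MAC addresses reported by ifconfig above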


The article "VM Linux – Device not found : Network unreachable" first appeared on Blog dbi services.
