Re: advanced networking with public IPs direct to VMs


Hi Jon,

I’m late to this thread and have possibly missed some things – but a couple of observations:

“When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. …..”
“So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.”

I think you may have some confusion around the use of the storage network. The important part here is to understand this is for *secondary storage* use only – it has nothing to do with primary storage. This means the storage network needs to be accessible to the SSVM and to the hypervisors, and the secondary storage NFS pools need to be reachable on this network.

The important part – this also means you *cannot use the same IP ranges for the management and storage networks*. Doing so means both the hypervisors and the SSVM see the same subnet on two NICs – and you end up in a routing black hole.
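A quick way to spot this black-hole condition on a hypervisor is to look for the same subnet appearing behind two bridges in the routing table. A minimal sketch, using illustrative `ip route` output with this thread's bridge labels (cloudbr0/cloudbr2) – the addresses are examples, not taken from your hosts:

```shell
# Sample `ip route` output where management and storage share a subnet;
# `uniq -d` prints any subnet reachable via more than one interface.
routes="172.30.3.0/26 dev cloudbr0 proto kernel scope link
172.30.3.0/26 dev cloudbr2 proto kernel scope link
172.30.4.0/25 dev cloudbr1 proto kernel scope link"
echo "$routes" | awk '{print $1}' | sort | uniq -d
```

On a live host you would feed `ip route show` into the same pipeline; any subnet printed is carried by more than one NIC.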

So – you need to either:

1) Use different IP subnets on management and storage, or
2) Preferably, just simplify your setup – stop using a secondary storage network altogether and allow secondary storage to use the management network (which is the default). Unless you have a very high I/O environment in production, you are just adding complexity by running separate management and storage networks.
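As an illustration of option 1, using the ranges already mentioned in this thread – the /26 and /28 masks are my assumptions, not taken from your configuration:

```
Management network: 172.30.3.0/26  (hypervisors + system VM management IPs)
Storage network:    172.30.5.0/28  (storage range 172.30.5.10 -> 172.30.5.14,
                                    excluding the NFS server and host NICs)
```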

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 10:18, "Jon Marshall" <jms.123@xxxxxxxxxxxxx> wrote:

    I will disconnect the host this morning and test but before I do that I ran this command when all hosts are up -
    
    
    
    
    
     select * from cloud.host;
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    | id | name            | uuid                                 | status | type               | private_ip_address | private_netmask | private_mac_address | storage_ip_address | storage_netmask | storage_mac_address | storage_ip_address_2 | storage_mac_address_2 | storage_netmask_2 | cluster_id | public_ip_address | public_netmask  | public_mac_address | proxy_port | data_center_id | pod_id | cpu_sockets | cpus | speed | url                                 | fs_type | hypervisor_type | hypervisor_version | ram        | resource | version  | parent | total_size | capabilities | guid                                                          | available | setup | dom0_memory | last_ping  | mgmt_server_id | disconnected        | created             | removed | update_count | resource_state | owner | lastUpdated | engine_state |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    |  1 | dcp-cscn1.local | d97b930c-ab5f-4b7d-9243-eabd60012284 | Up     | Routing            | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | 172.30.3.3         | 255.255.255.192 | 00:22:19:92:4e:34   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.3        | 255.255.255.128 | 00:22:19:92:4e:35  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:fa437fb0c023 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 9f2b15cb-1b75-321b-bf59-f83e7a5e8efb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:44:33 | NULL    |            4 | Enabled        | NULL  | NULL        | Disabled     |
    |  2 | v-2-VM          | ce1f4594-2b4f-4b2b-a239-3f5e2c2215b0 | Up     | ConsoleProxy       | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | 172.30.3.49        | 255.255.255.192 | 1e:00:80:00:00:14   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.98       | 255.255.255.128 | 1e:00:c9:00:00:5f  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | Proxy.2-ConsoleProxyResource                                  |         1 |     0 |           0 | 1492390409 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:22 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  3 | s-1-VM          | 107d0a8e-e2d1-42b5-8b9d-ff3845bb556c | Up     | SecondaryStorageVM | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | 172.30.3.34        | 255.255.255.192 | 1e:00:3b:00:00:05   | NULL                 | NULL                  | NULL              |       NULL | 172.30.4.86       | 255.255.255.128 | 1e:00:d9:00:00:53  |       NULL |              1 |      1 |        NULL | NULL |  NULL | NoIqn                               | NULL    | NULL            | NULL               |          0 | NULL     | 4.11.0.0 | NULL   |       NULL | NULL         | s-1-VM-NfsSecondaryStorageResource                            |         1 |     0 |           0 | 1492390407 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:27 | NULL    |            7 | Enabled        | NULL  | NULL        | Disabled     |
    |  4 | dcp-cscn2.local | f0c076cb-112f-4f4b-a5a4-1a96ffac9794 | Up     | Routing            | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | 172.30.3.4         | 255.255.255.192 | 00:26:b9:4a:97:7d   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.4        | 255.255.255.128 | 00:26:b9:4a:97:7e  |       NULL |              1 |      1 |           1 |    2 |  2999 | iqn.1994-05.com.redhat:e9b4aa7e7881 | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 40e58399-fc7a-3a59-8f48-16d0f99b11c9-LibvirtComputingResource |         1 |     0 |           0 | 1492450882 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:46:33 | NULL    |            8 | Enabled        | NULL  | NULL        | Disabled     |
    |  5 | dcp-cscn3.local | 0368ae16-550f-43a9-bb40-ee29d2b5c274 | Up     | Routing            | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | 172.30.3.5         | 255.255.255.192 | 00:24:e8:73:6a:b2   | NULL                 | NULL                  | NULL              |          1 | 172.30.4.5        | 255.255.255.128 | 00:24:e8:73:6a:b3  |       NULL |              1 |      1 |           1 |    2 |  3000 | iqn.1994-05.com.redhat:ccdce43aff1c | NULL    | KVM             | NULL               | 7510159360 | NULL     | 4.11.0.0 | NULL   |       NULL | hvm,snapshot | 10bb1c01-0e92-3108-8209-37f3eebad8fb-LibvirtComputingResource |         1 |     0 |           0 | 1492390408 |   146457912294 | 2018-06-05 14:09:22 | 2018-06-05 13:47:04 | NULL    |            6 | Enabled        | NULL  | NULL        | Disabled     |
    +----+-----------------+--------------------------------------+--------+--------------------+--------------------+-----------------+---------------------+--------------------+-----------------+---------------------+----------------------+-----------------------+-------------------+------------+-------------------+-----------------+--------------------+------------+----------------+--------+-------------+------+-------+-------------------------------------+---------+-----------------+--------------------+------------+----------+----------+--------+------------+--------------+---------------------------------------------------------------+-----------+-------+-------------+------------+----------------+---------------------+---------------------+---------+--------------+----------------+-------+-------------+--------------+
    5 rows in set (0.00 sec)
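    A narrower projection makes the same point without the very wide table – the column names below are taken from the output above:

    ```sql
    -- show only the addressing columns from cloud.host
    select id, name, type, private_ip_address, storage_ip_address, cluster_id
    from cloud.host;
    ```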
    
    
    
    and you can see that it says the storage IP address is the same as the private IP address (the management network).
    
    
    I also ran the command you provided using the Cluster ID number from the table above -
    
    
    
    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is not null;
    Empty set (0.00 sec)
    
    mysql>
    
    So, assuming I am reading this correctly, that seems to be the issue.
    
    
    I am at a loss as to why though.
    
    
    I have a separate NIC for storage as described. When I add the zone and get to the storage web page I exclude the IPs already used for the compute node NICs and the NFS server itself. I do this because initially I didn't and the SSVM started using the IP address of the NFS server.
    
    
    So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10 -> 172.30.5.14.
    
    
    And I used the label "cloudbr2" for storage.
    
    
    I must be doing this wrong somehow.
    
    
    Any pointers would be much appreciated.
    
    
    
    
    ________________________________
    From: Rafael Weingärtner <rafaelweingartner@xxxxxxxxx>
    Sent: 05 June 2018 16:13
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs
    
    That is interesting. Let's see the source of all truth...
    This is the code that is generating that odd message.
    
    >     List<StoragePoolVO> clusterPools =
    > _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >         boolean hasNfs = false;
    >         for (StoragePoolVO pool : clusterPools) {
    >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
    >                 hasNfs = true;
    >                 break;
    >             }
    >         }
    >         if (!hasNfs) {
    >             s_logger.warn(
    >                     "Agent investigation was requested on host " + agent +
    > ", but host does not support investigation because it has no NFS storage.
    > Skipping investigation.");
    >             return Status.Disconnected;
    >         }
    >
    
    There are two possibilities here. Either you do not have any NFS storage,
    or, for some reason, the call
    "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not returning
    any NFS storage pools. Looking at "listPoolsByCluster", we will see that
    the following SQL is used:
    
    > Select * from storage_pool where cluster_id = <host'sClusterId>
    > and removed is not null
    >
    
    Can you run that SQL and check its output when your hosts are marked as
    disconnected?
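    For completeness – generated DAO queries normally exclude removed rows, so it may be worth running both variants; the `removed is null` filter below is my assumption about the generated SQL, not copied from the code above:

    ```sql
    -- active (non-removed) pools in the cluster (assumed filter):
    select id, name, pool_type from cloud.storage_pool
    where cluster_id = 1 and removed is null;

    -- removed pools, as in the query quoted above:
    select id, name, pool_type from cloud.storage_pool
    where cluster_id = 1 and removed is not null;
    ```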
    
    
Dag.Sonstebo@xxxxxxxxxxxxx 
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue

On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@xxxxxxxxxxxxx> wrote:
    
    > I reran the tests with the 3 NIC setup. When I configured the zone through
    > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    > traffic and cloudbr2 for NFS as per my original response to you.
    >
    >
    > When I pull the power to the node (dcp-cscn2.local) after about 5 mins
    > the host status goes to "Alert" but never to "Down"
    >
    >
    > I get this in the logs -
    >
    >
    > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
    > requested on host Host[-4-Routing], but host does not support investigation
    > because it has no NFS storage. Skipping investigation.
    > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able to
    > determine host 4 is in Disconnected
    > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4 state
    > determined is Disconnected
    > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
    > the host is still up: 4-dcp-cscn2.local
    >
    > I don't understand why it thinks there is no NFS storage as each compute
    > node has a dedicated storage NIC.
    >
    >
    > I also don't understand why it thinks the host is still up, i.e. what test
    > is it doing to determine that?
    >
    >
    > Am I just trying to get something working that is not supported?
    >
    >
    > ________________________________
    > From: Rafael Weingärtner <rafaelweingartner@xxxxxxxxx>
    > Sent: 04 June 2018 15:31
    > To: users
    > Subject: Re: advanced networking with public IPs direct to VMs
    >
    > What type of failover are you talking about?
    > What ACS version are you using?
    > What hypervisor are you using?
    > How are you configuring your NICs in the hypervisor?
    > How are you configuring the traffic labels in ACS?
    >
    > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@xxxxxxxxxxxxx>
    > wrote:
    >
    > > Hi all
    > >
    > >
    > > I am close to giving up on basic networking as I just cannot get failover
    > > working with multiple NICs (I am not even sure it is supported).
    > >
    > >
    > > What I would like is to use 3 NICs for management, storage and guest
    > > traffic. I would like to assign public IPs direct to the VMs which is
    > why I
    > > originally chose basic.
    > >
    > >
    > > If I switch to advanced networking do I just configure a guest VM with
    > > public IPs on one NIC and not both with the public traffic -
    > >
    > >
    > > would this work?
    > >
    >
    >
    >
    > --
    > Rafael Weingärtner
    >
    
    
    
    --
    Rafael Weingärtner