
Re: advanced networking with public IPs direct to VMs


Dag


Do you mean check the pools under "Infrastructure -> Primary Storage" and "Infrastructure -> Secondary Storage" in the UI?


If so, Primary Storage has a state of Up; Secondary Storage does not show a state as such, so I am not sure where else to check it.


Rerun of the command -

mysql> select * from cloud.storage_pool where cluster_id = 1;
Empty set (0.00 sec)

mysql>
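
One other thing I could try (I am guessing at the schema here, so treat this as a sketch): if the pool had been created zone-wide rather than cluster-scoped, its cluster_id would be NULL and the query above would come back empty even though the UI shows the pool as Up. Dropping the cluster filter should list every pool the database knows about:

mysql> select id, name, pool_type, scope, cluster_id, status
    -> from cloud.storage_pool where removed is null;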

I think it is something to do with my zone creation rather than the NIC/bridge setup, although I can post those if needed.

I may try to set up just the 2-NIC solution you mentioned, although as I say I had the same issue with that, i.e. the host goes to "Alert" state with the same error messages. The only time I can get it to go to "Down" state is when everything is on a single NIC.

Quick question just to make sure. Assuming management/storage is on the same NIC: when I set up basic networking, the physical network already has the management and guest icons there and I just edit the KVM labels. If I am running storage over management, do I need to drag the storage icon onto the physical network and use the same KVM label (cloudbr0) as management, or does CS automatically just use the management NIC? In other words, would I only need to drag the storage icon across in a basic setup if I wanted it on a different NIC/IP subnet? (hope that makes sense!)
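
While I wait, I have also been looking at what the zone wizard actually recorded for the traffic labels (the table name is a guess on my part, so apologies if this is off):

mysql> select traffic_type, kvm_network_label
    -> from cloud.physical_network_traffic_types;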

On the plus side I have been at this for so long now and done so many rebuilds I could do it in my sleep now 😊


________________________________
From: Dag Sonstebo <Dag.Sonstebo@xxxxxxxxxxxxx>
Sent: 06 June 2018 12:28
To: users@xxxxxxxxxxxxxxxxxxxxx
Subject: Re: advanced networking with public IPs direct to VMs

Looks OK to me, Jon.

The one thing that throws me is your storage pools – can you rerun your query: select * from cloud.storage_pool where cluster_id = 1;

Do the pools show up as online in the CloudStack GUI?

Regards,
Dag Sonstebo
Cloud Architect
ShapeBlue

On 06/06/2018, 12:08, "Jon Marshall" <jms.123@xxxxxxxxxxxxx> wrote:

    Don't know whether this helps or not, but I logged into the SSVM and ran ifconfig -


    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 169.254.3.35  netmask 255.255.0.0  broadcast 169.254.255.255
            ether 0e:00:a9:fe:03:23  txqueuelen 1000  (Ethernet)
            RX packets 141  bytes 20249 (19.7 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 108  bytes 16287 (15.9 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.3.34  netmask 255.255.255.192  broadcast 172.30.3.63
            ether 1e:00:3b:00:00:05  txqueuelen 1000  (Ethernet)
            RX packets 56722  bytes 4953133 (4.7 MiB)
            RX errors 0  dropped 44573  overruns 0  frame 0
            TX packets 11224  bytes 1234932 (1.1 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.4.86  netmask 255.255.255.128  broadcast 172.30.4.127
            ether 1e:00:d9:00:00:53  txqueuelen 1000  (Ethernet)
            RX packets 366191  bytes 435300557 (415.1 MiB)
            RX errors 0  dropped 39456  overruns 0  frame 0
            TX packets 145065  bytes 7978602 (7.6 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    eth3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 172.30.5.14  netmask 255.255.255.240  broadcast 172.30.5.15
            ether 1e:00:cb:00:00:1a  txqueuelen 1000  (Ethernet)
            RX packets 132440  bytes 426362982 (406.6 MiB)
            RX errors 0  dropped 39446  overruns 0  frame 0
            TX packets 67443  bytes 423670834 (404.0 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

    lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
            inet 127.0.0.1  netmask 255.0.0.0
            loop  txqueuelen 1  (Local Loopback)
            RX packets 18  bytes 1440 (1.4 KiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 18  bytes 1440 (1.4 KiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


    so it has interfaces in both the management and the storage subnets (as well as guest).



    ________________________________
    From: Jon Marshall <jms.123@xxxxxxxxxxxxx>
    Sent: 06 June 2018 11:08
    To: users@xxxxxxxxxxxxxxxxxxxxx
    Subject: Re: advanced networking with public IPs direct to VMs

    Hi Rafael


    Thanks for the help, really appreciate it.


    So rerunning that command with all servers up -



    mysql> select * from cloud.storage_pool where cluster_id = 1 and removed is null;
    Empty set (0.00 sec)

    mysql>


    As for the storage IP: no, I'm not setting it to the management IP when I set up the zone, but the output of the SQL command suggests that is what has happened.

    As I said to Dag, I am using a different subnet for storage, i.e.:

    172.30.3.0/26  - management subnet
    172.30.4.0/25 -  guest VM subnet
    172.30.5.0/28 - storage

    the NFS server IP is 172.30.5.2

    Each compute node has 3 NICs with an IP from each subnet (I am assuming the management node only needs an IP in the management network?).

    When I add the zone in the UI I have one physical network with management (cloudbr0), guest (cloudbr1) and storage (cloudbr2).
    When I fill in the storage traffic page I use the range 172.30.5.10 - 14 as the free IPs, since I exclude the ones already allocated to the compute nodes and the NFS server.
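
    If it helps, I can also dump what the zone wizard stored for that storage
    range (I am guessing at the table name here, so treat it as a sketch):

    mysql> select * from cloud.dc_storage_network_ip_range;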

    I think maybe I am doing something wrong in the UI setup but it is not obvious to me what it is.

    What I might try today, unless you want me to keep the setup I have for more outputs, is to go back to 2 NICs: one for storage/management and one for guest VMs.

    I think the mistake I made last time with the 2-NIC setup when adding the zone was to assume storage would just run over management, so I did not drag and drop the storage icon and assign it to cloudbr0 as I did with management, which I now think is what I should have done?





    ________________________________
    From: Rafael Weingärtner <rafaelweingartner@xxxxxxxxx>
    Sent: 06 June 2018 10:54
    To: users
    Subject: Re: advanced networking with public IPs direct to VMs

    Jon, do not panic, we are here to help you :)
    So, I might have mistyped the SQL query. If you use "select * from
    cloud.storage_pool where cluster_id = 1 and removed is not null", you are
    listing the storage pools that have been removed. Therefore, the right query
    would be "select * from cloud.storage_pool where cluster_id = 1 and removed
    is null".

    There is also something else I do not understand. Are you setting the
    storage IP in the management subnet? I am not sure you should be doing it
    like this. Normally, I set all my storage (primary, when working with NFS,
    and secondary) to IPs in the storage subnet.
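
    A quick way to double-check what CloudStack recorded (just a sketch, using
    the same columns that show up in your cloud.host dump) is to compare the
    private and storage addresses side by side:

    mysql> select name, type, private_ip_address, storage_ip_address
        -> from cloud.host where removed is null;

    If storage_ip_address comes back equal to private_ip_address for the
    hypervisor hosts, the zone was built with storage running over the
    management network.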

    On Wed, Jun 6, 2018 at 6:49 AM, Dag Sonstebo <Dag.Sonstebo@xxxxxxxxxxxxx>
    wrote:

    > Hi Jon,
    >
    > I’m late to this thread and have possibly missed some things – but a
    > couple of observations:
    >
    > “When I add the zone and get to the storage web page I exclude the IPs
    > already used for the compute node NICs and the NFS server itself. …..”
    > “So the range is 172.30.5.1 -> 15 and the range I fill in is 172.30.5.10
    > -> 172.30.5.14.”
    >
    > I think you may have some confusion around the use of the storage network.
    > The important part here is to understand this is for *secondary storage*
    > use only – it has nothing to do with primary storage. This means the
    > storage network needs to be accessible to the SSVM and to the hypervisors,
    > and the secondary storage NFS pools need to be accessible on this network.
    >
    > The important part – this also means you *can not use the same IP ranges
    > for management and storage networks* - doing so means you will have issues
    > where effectively both hypervisors and SSVM can see the same subnet on two
    > NICs – and you end up in a routing black hole.
    >
    > So – you need to either:
    >
    > 1) Use different IP subnets on management and storage, or
    > 2) preferably just simplify your setup – stop using a secondary storage
    > network altogether and just allow secondary storage to use the management
    > network (which is default). Unless you have a very high I/O environment in
    > production you are just adding complexity by running separate management
    > and storage.
    >
    > Regards,
    > Dag Sonstebo
    > Cloud Architect
    > ShapeBlue
    >
    > On 06/06/2018, 10:18, "Jon Marshall" <jms.123@xxxxxxxxxxxxx> wrote:
    >
    >     I will disconnect the host this morning and test but before I do that
    > I ran this command when all hosts are up -
    >
    >
    >
    >
    >
    >      select * from cloud.host;
    >
    >     (output trimmed to the relevant columns – the full rows are far too
    >     wide for mail)
    >
    >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
    >     | id | name            | status | type               | private_ip_address | storage_ip_address | cluster_id | public_ip_address |
    >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
    >     |  1 | dcp-cscn1.local | Up     | Routing            | 172.30.3.3         | 172.30.3.3         |          1 | 172.30.4.3        |
    >     |  2 | v-2-VM          | Up     | ConsoleProxy       | 172.30.3.49        | 172.30.3.49        |       NULL | 172.30.4.98       |
    >     |  3 | s-1-VM          | Up     | SecondaryStorageVM | 172.30.3.34        | 172.30.3.34        |       NULL | 172.30.4.86       |
    >     |  4 | dcp-cscn2.local | Up     | Routing            | 172.30.3.4         | 172.30.3.4         |          1 | 172.30.4.4        |
    >     |  5 | dcp-cscn3.local | Up     | Routing            | 172.30.3.5         | 172.30.3.5         |          1 | 172.30.4.5        |
    >     +----+-----------------+--------+--------------------+--------------------+--------------------+------------+-------------------+
    >     5 rows in set (0.00 sec)
    >
    >
    >
    >     and you can see that it says the storage IP address is the same as the
    > private IP address (the management network).
    >
    >
    >     I also ran the command you provided using the Cluster ID number from
    > the table above -
    >
    >
    >
    >     mysql> select * from cloud.storage_pool where cluster_id = 1 and
    > removed is not null;
    >     Empty set (0.00 sec)
    >
    >     mysql>
    >
    >     So assuming I am reading this correctly that seems to be the issue.
    >
    >
    >     I am at a loss as to why though.
    >
    >
    >     I have a separate NIC for storage as described. When I add the zone
    > and get to the storage web page I exclude the IPs already used for the
    > compute node NICs and the NFS server itself. I do this because initially I
    > didn't and the SSVM started using the IP address of the NFS server.
    >
    >
    >     So the range is 172.30.5.1 -> 15 and the range I fill in is
    > 172.30.5.10 -> 172.30.5.14.
    >
    >
    >     And I used the label "cloudbr2" for storage.
    >
    >
    >     I must be doing this wrong somehow.
    >
    >
    >     Any pointers would be much appreciated.
    >
    >
    >
    >
    >     ________________________________
    >     From: Rafael Weingärtner <rafaelweingartner@xxxxxxxxx>
    >     Sent: 05 June 2018 16:13
    >     To: users
    >     Subject: Re: advanced networking with public IPs direct to VMs
    >
    >     That is interesting. Let's see the source of all truth...
    >     This is the code that is generating that odd message.
    >
    >     >         List<StoragePoolVO> clusterPools =
    >     >                 _storagePoolDao.listPoolsByCluster(agent.getClusterId());
    >     >         boolean hasNfs = false;
    >     >         for (StoragePoolVO pool : clusterPools) {
    >     >             if (pool.getPoolType() == StoragePoolType.NetworkFilesystem) {
    >     >                 hasNfs = true;
    >     >                 break;
    >     >             }
    >     >         }
    >     >         if (!hasNfs) {
    >     >             s_logger.warn("Agent investigation was requested on host " +
    >     >                     agent + ", but host does not support investigation " +
    >     >                     "because it has no NFS storage. Skipping investigation.");
    >     >             return Status.Disconnected;
    >     >         }
    >     >
    >
    >     There are two possibilities here. Either you do not have any NFS
    >     storage – is that the case? – or, for some reason, the call
    >     "_storagePoolDao.listPoolsByCluster(agent.getClusterId())" is not
    >     returning any NFS storage pools. Looking at "listPoolsByCluster" we
    >     will see that the following SQL is used:
    >
    >     Select * from storage_pool where cluster_id = <host'sClusterId> and
    >     removed is not null
    >
    >     Can you run that SQL and see what it returns when your hosts are
    >     marked as disconnected?
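    >
    >     (For what it is worth, my reading of the Java above is that the check
    >     boils down to something like this – a sketch, not the exact DAO query:
    >
    >     select id, name, pool_type from cloud.storage_pool
    >     where cluster_id = <host'sClusterId>
    >     and pool_type = 'NetworkFilesystem';
    >
    >     if that returns no rows, the investigator gives up and the host is
    >     reported as Disconnected.)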
    >
    >
    > On Tue, Jun 5, 2018 at 11:32 AM, Jon Marshall <jms.123@xxxxxxxxxxxxx>
    > wrote:
    >
    >     > I reran the tests with the 3 NIC setup. When I configured the zone
    > through
    >     > the UI I used the labels cloudbr0 for management, cloudbr1 for guest
    >     > traffic and cloudbr2 for NFS as per my original response to you.
    >     >
    >     >
    >     > When I pull the power to the node (dcp-cscn2.local), after about 5
    >     > mins the host status goes to "Alert" but never to "Down".
    >     >
    >     >
    >     > I get this in the logs -
    >     >
    >     >
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.h.KVMInvestigator]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent investigation was
    >     > requested on host Host[-4-Routing], but host does not support
    >     > investigation because it has no NFS storage. Skipping investigation.
    >     > 2018-06-05 15:17:14,382 DEBUG [c.c.h.HighAvailabilityManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) KVMInvestigator was able
    >     > to determine host 4 is in Disconnected
    >     > 2018-06-05 15:17:14,382 INFO  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) The agent from host 4
    >     > state determined is Disconnected
    >     > 2018-06-05 15:17:14,382 WARN  [c.c.a.m.AgentManagerImpl]
    >     > (AgentTaskPool-1:ctx-f4da4dc9) (logid:138e9a93) Agent is disconnected but
    >     > the host is still up: 4-dcp-cscn2.local
    >     >
    >     > I don't understand why it thinks there is no NFS storage as each
    >     > compute node has a dedicated storage NIC.
    >     >
    >     >
    >     > I also don't understand why it thinks the host is still up, i.e. what
    >     > test is it doing to determine that?
    >     >
    >     >
    >     > Am I just trying to get something working that is not supported ?
    >     >
    >     >
    >     > ________________________________
    >     > From: Rafael Weingärtner <rafaelweingartner@xxxxxxxxx>
    >     > Sent: 04 June 2018 15:31
    >     > To: users
    >     > Subject: Re: advanced networking with public IPs direct to VMs
    >     >
    >     > What type of failover are you talking about?
    >     > What ACS version are you using?
    >     > What hypervisor are you using?
    >     > How are you configuring your NICs in the hypervisor?
    >     > How are you configuring the traffic labels in ACS?
    >     >
    >     > On Mon, Jun 4, 2018 at 11:29 AM, Jon Marshall <jms.123@xxxxxxxxxxxxx>
    >     > wrote:
    >     >
    >     > > Hi all
    >     > >
    >     > >
    >     > > I am close to giving up on basic networking as I just cannot get
    >     > > failover working with multiple NICs (I am not even sure it is
    >     > > supported).
    >     > >
    >     > >
    >     > > What I would like is to use 3 NICs for management, storage and
    >     > > guest traffic. I would like to assign public IPs direct to the VMs,
    >     > > which is why I originally chose basic.
    >     > >
    >     > >
    >     > > If I switch to advanced networking do I just configure a guest VM
    >     > > with public IPs on one NIC and not both with the public traffic -
    >     > >
    >     > >
    >     > > would this work ?
    >     > >
    >     >
    >     >
    >     >
    >     > --
    >     > Rafael Weingärtner
    >     >
    >
    >
    >
    >     --
    >     Rafael Weingärtner
    >
    >
    >


    --
    Rafael Weingärtner



Dag.Sonstebo@xxxxxxxxxxxxx
www.shapeblue.com
53 Chandos Place, Covent Garden, London WC2N 4HS, UK
@shapeblue