16:19 <blackboxsw> #startmeeting Cloud-init bi-weekly status
16:19 <meetingology> Meeting started Mon Jun 10 16:19:45 2019 UTC. The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
16:19 <meetingology>
16:19 <meetingology> Available commands: action commands idea info link nick
16:19 <rharper> o/
16:20 <Odd_Bloke> o/
16:20 <blackboxsw> hi cloud-init folks. let's kick off the bi-weekly meeting again
16:21 <blackboxsw> our last meeting minutes are hosted on github
16:21 <blackboxsw> #link https://cloud-init.github.io
16:22 <blackboxsw> welcome all. Generally cloud-init upstream uses this meeting to provide a platform for status updates, raising questions or concerns and feature discussion. All are encouraged to participate as you see fit.
16:22 <blackboxsw> our format is the following topics: Previous Actions, Recent Changes, In-progress Development, Office Hours
16:23 <blackboxsw> interjections and additional topics are welcome
16:23 <blackboxsw> #topic Previous Actions
16:24 <blackboxsw> Checking last meeting's minutes we were clear of old actions.
16:24 <blackboxsw> so we'll jump to the next topic this week.
16:24 <blackboxsw> #topic Recent Changes
16:26 <blackboxsw> the following commits landed in cloud-init tip since the last status meeting
16:26 <blackboxsw> - Allow identification of OpenStack by Asset Tag
16:26 <blackboxsw> [Mark T. Voelker] ([LP: #1669875](https://bugs.launchpad.net/bugs/1669875))
16:26 <blackboxsw> - Fix spelling error making 'an Ubuntu' consistent.
[Brian Murray]
16:26 <blackboxsw> - run-container: centos: comment out the repo mirrorlist [Paride Legovini]
16:26 <blackboxsw> - netplan: update netplan key mappings for gratuitous-arp
16:26 <blackboxsw> [Ryan Harper] ([LP: #1827238](https://bugs.launchpad.net/bugs/1827238))
16:26 <ubot5> Launchpad bug 1669875 in OpenStack Compute (nova) "identify openstack vmware platform" [Wishlist,Confirmed]
16:26 <ubot5> Launchpad bug 1827238 in cloud-init "Machines fail to deploy because cloud-init needs to accept both netplan spellings for grat arp" [Medium,Fix committed]
16:30 <blackboxsw> I was poking around our trello board to see if we've moved other cloud-init related content into the done lane, but I think those commits about capture the recent work
16:30 <blackboxsw> #link https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin
16:30 <blackboxsw> #topic In Progress Development
16:31 <blackboxsw> our active reviews are located here (as mentioned in the topic)
16:31 <blackboxsw> #link https://code.launchpad.net/cloud-init/+activereviews
16:32 <blackboxsw> Goneri: thanks for all the work on freebsd branches, there has been some good momentum there
16:32 <blackboxsw> there is ongoing work from Azure datasource that will likely land in the next week or two
16:33 <paride> ^^ "run-container: centos: comment out the repo mirrorlist", only actually relevant when using an http/https proxy, in all the other cases the mirrorlist works as usual
16:33 <blackboxsw> and some network-related changes landing shortly
16:33 <blackboxsw> paride: thank you paride for the extra note
16:33 <AnhVoMSFT> blackboxsw can you share more details on the work from Azure datasource ? Any bug that we can reference?
16:33 <blackboxsw> I was thinking https://code.launchpad.net/~jasonzio/cloud-init/+git/cloud-init/+merge/364012 AnhVoMSFT
16:35 <rharper> related to sorting out covering all the network-related scenarios so that we configure network in a way that ensures access to IMDS and internet in the face of additional static ips on the same subnet as the primary interface, multiple dhcp interfaces with default routes,
16:35 <AnhVoMSFT> I see - I think there potentially needs some bigger change there, as there was some issue around identifying the primary/secondary NIC. We got confirmation from our networking team that the first NIC returned is the primary
16:35 <rharper> AnhVoMSFT: good to know; that was our observation
16:36 <rharper> AnhVoMSFT: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1815254 , related as well; the plan being to put in place some source-based routing;
16:36 <ubot5> Launchpad bug 1815254 in cloud-init (Ubuntu) "Azure multiple ips prevent access to metadata service" [Undecided,Confirmed]
16:38 <AnhVoMSFT> thanks rharper - is that something that should be changed/fixed from cloudinit, or is this more platform related?
16:38 <rharper> that's a good question; generally it would be *great* if a platform were to include source-routes and metrics in the config they send
16:38 <AnhVoMSFT> if the latter I will file a workitem on our side to go do some research and get the right team to take a look at it
16:39 <rharper> currently no cloud does this, rather *some* indicate a *primary* via metadata, and then the OS scripts apply a metric to all non-primary routes to ensure that default routes go to the primary
16:39 <AnhVoMSFT> I see - so I guess we can do similarly on Azure since we know what the primary is (first nic returned in IMDS)
16:40 <rharper> AnhVoMSFT: so in the short term, I think cloud-init should (where possible with the OS network config) provide additional tuning (likely post-scripts in some cases) to tune the routing for what cloud-init knows is the primary route
16:40 <rharper> AnhVoMSFT: yes, I prefer a primary=True or whatever, but it's good enough to have the current behavior documented (in the code)
16:40 <AnhVoMSFT> thanks rharper
16:40 <rharper> so if it changes/breaks, then we know
16:44 <rharper> I think that covers our in-progress items for the moment
16:45 <rharper> not sure if the bot will listen to me, but just in case
16:45 <robjo> Be mindful that in Azure the metadata service may lag behind by minutes w.r.t. secondary IPs on an interface
16:45 <rharper> #topic Office Hours
16:45 <rharper> robjo: in general, my awareness is that the instance has to be off line to change vnets and such; and booting back up has been enough time to see IMDS updated, do you see differently ?
16:46 <AnhVoMSFT> robjo that is good to know, I will check on that
16:46 <robjo> We've had various issues with cloud-netconfig due to the metadata server in Azure being slow and reverted to polling, which of course got us in trouble with API rate limits
16:46 <rharper> robjo: interesting
16:47 <rharper> We'll be here in channel so if you've got merges or bugs that need an eye or just questions, fire away
16:47 <AnhVoMSFT> robjo feel free to file a bug on that and we will investigate - IMDS is our partner team so we'll get some answer quickly there
16:48 <AnhVoMSFT> rharper, a couple things I want to ask for Office Hours
16:48 <robjo> AnhVoMSFT: We have been working with Stephen Zarkos on the issues
16:48 <blackboxsw> #topic Office Hours (next ~30 mins)
16:48 <AnhVoMSFT> robjo I will ping Stephen and get more detail and see if we have any follow up items
16:48 <blackboxsw> sorry folks got pulled away for a bit thx rharper
16:48 <robjo> And double checked that the polling direction was OK from the Microsoft perspective before we implemented that
16:49 <AnhVoMSFT> I see, glad you're not blocked on it
16:50 <robjo> rharper: We always had bug reports that upon reboot not everything was always configured when secondary IP addresses were in play. But theoretically yes upon reboot everything should be there
16:50 <AnhVoMSFT> rharper we have a customer who booted up a VM based on 18.04, which uses netplan. Cloudinit wrote a netplan file to the image. He then installed ifupdown, then had some networking change which triggered a mac address change. Upon rebooting, cloudinit tries to use eni, but netplan file was still there, which caused his VM to mess up the network config
16:50 <robjo> putting cloud-netconfig into polling mode pretty much addresses the issues we had reports about
16:51 <rharper> AnhVoMSFT: yes; that sounds very likely
16:51 <rharper> AnhVoMSFT: did they file a bug?
16:51 <rharper> cloud-init net "detects" which service is present
16:51 <AnhVoMSFT> I'm checking to see if this should be a bug, or that is expected behavior
16:51 <rharper> so if they did not uninstall netplan.io then cloud-init will likely prefer that over eni
16:52 <AnhVoMSFT> cloudinit actually prefers eni if ifupdown is installed, I think
16:52 <rharper> AnhVoMSFT: so the etc/netplan/*.yaml would only trigger things if netplan is still present; the systemd-generator will read yaml and write out networkd files
16:53 <AnhVoMSFT> right, I think the customer's mistake was to not uninstall netplan (or remove any netplan configuration file) after installing ifupdown
16:53 <rharper> AnhVoMSFT: right; I think we'll need to see the log and system state, but it sounds like an incomplete uninstall of netplan
16:53 <rharper> uninstall of netplan should be enough to make the cloud-init.yaml inert
16:54 <rharper> https://netplan.io/faq#how-to-go-back-to-ifupdown
16:54 <rharper> AnhVoMSFT: it *should* have automatically uninstalled netplan.io
16:54 <AnhVoMSFT> I'm not sure if there is much we can do from the cloudinit side - perhaps if choosing eni, disable the cloud-init netplan yaml
16:54 <rharper> AnhVoMSFT: well, we could check writable paths of the renderers
16:54 <AnhVoMSFT> rharper I don't think that is the behavior on 18.04 - installing ifupdown will not uninstall netplan
16:55 <rharper> AnhVoMSFT: you're right; =(
16:55 <rharper> that sort of feels like a bug in the packaging
16:55 <AnhVoMSFT> yes, I share the same sentiment
16:56 <AnhVoMSFT> I will go ahead and file a bug so even if we don't have a short term action we can still capture the discussion
16:57 <rharper> AnhVoMSFT: thanks, I'm pinging in #netplan and the bug will be great so we can figure out the right plan
16:59 <AnhVoMSFT> second question: We have an intern working in our team and as part of warming up in cloudinit he wrote some additional capabilities into cloud-init analyze, adding a "boot" module
(in addition to show/blame/dump), which collects timestamps of phases happening during vm booting up, but before cloudinit started, such as kernel initialization, systemd initialization..
17:00 <AnhVoMSFT> this should work for all clouds (he tested in AWS/GCP). Currently only works for distros that use systemd. He'll try to figure out how to get those counters for freebsd and others
17:00 <AnhVoMSFT> rharper since you were the original author of analyze, I'm trying to gauge the interest on this and we're open to suggestions/questions
17:01 <cyphermox> rharper: they can coexist and configure each their own interface, so it's not a conflict. It's no different than coexisting ifupdown and NetworkManager, or also NetworkManager and systemd-networkd
17:01 <rharper> AnhVoMSFT: that sounds excellent
17:01 <blackboxsw> nice AnhVoMSFT on the commandline extensions!
17:01 <rharper> AnhVoMSFT: happy to review branch or Work-in-Progress when it's available
17:02 <AnhVoMSFT> thanks rharper blackboxsw we will have that in a branch very soon.
17:03 <AnhVoMSFT> cyphermox if that is the case then either the customer or cloudinit needs to make sure the system does not have conflicting configuration for netplan/eni.
17:03 <rharper> cyphermox: ok; would you be open to some sort of warning about having config in both or something? I dunno; it's just not a great experience to add the new package, configure it, reboot and not have networking since the same interface was configured (differently) in both packages
17:03 <blackboxsw> yeah, I'm quite interested in any additional cli functionality that cloud-init more versatile as a system debug tool
17:04 <blackboxsw> *makes cloud-init more versatile*
17:04 <cyphermox> rharper: I'm not opposed to a warning, but that's not necessarily better UX.
17:05 <cyphermox> debconf prompts are quite annoying to have at upgrade, and just writing it out people are likely to miss it altogether
17:05 <cyphermox> (so you wouldn't really gain much)
17:05 <AnhVoMSFT> blackboxsw yep that was the goal - we want to be able to deploy 1000 VMs, then use cloud-init analyze output to analyze the 50th/99th percentile of where the timing was spent during system boot, and we need some more insights into phases before cloud-init started as well
17:05 <rharper> cyphermox: agreed; having a pointer to suggest cleaning/checking/confirming configs if /etc/netplan/ is non-empty and netplan.io is installed
17:06 <cyphermox> rharper: one option is to parse enough of /etc/network/ to catch mentions of the interface, but that's not necessarily super solid (though it's the best option), because people can rename interfaces in netplan and match by mac
17:06 <rharper> might be helpful; though I agree that they may still ignore that; and cloud-init could do some more work to see if an image has multiple renderers available and ensure it didn't leave config for a previous boot around
17:07 <rharper> cyphermox: yeah; cloud-init knows more about the config and both formats; we're likely in a better spot to see "you've configured this interface twice"
17:08 <cyphermox> rharper: so in short, I'm not opposed to improving the UX, but I'm not wowed by any solution right now (even mine)
17:09 <rharper> cyphermox: that's fair; thanks
17:09 <AnhVoMSFT> i think a fix in cloudinit might make most stakeholders happy here.
It knows which configuration file it wrote, so it can definitely look for conflicting configurations
17:09 <rharper> cyphermox: AnhVoMSFT is going to file the customer bug with details and we can discuss what (if any) improvements are to be made; I suspect cloud-init can help most here
17:09 <cyphermox> yes, I think so too
17:09 <rharper> cyphermox: thanks for the input
17:09 <AnhVoMSFT> it can't be responsible for everything the customer does though. If customer writes some my-own-netplan.yml, we can't help much
17:10 <cyphermox> rharper: but hey, if someone was to write a check when running netplan apply that there exists config in /etc/network, I wouldn't have much issues merging it
17:10 <rharper> AnhVoMSFT: right, we have several "maybe_delete_if" where we verify expected output before we remove things
17:10 <cyphermox> I just know I won't have time to look into this myself in the near future
17:10 <rharper> cyphermox: ack
17:11 <cyphermox> I think what will help most is aggressively deprecating and removing ifupdown
17:13 <cyphermox> that said, the best we can realistically do for the time being is to demote it to universe
17:13 <cyphermox> (and that's not going to change anything for UX)
17:15 <AnhVoMSFT> we had another instance of someone installing ifupdown2, which had the effect of removing cloud-init on debian/ubuntu 16.04
17:16 <AnhVoMSFT> and totally hosed his system, but that's a different issue altogether
17:26 <blackboxsw> s
17:27 <blackboxsw> thanks for the good discussion folks, I guess we'll just add an action item to followup on a netplan bug for next time to see where we are at
17:31 <blackboxsw> #action follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 * meetingology follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 <blackboxsw> ok, I'll post minutes on this. thank you again rharper for driving
17:31 <blackboxsw> and for the participation robjo cyphermox and AnhVoMSFT
17:31 <blackboxsw> #endmeeting
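
Note on the netplan/ifupdown action item: the check discussed in the meeting, spotting an interface that is configured in both /etc/network/ and /etc/netplan/, could look roughly like the sketch below. This is an illustration only, not cloud-init or netplan code. The function names are hypothetical, the netplan config is assumed to be already parsed into a dict, and the eni text is assumed to use plain `iface <name> inet|inet6 <method>` stanzas. As cyphermox pointed out, renaming and matching by MAC make such a check less than solid; the sketch only honors `set-name`.

```python
import re

# Hypothetical helpers for illustration; not part of cloud-init or netplan.
# Matches ifupdown stanzas such as "iface eth0 inet dhcp".
IFACE_RE = re.compile(r"^\s*iface\s+(\S+)\s+inet6?\s+\S+", re.MULTILINE)


def eni_interfaces(eni_text):
    """Return interface names declared by 'iface' stanzas, ignoring loopback."""
    return {name for name in IFACE_RE.findall(eni_text) if name != "lo"}


def netplan_interfaces(config):
    """Return interface names from an already-parsed netplan config dict."""
    network = config.get("network", {})
    names = set()
    for section in ("ethernets", "bonds", "bridges", "vlans"):
        for key, body in (network.get(section) or {}).items():
            # 'set-name' renames the device; otherwise use the YAML key.
            names.add((body or {}).get("set-name", key))
    return names


def doubly_configured(eni_text, netplan_config):
    """Return sorted interface names that appear in both configurations."""
    return sorted(eni_interfaces(eni_text) & netplan_interfaces(netplan_config))


eni = """auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp
"""
netplan = {
    "network": {
        "version": 2,
        "ethernets": {"eth0": {"dhcp4": True}, "eth1": {"dhcp4": True}},
    }
}
conflicts = doubly_configured(eni, netplan)
print(conflicts)
```

With the sample inputs, only `eth0` is reported, since `lo` is skipped and `eth1` exists only on the netplan side. A real check would also need to read every file under both directories and resolve `match:` rules against the running system.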