16:19 <blackboxsw> #startmeeting Cloud-init bi-weekly status
16:19 <meetingology> Meeting started Mon Jun 10 16:19:45 2019 UTC.  The chair is blackboxsw. Information about MeetBot at http://wiki.ubuntu.com/meetingology.
16:19 <meetingology> 
16:19 <meetingology> Available commands: action commands idea info link nick
16:19 <rharper> o/
16:20 <Odd_Bloke> o/
16:20 <blackboxsw> hi cloud-init folks. let's kick off the bi-weekly meeting again
16:21 <blackboxsw> our last meeting minutes are hosted on github
16:21 <blackboxsw> #link https://cloud-init.github.io
16:22 <blackboxsw> welcome all. Generally cloud-init upstream uses this meeting to provide a platform for status updates, raising questions or concerns and feature discussion. All are encouraged to participate as you see fit.
16:22 <blackboxsw> our format is the following topics: Previous Actions, Recent Changes, In-progress Development, Office Hours
16:23 <blackboxsw> interjections and additional topics are welcome
16:23 <blackboxsw> #topic Previous Actions
16:24 <blackboxsw> Checking last meeting's minutes, we were clear of old actions.
16:24 <blackboxsw> so we'll jump to the next topic this week.
16:24 <blackboxsw> #topic Recent Changes
16:26 <blackboxsw> the following commits landed in cloud-init tip since the last status meeting
16:26 <blackboxsw> - Allow identification of OpenStack by Asset Tag
16:26 <blackboxsw> [Mark T. Voelker] ([LP: #1669875](https://bugs.launchpad.net/bugs/1669875))
16:26 <blackboxsw> - Fix spelling error making 'an Ubuntu' consistent. [Brian Murray]
16:26 <blackboxsw> - run-container: centos: comment out the repo mirrorlist [Paride Legovini]
16:26 <blackboxsw> - netplan: update netplan key mappings for gratuitous-arp
16:26 <blackboxsw> [Ryan Harper] ([LP: #1827238](https://bugs.launchpad.net/bugs/1827238))
16:26 <ubot5> Launchpad bug 1669875 in OpenStack Compute (nova) "identify openstack vmware platform" [Wishlist,Confirmed]
16:26 <ubot5> Launchpad bug 1827238 in cloud-init "Machines fail to deploy because cloud-init needs to accept both netplan spellings for grat arp" [Medium,Fix committed]
16:30 <blackboxsw> I was poking around our Trello board to see if we've moved other cloud-init related content into the done lane, but I think those commits about capture the recent work
16:30 <blackboxsw> #link  https://trello.com/b/hFtWKUn3/daily-cloud-init-curtin
16:30 <blackboxsw> #topic In Progress Development
16:31 <blackboxsw> our active reviews are located here (as mentioned in the topic)
16:31 <blackboxsw> #link https://code.launchpad.net/cloud-init/+activereviews
16:32 <blackboxsw> Goneri: thanks for all the work on freebsd branches, there has been some good momentum there
16:32 <blackboxsw> there is ongoing work from Azure datasource that will likely land in the next week or two
16:33 <paride> ^^ "run-container: centos: comment out the repo mirrorlist" is only actually relevant when using an http/https proxy; in all the other cases the mirrorlist works as usual
16:33 <blackboxsw> and some network-related changes landing shortly
16:33 <blackboxsw> paride: thank you paride  for the extra note
16:33 <AnhVoMSFT> blackboxsw can you share more details on the work from Azure datasource ? Any bug that we can reference?
16:33 <blackboxsw> I was thinking https://code.launchpad.net/~jasonzio/cloud-init/+git/cloud-init/+merge/364012 AnhVoMSFT
16:35 <rharper> related to sorting out coverage of all the network-related scenarios, so that we configure the network in a way that ensures access to IMDS and the internet in the face of additional static IPs on the same subnet as the primary interface, or multiple DHCP interfaces with default routes
16:35 <AnhVoMSFT> I see - I think a bigger change is potentially needed there, as there was some issue around identifying the primary/secondary NIC. We got confirmation from our networking team that the first NIC returned is the primary
16:35 <rharper> AnhVoMSFT: good to know; that was our observation
16:36 <rharper> AnhVoMSFT: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1815254 , related  as well;  the plan being to put in place some source-based routing;
16:36 <ubot5> Launchpad bug 1815254 in cloud-init (Ubuntu) "Azure multiple ips prevent access to metadata service" [Undecided,Confirmed]
16:38 <AnhVoMSFT> thanks rharper - is that something that should be changed/fixed from cloudinit, or is this more platform related?
16:38 <rharper> that's a good question;  generally it would be *great* if a platform were to include source-routes and metrics in the config they send
16:38 <AnhVoMSFT> if the latter I will file a workitem on our side to go do some research and get the right team to take a look at it
16:39 <rharper> currently no cloud does this, rather *some* indicate a *primary* via metadata, and then the OS scripts apply a metric to all non-primary routes to ensure that default routes go to the primary
16:39 <AnhVoMSFT> I see - so I guess we can do similarly on Azure since we know what the primary is (first nic returned in IMDS)
16:40 <rharper> AnhVoMSFT: so in the short term, I think cloud-init should (where possible with the OS network config) provide additional tuning (likely post-scripts in some cases) to tune the routing for what cloud-init knows is the primary route
16:40 <rharper> AnhVoMSFT: yes, I prefer a primary=True or whatever, but it's good enough to have the current behavior documented (in the code)
16:40 <AnhVoMSFT> thanks rharper
16:40 <rharper> so if it changes/breaks, then we know
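The routing idea discussed above (treat the first NIC returned by IMDS as primary and give every non-primary default route a worse metric) might be sketched roughly as follows. This is only an illustration: the function name, the metric values, and the input shape are hypothetical and are not taken from cloud-init.

```python
from typing import Dict, List

# Assumed values, chosen purely for illustration.
PRIMARY_METRIC = 100
SECONDARY_STEP = 100


def assign_route_metrics(nics: List[str]) -> Dict[str, int]:
    """Map NIC names to default-route metrics.

    `nics` is assumed to be in the order the metadata service returned
    them, so the first entry is treated as the primary interface and
    receives the lowest (most preferred) metric.
    """
    return {
        name: PRIMARY_METRIC + index * SECONDARY_STEP
        for index, name in enumerate(nics)
    }


if __name__ == "__main__":
    print(assign_route_metrics(["eth0", "eth1", "eth2"]))
```

Keeping the "first NIC is primary" assumption in one small, documented function matches the point made in the discussion: if the platform behavior ever changes, there is a single obvious place where it breaks.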
16:44 <rharper> I think that covers our in-progress items for the moment
16:45 <rharper> not sure if the bot will listen to me, but just in case
16:45 <robjo> Be mindful that in Azure the metadata service may lag behind by minutes w.r.t. secondary IPs on an interface
16:45 <rharper> #topic Office Hours
16:45 <rharper> robjo: in general, my awareness is that the instance has to be offline to change vnets and such, and booting back up has been enough time to see IMDS updated; do you see differently?
16:46 <AnhVoMSFT> robjo that is good to know, I will check on that
16:46 <robjo> We've had various issues with cloud-netconfig due to the metadata server in Azure being slow and reverted to polling, which of course got us in trouble with API rate limits
16:46 <rharper> robjo: interesting
16:47 <rharper> We'll be here in channel, so if you've got merges or bugs that need an eye, or just questions, fire away
16:47 <AnhVoMSFT> robjo feel free to file a bug on that and we will investigate - IMDS is our partner team so we'll get some answer quickly there
16:48 <AnhVoMSFT> rharper, a couple things I want to ask for Office Hours
16:48 <robjo> AnhVoMSFT: We have been working with Stephen Zarkos on the issues
16:48 <blackboxsw> #topic Office Hours (next ~30 mins)
16:48 <AnhVoMSFT> robjo I will ping Stephen and get more detail and see if we have any follow up items
16:48 <blackboxsw> sorry folks got pulled away for a bit thx rharper
16:48 <robjo> And double checked that the polling direction was OK from the Microsoft perspective before we implemented that
16:49 <AnhVoMSFT> I see, glad you're not blocked on it
16:50 <robjo> rharper: We always had bug reports that upon reboot not everything was always configured when secondary IP addresses were in play. But theoretically yes upon reboot everything should be there
16:50 <AnhVoMSFT> rharper we have a customer who booted up a VM based on 18.04, which uses netplan. Cloudinit wrote a netplan file to the image. He then installed ifupdown, then had some networking change which triggered a mac address change. Upon rebooting, cloudinit tries to use eni, but netplan file was still there, which caused his VM to mess up the network config
16:50 <robjo> putting cloud-netconfig into polling mode pretty much addresses the issues we had reports about
16:51 <rharper> AnhVoMSFT: yes; that sounds very likely
16:51 <rharper> AnhVoMSFT: did they file a bug?
16:51 <rharper> cloud-init net "detects" which service is present
16:51 <AnhVoMSFT> I'm checking to see if this should be a bug, or that is expected behavior
16:51 <rharper> so if they did not uninstall netplan.io then cloud-init will likely prefer that over eni
16:52 <AnhVoMSFT> cloudinit actually prefers eni if ifupdown is installed, I think
16:52 <rharper> AnhVoMSFT: so the /etc/netplan/*.yaml would only trigger things if netplan is still present; the systemd-generator will read the yaml and write out networkd files
16:53 <AnhVoMSFT> right, I think the customer's mistake was to not uninstall netplan (or remove any netplan configuration file) after installing ifupdown
16:53 <rharper> AnhVoMSFT: right;  I think we'll need to see the log and system state, but it sounds like an incomplete uninstall of netplan
16:53 <rharper> uninstall of netplan should be enough to make the cloud-init.yaml inert
16:54 <rharper> https://netplan.io/faq#how-to-go-back-to-ifupdown
16:54 <rharper> AnhVoMSFT: it *should* have automatically uninstalled netplan.io
16:54 <AnhVoMSFT> I'm not sure if there is much we can do from the cloudinit side - perhaps if choosing eni, disable the cloud-init netplan yaml
16:54 <rharper> AnhVoMSFT: well, we could check writable paths of the renderers
16:54 <AnhVoMSFT> rharper I don't think that is the behavior on 18.04 - installing ifupdown will not uninstall netplan
16:55 <rharper> AnhVoMSFT: you're right; =(
16:55 <rharper> that sort of feels like a bug in the packaging
16:55 <AnhVoMSFT> yes, I share the same sentiment
16:56 <AnhVoMSFT> I will go ahead and file a bug so even if we don't have a short term action we can still capture the discussion
16:57 <rharper> AnhVoMSFT: thanks, I'm pinging in #netplan  and the bug will be great so we can figure out the right plan
16:59 <AnhVoMSFT> second question: We have an intern working in our team and as part of warming up in cloudinit he wrote some additional capabilities into cloud-init analyze, adding a "boot" module (in addition to show/blame/dump), which collects timestamps of phases happening during vm booting up, but before cloudinit started, such as kernel initialization, systemd initialization..
17:00 <AnhVoMSFT> this should work for all clouds (he tested in AWS/GCP). Currently it only works for distros that use systemd. He'll try to figure out how to get those counters for freebsd and others
17:00 <AnhVoMSFT> rharper since you were the original author of analyze, I'm trying to gauge the interest on this and we're open to suggestions/questions
17:01 <cyphermox> rharper: they can coexist and configure each their own interface, so it's not a conflict. It's no different than coexisting ifupdown and NetworkManager, or also NetworkManager and systemd-networkd
17:01 <rharper> AnhVoMSFT: that sounds excellent
17:01 <blackboxsw> nice AnhVoMSFT on the commandline extensions!
17:01 <rharper> AnhVoMSFT: happy to review a branch or Work-in-Progress when it's available
17:02 <AnhVoMSFT> thanks rharper blackboxsw we will have that in a branch very soon.
17:03 <AnhVoMSFT> cyphermox if that is the case then either the customer or cloudinit needs to make sure the system does not have conflicting configuration for netplan/eni.
17:03 <rharper> cyphermox: ok; would you be open to some sort of warning about having config in both or something? I dunno; it's just not a great experience to add the new package, configure it, reboot and not have networking since the same interface was configured (differently) in both packages
17:03 <blackboxsw> yeah, I'm quite interested in any additional cli functionality that cloud-init more versatile as a system debug tool
17:04 <blackboxsw> *makes cloud-init more versatile*
17:04 <cyphermox> rharper: I'm not opposed to a warning, but that's not necessarily better UX.
17:05 <cyphermox> debconf prompts are quite annoying to have at upgrade, and just writing it out people are likely to miss it altogether
17:05 <cyphermox> (so you wouldn't really gain much)
17:05 <AnhVoMSFT> blackboxsw yep that was the goal - we want to be able to deploy 1000 VMs, then use cloud-init analyze output to analyze the 50th/99th percentile of where the timing was spent during system boot, and we need some more insights into phases before cloud-init started as well
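The percentile analysis described above could look something like the following minimal sketch. It assumes per-VM boot durations have already been collected into a list of numbers (how they are extracted from `cloud-init analyze` output is left out), and it uses the nearest-rank definition of a percentile; the helper name and the sample values are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile of `samples`.

    `pct` is a percentage in the range (0, 100].
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest-rank method: the smallest value with at least pct% of
    # the samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical boot durations in seconds, one per VM.
    boot_seconds = [21.4, 19.8, 35.2, 20.1, 22.7, 48.9, 20.5, 21.0]
    print("p50:", percentile(boot_seconds, 50))
    print("p99:", percentile(boot_seconds, 99))
```

Running the same helper separately over each phase (kernel, systemd, cloud-init stages) would show which phase dominates the tail.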
17:05 <rharper> cyphermox: agreed; having a pointer to suggest cleaning/checking/confirming configs if /etc/netplan/ is non-empty and netplan.io is installed
17:06 <cyphermox> rharper: one option is to parse enough of /etc/network/ to catch mentions of the interface, but that's not necessarily super solid (though it's the best option), because people can rename interfaces in netplan and match by mac
17:06 <rharper> might be helpful; though I agree that they may still ignore that; and cloud-init could do some more work to see if an image has multiple renderers available and ensure it didn't leave config for a previous boot around
17:07 <rharper> cyphermox: yeah; cloud-init knows more about the config and both formats; we're likely in a better spot to see "you've configured this interface twice"
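A rough sketch of the "you've configured this interface twice" check floated above is given below. It is illustrative only: the function names are hypothetical, the netplan config is assumed to be already parsed into a dict, and only `iface` stanzas in the eni text and `ethernets` entries in netplan are considered. As noted in the discussion, renames and match-by-MAC make this kind of name comparison less than fully reliable.

```python
import re
from typing import Any, Dict, Set

# Matches the interface name in "iface <name> ..." stanzas.
IFACE_RE = re.compile(r"^\s*iface\s+(\S+)", re.MULTILINE)


def eni_interfaces(eni_text: str) -> Set[str]:
    """Names configured in /etc/network/interfaces-style text, minus loopback."""
    return {name for name in IFACE_RE.findall(eni_text) if name != "lo"}


def netplan_interfaces(netplan_cfg: Dict[str, Any]) -> Set[str]:
    """Names configured under network.ethernets in a parsed netplan dict.

    If an entry uses set-name, the renamed interface is reported instead
    of the netplan ID.
    """
    ethernets = netplan_cfg.get("network", {}).get("ethernets", {})
    return {cfg.get("set-name", key) for key, cfg in ethernets.items()}


def doubly_configured(eni_text: str, netplan_cfg: Dict[str, Any]) -> Set[str]:
    """Interfaces that appear in both eni and netplan configuration."""
    return eni_interfaces(eni_text) & netplan_interfaces(netplan_cfg)


if __name__ == "__main__":
    eni = "auto lo\niface lo inet loopback\nauto eth0\niface eth0 inet dhcp\n"
    netplan = {"network": {"version": 2, "ethernets": {"eth0": {"dhcp4": True}}}}
    print(doubly_configured(eni, netplan))
```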
17:08 <cyphermox> rharper: so in short, I'm not opposed to improving the UX, but I'm not wowed by any solution right now (even mine)
17:09 <rharper> cyphermox: that's fair; thanks
17:09 <AnhVoMSFT> i think a fix in cloudinit might make most stakeholders happy here. It knows which configuration file it wrote, so it can definitely look for conflicting configurations
17:09 <rharper> cyphermox: AnhVoMSFT is going to file the customer bug with details and we can discuss what (if any) improvements are to be made;  I suspect cloud-init can help most here
17:09 <cyphermox> yes, I think so too
17:09 <rharper> cyphermox: thanks for the input
17:09 <AnhVoMSFT> it can't be responsible for everything the customer does though. If customer writes some my-own-netplan.yml, we can't help much
17:10 <cyphermox> rharper: but hey, if someone was to write a check when running netplan apply that there exists config in /etc/network, I wouldn't have much issues merging it
17:10 <rharper> AnhVoMSFT: right, we have several "maybe_delete_if" where we verify expected output before we remove things
17:10 <cyphermox> I just know I won't have time to look into this myself in the near future
17:10 <rharper> cyphermox: ack
17:11 <cyphermox> I think what will help most is aggressively deprecating and removing ifupdown
17:13 <cyphermox> that said, the best we can realistically do for the time being is to demote it to universe
17:13 <cyphermox> (and that's not going to change anything for UX)
17:15 <AnhVoMSFT> we had another instance of someone installing ifupdown2, which had the effect of removing cloud-init on debian/ubuntu 16.04
17:16 <AnhVoMSFT> and totally hosed his system, but that's a different issue altogether
17:27 <blackboxsw> thanks for the good discussion folks, I guess we'll just add an action item to follow up on a netplan bug for next time to see where we are at
17:31 <blackboxsw> #action follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 * meetingology follow up any bugs related to Azure/netplan uninstall in favor of ifupdown to see if cloud-init has actionable feature work to ensure proper network renderer is used
17:31 <blackboxsw> ok, I'll post minutes on this. thank you again rharper for driving
17:31 <blackboxsw> and for the participation robjo cyphermox and AnhVoMSFT
17:31 <blackboxsw> #endmeeting