00:00If you want to strip down the city from the main address here like this and
00:03determine the population live from the World Wide Web and also at the same time
00:08determine the tier based on the population, this is how you do it. First
00:12and foremost of all, this is what the table looks like all empty. The blue ones
00:16is US address and the one below in red are Canadian address. So you can see in
00:22most address the city is the second one. You can see two commas here. Also the
00:27Canadian address, the city is on the second one here. If you have taken the
00:30setup it's pretty straightforward to strip down the city. You use this formula here
00:34like this and hit enter and you can see they stripped down the New York for you
00:39and if I apply to the rest of them you'll work the same way like that. There we go
00:44and to get population you have to use a function called import XML and the way to
00:52do it is that let's say import XML like that. The first argument on import XML is
00:59a URL of where you're gonna get those population from and the second one is
01:04called expat query. We're gonna leave the locale as it is and not gonna even enter
01:09them. So let me give you a quick introduction. The way I'm
01:12getting the population is from this web page called population city
01:16population.de and let's say for example if you want a US pop cities population
01:22this is New Jersey New York so if you want a particular text from this web
01:27page this is basic web scraping right what you do is that you highlight this
01:31text here like this and you go right click and you inspect. Now this is a
01:36Google Chrome Firefox is a bit different but the idea is the same I think they use
01:40the same term as inspect. Once that's done this this section of the code will be
01:44highlighted under elements here. If you right click in here you go copy and then
01:49you click on copy full path and that's what you need for the expat. So and also
01:55the thing to note is that the way that the URL is formed is like that. So what
02:01you're gonna have to do is that in this in this particular website they use city.deen
02:08US Metro New York slash that's the URL and we do have New York over here so we're
02:14gonna have to make everything lowercase and strip it down. So basically this is
02:18what the first portion of your import URL looks like like that. So basically
02:26that's a URL and you're taking the city and you are getting rid of the spaces and
02:33then after that trimming front and back of any white spaces and changing
02:39everything to lowercase and concatenating them. And then I did say that the that
02:45the expat would be something like that. See is this this thing here. This is where
02:53it looks like if you copy I mean this is where this came from was let me hit enter
02:58here and it'll give you the population of that. So where this came from was this if
03:02you go this and you go right click and you go copy and you go full path and if I
03:07were to open my notepad here let me get my notepad here and if I were to paste it
03:13see this is what it looks like. So that's what you need for your expat there. So
03:18that's for US and for Canada you're gonna have to change this to CA and this expat
03:24it's gonna be a bit different as well. Let's do this here for now and then to
03:29determine the tier I'm going to use population determine tier. So this is a
03:34formula I'm using for tier. So basically 10 million is going to be called tier 1
03:385 million tier 2 and 1 million tier 3 and 4 million for anything else. So that's how
03:45I determine the tier and then if you take this and apply to all the US address
03:50you get this population from that web page and also the tier as well as you go
03:56along. For the Canadian city, city extraction is similar same formula I'm
04:01going to copy the same formula and paste it in here and it works for all of them
04:05because the structure is same the cities at the center here but the but the what
04:11you call that the equation the formula to calculate or the formula to get the
04:14population is going to be slightly different. If you look at this web page
04:18here like I said citypopulation.deen this is Canada previously was US so
04:25that's something you have to change. Metro followed by city is the same and this
04:30one here is your XPath. This is the population in 2010. You will right click
04:34and you go inspect you will highlight this thing here for you. This is the number
04:39that you want. Right click here and what you do is you hit copy full path again and
04:44if you look at your this is this one was for US. Let me give a couple of space here and
04:50if you press Ctrl V or paste it this is what it looks like for Canadian. So what
04:57you do is this so go back to your Google sheet here I'm going to copy this
05:02formula here. As you can see I'm going to use the same formula called import XML
05:08HTTPS population city as you can see this is called Canada Metro and basically doing
05:14the same thing getting rid of space between if that's any of the city has two
05:20words in it trimming it which means removing white spaces from back and
05:24front and changing it to lower place basically forming the the URL itself and
05:29then I'm going to copy this this what I got it for expat here and then that's my
05:36second argument or my import XML and I'm going to close it with parentheses and hit
05:40enter and that didn't work at all so let me have a quick look at it what happened
05:44here is this double click here you got to put the expat query in double
05:52code don't forget that and hit enter and you'll get your total population from
05:58that web page there now this is so six million mainly because this is this is
06:03Calgary what they want here was Toronto so if you take this and paste it to
06:09Calgary you will see 1.481 million is from here okay so gonna apply the same
06:17formula to rest of the Canadian city here like this and I'm gonna keep the tier
06:23calculation the same likewise if you want to change let's say make it a bit
06:28smaller for a Canadian city maybe call it what 1 million for tier 1 and half a
06:35million for tier 2 and then 100,000 for tier 3 and so on you can change it that
06:42way too right it depends on how you want to categorize each city well that's how I
06:48do it basically city population from the World Wide Web and tier calculation
Comments