【SQL】件数10倍超ブロックの緯度経度中央値(Median lat/long of blocks where the number of crimes more than 10-fold)

発生件数5倍以上は17件、いっそ10倍以上になってるブロックについて可視化したい。「あのへん近付かないほうがいいか」と。しかしブロック基準で地図に落とし込むとすると、各ブロックの緯度経度を集約しなきゃいけない。同じ道路でも距離があったり細かく入力されていたりすると、歩道の対面でも緯度経度変わるから、道路の中央値を取ってまとめたいところ。


  1. 発生件数10倍以上のブロックを探す
  2. ブロックの位置情報一覧を出す
  3. ブロックごとの緯度経度中央値を出す


1.発生件数10倍以上のブロックを探す

【SQL】前回続き:メーン部分のみ変更

SELECT block, count1014, count1519, 
 ROUND(( count1519 / count1014 ),1) AS increase_rate
FROM count201014 JOIN count201519 USING(block)
WHERE ( count1519 / count1014 ) > 10
ORDER BY ( count1519 / count1014 ) DESC

【結果】

近付きたくないとこ。1位は前半が1件だから、どっちかって2行目のほうが危なそう。いやそれでも多いか。


2.ブロックの位置情報一覧を出す

1.をFROM内サブクエリで元データとJOINします。

【SQL】

SELECT
DISTINCT block, latitude, longitude, increase_rate
FROM
 `angular-radar-411310.chicago_crime_data.crime` JOIN
 (SELECT block AS rate10over, count1014, count1519, ROUND(( count1519 / count1014 ),1) AS increase_rate
 FROM count201014 JOIN count201519 USING(block)
 WHERE ( count1519 / count1014 ) > 10 ) AS rate10
 ON block = rate10.rate10over
WHERE latitude IS NOT NULL
ORDER BY increase_rate DESC

【結果】

8ブロックに対して異なる位置情報が234件ある。このままじゃ集約できないので、緯度経度の平均値をブロックの代表値としてとります。


3.ブロックごとの緯度経度中央値を出す

平均値をとるので、小数点以下を揃えます。緯度経度の小数点以下桁数、精度について。

小数点以下4桁での精度(誤差)は、赤道上で11.132メートル。小数点以下5桁での値は、赤道で1.1132メートルとなる。(Wikipedia:十進角)

6桁にしよかな。2.のSQLをFROMにして包みます。(ひとつで書こうとすると、こうやって長くなっていくんだな)


【SQL】

SELECT
 block,
 ROUND(AVG(latitude),6) AS latitude,
 ROUND(AVG(longitude),6) AS longitude,
 MAX(increase_rate) AS rate10
FROM(
 SELECT DISTINCT block, latitude, longitude, increase_rate
  FROM
  `angular-radar-411310.chicago_crime_data.crime` JOIN
  (SELECT block AS rate10over, count1014, count1519,
   ROUND(( count1519 / count1014 ),1) AS increase_rate

   FROM count201014 JOIN count201519 USING(block)

   WHERE ( count1519 / count1014 ) > 10 ) AS rate10

   ON block = rate10.rate10over

  WHERE latitude IS NOT NULL )

GROUP BY block

ORDER BY rate10 DESC


【結果】

8ブロックの緯度経度平均値が出ました。これを使って地図に載せていきます。rate10(犯罪増加率)を残して、地図ポイントの大きさにします。



********


There are 17 cases with more than five times as many occurrences. I would like to visualise the blocks that are more than 10 times more frequent.

We can say, "This area is dangerous these days".

However, if the map is based on blocks, the latitude and longitude of each block must be aggregated. If the same road has a distance or is detailed, the latitude and longitude will change even if the footpath is facing each other, so I want to take the median and aggregate it.


  1. Find blocks with more than 10 times the number of occurrences
  2. Generate a list of block locations
  3. Produce the mean latitude and longitude of each block


1.Find blocks with more than 10 times the number of occurrences

【SQL】 Continued from previous issue: only the main part was changed.

SELECT block, count1014, count1519,
 ROUND(( count1519 / count1014 ),1) AS increase_rate
FROM count201014 JOIN count201519 USING(block)
WHERE ( count1519 / count1014 ) > 10
ORDER BY ( count1519 / count1014 ) DESC

Results

Places you don't want to go near.The first place is one in the first half, so if anything, the second place in the second line looks dangerous. No, but that's still a lot.


2. Generate a list of block locations

JOIN 1. with the original data in a subquery in FROM.

【SQL】

SELECT
DISTINCT block, latitude, longitude, increase_rate
FROM
 `angular-radar-411310.chicago_crime_data.crime` JOIN
 (SELECT block AS rate10over, count1014, count1519, ROUND(( count1519 / count1014 ),1) AS increase_rate
 FROM count201014 JOIN count201519 USING(block)
 WHERE ( count1519 / count1014 ) > 10 ) AS rate10
 ON block = rate10.rate10over
WHERE latitude IS NOT NULL

ORDER BY increase_rate DESC

【Results】

For eight blocks, there are 234 different location data. As this cannot be aggregated, the average value of latitude and longitude is taken as the representative value of the block.


3.Produce the mean latitude and longitude of each block

The decimal point is aligned as the average value is taken. Number of decimals and accuracy of latitude and longitude.


The precision (error) with four decimal places is 11.132 metres at the equator. The value with five decimal places is 1.1132 metres at the equator. (Wikipedia: Decimal angles)


Let's make it six decimal places. 2. Wrap the SQL in FROM. (This is how the code gets longer.)


【SQL】

SELECT
 block,
 ROUND(AVG(latitude),6) AS latitude,
 ROUND(AVG(longitude),6) AS longitude,
 MAX(increase_rate) AS rate10
FROM(
 SELECT DISTINCT block, latitude, longitude, increase_rate
  FROM
  `angular-radar-411310.chicago_crime_data.crime` JOIN

  (SELECT block AS rate10over, count1014, count1519,

   ROUND(( count1519 / count1014 ),1) AS increase_rate

   FROM count201014 JOIN count201519 USING(block)

   WHERE ( count1519 / count1014 ) > 10 ) AS rate10

   ON block = rate10.rate10over

  WHERE latitude IS NOT NULL )

GROUP BY block

ORDER BY rate10 DESC


【Results】

The latitude and longitude average for the eight blocks are now available. I will use this to put on the map.

Leave the rate10 (percentage increase in crime) in the table and make it the size of a map point.