
Crawling POIs from OpenStreetMap

by mpv 2024. 6. 3.

This is the code that was also used in "Improving Real Estate Appraisal with POI Integration and Areal Embedding" (Sumin Han, Youngjun Park, Sonia Sabir, Jisun An, Dongman Lee), a 25-minute talk presented at the recent PAKDD DMO-Fintech Workshop.

https://github.com/SuminHan/AMMASI

The code uses the osmnx library.

import os

import geopandas as gpd
import osmnx as ox
import tqdm
from shapely.geometry import Point

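# gdf: GeoDataFrame of house locations, assumed to be loaded beforehand in EPSG:4326 (lon/lat)
x1, y1, x2, y2 = gdf.total_bounds  # (minx, miny, maxx, maxy)

# Center of the bounding box that contains every house
house_center_latitude = (y1 + y2) / 2
house_center_longitude = (x1 + x2) / 2

# Project to a metric CRS (EPSG:3310, California Albers) so distances are in meters
center_point = gpd.GeoDataFrame(geometry=[Point(house_center_longitude, house_center_latitude)], crs='epsg:4326')
center_point = center_point.to_crs('epsg:3310')
# Search radius: distance from the center to the farthest house, plus a 1 km margin
max_distance = gdf.to_crs('epsg:3310').distance(center_point.iloc[0].geometry).max() + 1000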
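The search radius above comes from a projected CRS. As a rough cross-check that needs no GIS libraries, the same idea (bounding-box center, farthest point, 1 km margin) can be sketched with the haversine formula. This is a swapped-in approximation, not the method used in the code above, and the `houses` coordinates and the `search_radius` helper are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def search_radius(points, margin_m=1000):
    """Return (center_lat, center_lon, radius_m) for a list of (lat, lon) pairs."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    # Center of the bounding box, as in the total_bounds computation
    center_lat = (min(lats) + max(lats)) / 2
    center_lon = (min(lons) + max(lons)) / 2
    # Farthest point from the center, plus a safety margin
    farthest = max(haversine_m(center_lat, center_lon, lat, lon) for lat, lon in points)
    return center_lat, center_lon, farthest + margin_m


# Hypothetical house coordinates (lat, lon), for illustration only
houses = [(34.05, -118.25), (34.10, -118.30), (34.00, -118.20)]
center_lat, center_lon, radius_m = search_radius(houses)
print(center_lat, center_lon, round(radius_m))
```

The two radii will differ slightly because a projected CRS and a spherical model distort distances differently, which is one reason the extra margin is useful.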


tag_dict_list = [
    {'amenity':'hospital'},
    {'amenity': 'university'},
    {'amenity': 'school'},
    {'amenity': 'place_of_worship'},
    {'landuse': 'cemetery'},
    {'landuse': 'commercial'},
    {'landuse': 'industrial'},
    {'landuse': 'retail'},
    {'landuse': 'railway'},
    {'leisure': 'golf_course'},
    {'leisure': 'park'},
    {'leisure': 'sports_centre'},
    {'natural': 'water'},
    {'natural': 'wood'},
    {'aeroway': 'aerodrome'}
]

# Preview the file name generated for each tag, e.g. 'amenity-hospital'
for ttag_dict in tag_dict_list:
    fname = ','.join(['-'.join(items) for items in ttag_dict.items()])
    print(fname)
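The file-name rule used in the loop can be pulled into a small function, which makes it easier to see what it produces for multi-key tag dicts. `tag_filename` and `parse_tag_filename` are hypothetical helper names, and the inverse only works under the assumption that tag keys contain no `-` or `,` and values contain no `,`.

```python
def tag_filename(tag_dict):
    """Build a file stem such as 'amenity-hospital' from an OSM tag dict."""
    return ','.join('-'.join(item) for item in tag_dict.items())


def parse_tag_filename(fname):
    """Inverse of tag_filename (assumes no '-' or ',' in keys and no ',' in values)."""
    return dict(part.split('-', 1) for part in fname.split(','))


print(tag_filename({'amenity': 'hospital'}))                  # amenity-hospital
print(tag_filename({'landuse': 'retail', 'leisure': 'park'}))  # landuse-retail,leisure-park
print(parse_tag_filename('natural-water'))                     # {'natural': 'water'}
```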
    
    
import tqdm
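# dname: name of the dataset (output subdirectory), assumed to be defined beforehand
os.makedirs(f'osm_poi/{dname}', exist_ok=True)

for ttag_dict in tqdm.tqdm(tag_dict_list):
    fname = ','.join(['-'.join(items) for items in ttag_dict.items()])
    # Skip tags that were already downloaded
    if os.path.isfile(f'osm_poi/{dname}/{fname}.geojson'):
        continue
    # Query OSM for every feature with this tag within max_distance meters of the center
    pois = ox.geometries.geometries_from_point((house_center_latitude, house_center_longitude),
                                               tags=ttag_dict,
                                               dist=max_distance)

    pois = pois.reset_index().copy()

    # Cast attribute columns to str so that list/dict values can be written to GeoJSON
    for col in pois.columns:
        if col != 'geometry':
            pois[col] = pois[col].astype(str)

    pois.to_file(f'osm_poi/{dname}/{fname}.geojson', driver='GeoJSON')
    print(fname)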
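The download loop is a "skip if the file already exists" cache, so an interrupted crawl can simply be rerun. A minimal standard-library sketch of that pattern follows. `fetch_cached` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the real OSM query so the sketch runs offline.

```python
import json
import os
import tempfile


def fetch_cached(path, fetch):
    """Return the JSON cached at `path`; call `fetch()` and save its result only on a cache miss."""
    if os.path.isfile(path):
        with open(path, encoding='utf-8') as f:
            return json.load(f)
    data = fetch()
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Write to a temporary file first so an interrupted run never leaves a partial cache file
    tmp_path = path + '.tmp'
    with open(tmp_path, 'w', encoding='utf-8') as f:
        json.dump(data, f)
    os.replace(tmp_path, path)
    return data


calls = []


def fake_fetch():
    """Hypothetical stand-in for the OSM query; records every call."""
    calls.append(1)
    return {'type': 'FeatureCollection', 'features': []}


cache_dir = tempfile.mkdtemp()
path = os.path.join(cache_dir, 'osm_poi', 'amenity-hospital.geojson')
first = fetch_cached(path, fake_fetch)
second = fetch_cached(path, fake_fetch)  # served from disk, fake_fetch is not called again
print(len(calls))
```

Writing through a temporary file is an addition to the pattern in the loop above: a plain existence check would treat a half-written file from a crashed run as a valid cache entry.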
    
    
    
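# Distance from each house to the nearest polygon of each POI type
for ttag_dict in tqdm.tqdm(tag_dict_list):
    fname = ','.join(['-'.join(items) for items in ttag_dict.items()])
    pois = gpd.read_file(f'osm_poi/{dname}/{fname}.geojson')
    # Keep only simple polygons (points, lines and multipolygons are dropped)
    pois = pois[pois.geometry.type == 'Polygon'].copy()
    if len(pois) > 0:
        # Reproject to house_gdf's CRS (assumed to be projected, e.g. EPSG:3310) so distances are in meters
        poi_union = pois.to_crs(house_gdf.crs).unary_union
        house_gdf[fname + '_dist'] = house_gdf.distance(poi_union)
    else:
        # No polygon of this type within range: fall back to 0
        house_gdf[fname + '_dist'] = 0
    print(fname, len(pois))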
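To make the `*_dist` feature concrete: it is the distance from a house to the nearest polygon of a given type, and 0 when the house lies inside one. The sketch below reimplements that in pure Python on planar coordinates (assumed to be meters in a projected CRS). It is an illustration of the geometry, not the GeoPandas implementation, and the `parks` polygons are made up.

```python
import math


def _segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project P onto AB and clamp the projection to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def _contains(polygon, px, py):
    """Ray-casting point-in-polygon test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside


def distance_to_polygons(point, polygons):
    """Distance from `point` to the nearest polygon; 0.0 if the point is inside one."""
    px, py = point
    best = math.inf
    for poly in polygons:
        if _contains(poly, px, py):
            return 0.0
        n = len(poly)
        for i in range(n):
            ax, ay = poly[i]
            bx, by = poly[(i + 1) % n]
            best = min(best, _segment_distance(px, py, ax, ay, bx, by))
    return best


# Two hypothetical square parks, coordinates in meters
parks = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(100, 0), (110, 0), (110, 10), (100, 10)],
]
print(distance_to_polygons((5, 5), parks))    # 0.0 (inside the first park)
print(distance_to_polygons((13, 14), parks))  # 5.0 (nearest corner is (10, 10))
print(distance_to_polygons((95, 5), parks))   # 5.0 (nearest edge is x = 100)
```

With no polygons at all this helper returns infinity, whereas the loop above stores 0 in that case, so a model cannot tell "no such POI nearby" apart from "the house is inside the POI". A large sentinel value or a separate indicator column may be worth considering.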

 

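Note that this code was written for osmnx==1.9.1, so it may not be compatible with more recent releases of the library. In particular, the `ox.geometries` module used here has been deprecated in favor of `ox.features`, so newer versions may need `ox.features_from_point` instead.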
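One way to cope with the version difference is to look up whichever query function the installed osmnx offers. The sketch below assumes that newer releases expose `features_from_point` at the top level and that older ones provide `geometries.geometries_from_point`, both taking a center point, `tags`, and `dist`. `resolve_point_query` is a hypothetical helper, and the two `SimpleNamespace` objects are stand-ins so the example runs without osmnx installed.

```python
from types import SimpleNamespace


def resolve_point_query(ox_module):
    """Return the 'features around a point' function offered by the given osmnx module."""
    # Newer releases: features_from_point at the top level (assumption, see lead-in)
    func = getattr(ox_module, 'features_from_point', None)
    if func is not None:
        return func
    # Older releases: the geometries module used in this post
    geometries = getattr(ox_module, 'geometries', None)
    if geometries is not None and hasattr(geometries, 'geometries_from_point'):
        return geometries.geometries_from_point
    raise AttributeError('no point-based feature query found in this osmnx version')


# Stand-ins for two osmnx versions (hypothetical objects, not the real library)
old_ox = SimpleNamespace(
    geometries=SimpleNamespace(geometries_from_point=lambda center_point, tags, dist: 'old')
)
new_ox = SimpleNamespace(features_from_point=lambda center_point, tags, dist: 'new')

print(resolve_point_query(old_ox)((37.5, 127.0), tags={'leisure': 'park'}, dist=500))  # old
print(resolve_point_query(new_ox)((37.5, 127.0), tags={'leisure': 'park'}, dist=500))  # new
```

With the real library this would be used as `query = resolve_point_query(ox)` followed by `query((lat, lon), tags=ttag_dict, dist=max_distance)`. Pinning `osmnx==1.9.1` in the environment remains the simpler option.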
