Basic analysis of the GeoLife dataset and plotting the track of user 001

This is my first time working with this dataset. My teacher asked me to do a simple task on it, so I decided to plot the movement track of user 001.

Introduction to the GeoLife dataset

This GPS trajectory dataset was collected by the GeoLife project at Microsoft Research Asia from 182 users between April 2007 and August 2012. Each trajectory is a sequence of time-stamped points, and each point contains latitude, longitude and altitude information. The dataset contains 17,621 trajectories, with a total distance of 1,292,951 kilometers and a total duration of 50,176 hours. The trajectories were recorded by different devices and have different sampling rates. For example, 91.5 percent of the trajectories are densely sampled, with a point every 1 to 5 seconds or every 5 to 10 meters. The outdoor movements recorded include not only daily routines such as going home and going to work, but also leisure and sports activities such as shopping, sightseeing, dining, hiking, and cycling. The dataset can be used in research areas such as location recommendation. Although the data is spread across more than 30 cities, most of it was recorded in Beijing.

Data format:

Each folder in this dataset stores one user’s GPS log files, which have been converted to PLT format. Each PLT file contains a single trajectory and is named after its start time. To avoid possible time zone confusion, the date/time attribute of each point uses GMT, which is different from the previous version of the dataset.
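As a rough illustration of the format, here is a minimal sketch of parsing one data line of a PLT file. It assumes the field layout I understand from the dataset's user guide (latitude, longitude, an unused 0, altitude in feet, a numeric date, a date string, a time string); the function name parse_plt_line and the sample values are hypothetical and only for illustration.

```python
from datetime import datetime, timezone

def parse_plt_line(line):
    """Parse one data line of a PLT file into a small dict (assumed layout)."""
    fields = line.strip().split(",")
    lat = float(fields[0])      # field 1: latitude in decimal degrees
    lng = float(fields[1])      # field 2: longitude in decimal degrees
    alt_ft = float(fields[3])   # field 4: altitude in feet (assumed unit)
    # fields 6 and 7: date and time strings, recorded in GMT
    when = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M:%S")
    return {
        "lat": lat,
        "lng": lng,
        "alt_ft": alt_ft,
        "time": when.replace(tzinfo=timezone.utc),
    }

# Illustrative sample line, not taken from a specific file
sample = "39.906631,116.385564,0,492,40097.5864583333,2009-10-11,14:04:30"
point = parse_plt_line(sample)
print(point["lat"], point["lng"], point["time"].isoformat())
```

Converting to float and attaching the GMT time zone at parse time keeps later steps (plotting, time differences) free of string handling.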


That covers the main contents of this dataset.

Plotting the track of user 001

Create a new project called GeoProject. My idea is that, since each point in this dataset is recorded as a latitude and a longitude, I can draw a flat plot with longitude on the x-axis and latitude on the y-axis.

# Plot a GeoLife track from its latitude/longitude points
import os
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # so that Chinese labels display correctly
plt.rcParams['axes.unicode_minus'] = False  # so that the minus sign displays correctly
lat = []  # latitude
lng = []  # longitude
# Directory that holds the trajectory files (user 003 here)
path = os.getcwd()+"\\Geolife Trajectories 1.3"+"\\Data"+"\\003"+"\\Trajectory"
# List the PLT files in that directory
# print(os.listdir(path))
plts_001 = os.scandir(path)
# Build the absolute path of each file and read it
for item in plts_001:
    path_item = path+"\\"+item.name
    with open(path_item, 'r') as fp:
        # Skip the 6 header lines, then keep every 600th point
        for line in fp.readlines()[6::600]:
            item_list = line.split(',')
            lat.append(item_list[0])
            lng.append(item_list[1])

# The values read from the file are strings; convert them to numbers
lat_new = [float(x) for x in lat]
lng_new = [float(x) for x in lng]

plt.title("003 track test")
plt.xlabel("Longitude")  # name of the x-axis
plt.ylabel("Latitude")  # name of the y-axis
plt.plot(lng_new, lat_new)  # draw the track
plt.show()  # display the figure

Although the code is short, I had not used Python for a long time, so I repeated some mistakes I had made before.

Path problem when opening a file

path = os.getcwd()+"\\Geolife Trajectories 1.3"+"\\Data"+"\\003"+"\\Trajectory"
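One possible way to sidestep backslash escaping altogether is to build the path with pathlib instead of string concatenation. This is only a sketch of an alternative, not what the code above does; the helper name trajectory_dir is hypothetical, and the folder names are the ones used in the path above.

```python
from pathlib import Path

def trajectory_dir(base, user_id):
    """Build the trajectory folder path for one user without manual separators."""
    return Path(base) / "Geolife Trajectories 1.3" / "Data" / user_id / "Trajectory"

# Path.cwd() plays the role of os.getcwd() in the original code
folder = trajectory_dir(Path.cwd(), "003")
print(folder)
```

Because the separator is chosen by pathlib, the same code should work on Windows and on other systems.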

Chinese text is garbled when plotting

plt.rcParams['font.sans-serif'] = ['SimHei']  # used to display Chinese characters correctly
plt.rcParams['axes.unicode_minus'] = False  # used to display the minus sign correctly

Confusing string and numeric types

lat_new = [float(x) for x in lat]
lng_new = [float(x) for x in lng]
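To see why the conversion matters, here is a small sketch with made-up values. Fields produced by split(',') are strings, and strings are compared character by character rather than by numeric value, so anything that orders or scales the data can go wrong until the values are converted with float.

```python
# Made-up coordinate-like values, as they would come out of split(',')
raw = ["9.5", "39.9", "116.3"]

# As strings, "116.3" is the smallest because '1' sorts before '3' and '9'
smallest_text = min(raw)

# After conversion, the comparison is numeric
nums = [float(x) for x in raw]
smallest_number = min(nums)

print(smallest_text, smallest_number)
```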

The final result looks fine


The catch is that this dataset is too large and my computer cannot handle that much data, so I used a slice to take the latitude and longitude of only every 600th point, which is how I ended up with this plot.
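The slice [6::600] does two things at once: it skips the first 6 lines of the file and then keeps every 600th line after that. A minimal sketch on synthetic lines (the function name sample_points and the fake data are assumptions for illustration):

```python
def sample_points(lines, header_lines=6, step=600):
    """Skip the header lines, then keep every `step`-th remaining line."""
    return lines[header_lines::step]

# Fake file content: 6 header lines followed by 1500 point lines
lines = [f"header {i}" for i in range(6)] + [f"point {i}" for i in range(1500)]
kept = sample_points(lines)
print(kept)
```

A smaller step keeps more points and gives a smoother track at the cost of more data to plot.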

This small example also exposed my weaknesses. I had forgotten much of what I knew about Python, about the plotting library, and about NumPy too. In short, I had forgotten a lot. From now on I should gradually pick this knowledge back up, because it will definitely be useful in the future.
