Monday, May 14, 2018

Python Implementations for Spatial Statistics

Install Pystan on Windows

Install Pystan: https://github.com/stan-dev/pystan
1. Install Visual Studio 2017 and the Visual C++ Build Tools 2015 from http://landinghub.visualstudio.com/visual-cpp-build-tools

2. Update Conda
conda update conda
conda update --all

3. Recheck the dependencies
pip install setuptools
conda install numpy cython matplotlib scipy pandas

4. Install gcc compiler components
conda install libpython
conda install -c msys2 m2w64-toolchain=5.3.0

5. Create a distutils.cfg file inside the Anaconda3\Lib\distutils folder with the following contents:
[build]
compiler = mingw32
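
The file in step 5 can also be written from Python. The following is a minimal sketch, assuming you pass in the path of your own Anaconda3\Lib\distutils folder; the function name write_distutils_cfg is made up for illustration.

```python
import configparser
from pathlib import Path


def write_distutils_cfg(distutils_dir):
    """Write a distutils.cfg that selects the MinGW compiler; return its path."""
    config = configparser.ConfigParser()
    # Same content as the hand-written file: a [build] section with one key.
    config["build"] = {"compiler": "mingw32"}
    cfg_path = Path(distutils_dir) / "distutils.cfg"
    with cfg_path.open("w") as handle:
        config.write(handle)
    return cfg_path


# Example call (hypothetical path, adjust to your installation):
# write_distutils_cfg(r"C:\Anaconda3\Lib\distutils")
```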

6. Install Git from https://git-scm.com/downloads and clone the PyStan repository
git clone --recursive https://github.com/stan-dev/pystan.git

7. Compile from the source code
python setup.py build --compiler=mingw32
python setup.py install
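
Once the build finishes, a quick smoke test helps confirm that the compiler toolchain really works. The sketch below assumes the PyStan 2 interface (pystan.StanModel and its sampling method); the model, the data values, and the function name smoke_test are made up for illustration.

```python
# A tiny normal model, used only to check that compilation and sampling run.
MODEL_CODE = """
data {
  int<lower=1> N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}
"""

# Made-up observations for the smoke test.
DATA = {"N": 5, "y": [1.2, 0.8, 1.1, 0.9, 1.0]}


def smoke_test():
    """Compile and sample the toy model; return the posterior mean of mu."""
    try:
        import pystan  # assumes PyStan 2.x
    except ImportError:
        print("pystan cannot be imported; the installation did not succeed")
        return None
    model = pystan.StanModel(model_code=MODEL_CODE)  # triggers the C++ build
    fit = model.sampling(data=DATA, iter=500, chains=2)
    return fit.extract()["mu"].mean()
```

Calling smoke_test() in ipython compiles the model first, so the first run can take a while.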

P.S. Solution for the error: Cannot build msvcr library: "vcruntime140d.dll" not found

Copy vcruntime140d.dll from C:\Windows\System32 to any folder that is listed in the PATH system variable
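
To see whether the DLL is already reachable, you can scan the folders listed in PATH. This is a minimal sketch; the helper name find_on_path is made up for illustration.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return the folders listed in PATH that contain the given file."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        # Skip empty entries produced by stray separators.
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # An empty list means the DLL still has to be copied to a PATH folder.
    print(find_on_path("vcruntime140d.dll"))
```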

Friday, June 17, 2016

Set up AutoRoute on Windows

AutoRoute on Windows
·        Install the MSVC build downloaded from http://www.gisinternals.com/release.php. The following steps determine which version to download.
o   Open Windows Command Processor by typing in cmd
o   Type in “ipython”

o   Find out which compiler you need from the startup banner, which shows the MSC version and whether Python is 32 bit or 64 bit
·        System variables:
o   Edit the Path system variable: add C:\Program Files\GDAL
o   Add two system variables: (1) GDAL_DATA = C:\Program Files\GDAL\gdal-data (2) GDAL_DRIVER_PATH = C:\Program Files\GDAL\gdalplugins
·        Change to the directory where the GDAL wheel file is saved with cd “path”
·        Install the GDAL package with “pip install GDAL-2.0.2-cp27-none-win_amd64.whl”
·        Enter “from osgeo import gdal” in the ipython console to check the installation
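
Before installing the wheel, it can help to compare its filename tags with the running interpreter. The sketch below assumes the standard wheel naming scheme (distribution-version-python tag-ABI tag-platform tag); the function names are made up for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
import platform
import struct
import sys


def parse_wheel_name(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-4].split("-")
    # Five parts, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


def interpreter_summary():
    """Describe the running interpreter in the same terms as the wheel tags."""
    return {
        "python_tag": "cp{}{}".format(*sys.version_info[:2]),
        "bits": struct.calcsize("P") * 8,  # pointer size: 32 or 64
        "compiler": platform.python_compiler(),  # same text as the banner
    }


if __name__ == "__main__":
    print(parse_wheel_name("GDAL-2.0.2-cp27-none-win_amd64.whl"))
    print(interpreter_summary())
```

If the python_tag values differ, or the wheel is win_amd64 while the interpreter reports 32 bits, pip will refuse the wheel.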
3.  RAPIDpy package: https://github.com/erdc-cm/RAPIDpy
·        Download ZIP
·        Unzip the file
·        Change to the directory where the unzipped files are saved
·        Type in “python setup.py install”

4. optcomplete 1.2-devel package: https://pypi.python.org/pypi/optcomplete/1.2-devel


6. Check the installation


Tuesday, June 2, 2015

How to set up a PC for geographers (continuously updated)

A. Linkage between ArcGIS and Anaconda: Using ArcPy in Anaconda

1. Install ArcGIS, including ArcGIS for Desktop and ArcGIS for Desktop Background Geoprocessing (64 bit) (required, as large data processing is much more common)

2. Install Anaconda:
Add Anaconda to my PATH environment variable: convenient for starting Anaconda from the Windows Command Processor with the command ipython
Register Anaconda as my default Python 2.7: all Python packages will be saved in the folder where Anaconda is located, separate from the Python for ArcGIS. In my opinion, this avoids a direct dependency on commercial software and is easy to maintain.

3. Copy the link file of ArcPy to the Anaconda folder
Copy DTBGGP64.pth from C:\Python27\ArcGISx6410.2\Lib\site-packages (depending on where your Python for ArcGIS is installed)
Save DTBGGP64.pth under C:\Anaconda\Lib\site-packages (depending on where your Anaconda is installed)
Test whether ArcPy works in Anaconda
(1) Type cmd in the Start menu to open the Windows Command Processor
(2) Type ipython to open Anaconda
(3) import arcpy => it may take approximately five seconds to load arcpy
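
The copy and the import test can be scripted as well. This is a minimal sketch that runs on Python 2.7 and Python 3; the function names link_arcpy and arcpy_available are made up for illustration, and both folder paths must be adjusted to your own installation.

```python
import os
import shutil


def link_arcpy(arcgis_site_packages, anaconda_site_packages,
               pth_name="DTBGGP64.pth"):
    """Copy the ArcPy .pth link file into Anaconda's site-packages folder."""
    source = os.path.join(arcgis_site_packages, pth_name)
    if not os.path.isfile(source):
        raise IOError("cannot find " + source)
    target = os.path.join(anaconda_site_packages, pth_name)
    shutil.copyfile(source, target)
    return target


def arcpy_available():
    """Return True when arcpy can be imported by this interpreter."""
    try:
        import arcpy  # noqa: F401  (loading can take a few seconds)
    except ImportError:
        return False
    return True


# Example call with the folders used above (adjust to your installation):
# link_arcpy(r"C:\Python27\ArcGISx6410.2\Lib\site-packages",
#            r"C:\Anaconda\Lib\site-packages")
```

The .pth file is read when Python starts, so run arcpy_available() in a new ipython session after copying.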

B. Linkage between R and Anaconda: Using R in Anaconda
1. Install R and Anaconda separately: the 64 bit versions are recommended, as large data processing is much more common
R 64 bit: select the 64 bit version during installation
Anaconda 64 bit: select the Windows 64-Bit Python 2.7 Graphical Installer when downloading

2. Install the rpy2 and singledispatch packages
Download the packages from Unofficial Windows Binaries for Python Extension Packages (http://www.lfd.uci.edu/~gohlke/pythonlibs/)
Install the packages within the Windows Command Processor
(1) Change to the folder where the rpy2 and singledispatch packages are saved, e.g. cd /d c:/ (or any other folder)
(2) Install the packages with the command pip install rpy2-2.5.6-cp27-none-win_amd64.whl & pip install singledispatch-3.4.0.3-py2.py3-none-any.whl
  
3. Add Paths
Open System Properties with a right mouse click
Add two system variables: R_HOME points to the location where R is installed, while R_USER gives the user name (Control Panel -> User Accounts and Family Safety -> User Accounts)
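
A quick check from Python shows whether both variables are visible before rpy2 is imported. This is a minimal sketch that runs on Python 2.7 and Python 3; the function name check_r_variables is made up for illustration.

```python
import os


def check_r_variables(environ=None):
    """Return a list of problems found with R_HOME and R_USER."""
    if environ is None:
        environ = os.environ
    problems = []
    r_home = environ.get("R_HOME")
    if not r_home:
        problems.append("R_HOME is not set")
    elif not os.path.isdir(r_home):
        problems.append("R_HOME does not point to an existing folder: " + r_home)
    if not environ.get("R_USER"):
        problems.append("R_USER is not set")
    return problems


if __name__ == "__main__":
    # An empty list means both variables look usable.
    print(check_r_variables())
```

Open a new command window after editing the system variables, because running sessions keep the old values.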
 
4. Test whether rpy2 works without any error message by entering the following code
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
import rpy2.robjects.numpy2ri
rpy2.robjects.numpy2ri.activate()

C. Install packages in R
install.packages('ctv')
library('ctv')
install.views('Spatial')
install.views('SpatioTemporal')

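D. Install the PostGIS TIGER geocoder
1. Create the database and the extensions (connect to the geocoder database before running the CREATE EXTENSION statements, because extensions are installed per database)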
CREATE DATABASE geocoder;
CREATE EXTENSION postgis;
CREATE EXTENSION address_standardizer;
CREATE EXTENSION fuzzystrmatch;
CREATE EXTENSION postgis_topology;
CREATE EXTENSION postgis_tiger_geocoder;

2.  Prepare files
Download postgis-2.2.0dev from http://postgis.net/stuff/postgis-2.2.0dev.tar.gz
Download Wget and its dependencies from http://gnuwin32.sourceforge.net/packages/wget.htm
-> extract the files and put them under C:/
-> move the wget file from the bin folder to C:/wget
-> put the dependencies (four dll files) under C:/wget
Install 7zip from http://www.7-zip.org/download.html

3.  Follow the steps in the README file (under ~ postgis-2.2.0dev/extras/tiger_geocoder)
Format output: " IF YOU ARE AT A PSQL PROMPT, FIRST RUN "\a", "\t", AND "\o script.xxx".  THIS WILL MAKE YOUR OUTPUT UNALIGNED AND REDIRECT IT TO script.xxx.  WITHOUT "\a" and "\t", THE SCRIPT WILL HAVE EXTRA WHITESPACE AND POSSIBLY NON-SCRIPT CHARACTERS THAT CAUSE IT TO BREAK."
Generate the script that downloads the county and state tables and shapes at the national level with the command: SELECT loader_generate_nation_script('windows');
Generate the script that downloads the other shapefiles at the state level with the command: SELECT loader_generate_script(ARRAY['DC','RI'], 'windows');
=> the array can be changed to any state abbreviations
=> copy each output into a text editor
=> save the two text outputs as .bat files under C:/ (they should be under the root, or the path will be too long to download the shapefiles)
=> run the .bat files
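
Since the state list is the only part of the command that changes, the statement can be assembled from Python. This is a minimal sketch; the function name loader_sql is made up for illustration, and the resulting string still has to be run in psql as described above.

```python
import re


def loader_sql(states, os_name="windows"):
    """Build the loader_generate_script statement for the given states."""
    if os_name not in ("windows", "sh"):
        raise ValueError("os_name must be 'windows' or 'sh'")
    quoted = []
    for state in states:
        code = state.strip().upper()
        # Accept only two-letter abbreviations such as DC or RI.
        if not re.match(r"^[A-Z]{2}$", code):
            raise ValueError("not a state abbreviation: " + repr(state))
        quoted.append("'{}'".format(code))
    return "SELECT loader_generate_script(ARRAY[{}], '{}');".format(
        ", ".join(quoted), os_name)


if __name__ == "__main__":
    print(loader_sql(["DC", "RI"]))
```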

4.  Test the geocoding function
Road:
SELECT g.rating,
       ST_SRID(ST_Transform(g.geomout, 4326)) AS srid_wgs84,
       ST_X(ST_Transform(g.geomout, 4326)) AS lon_wgs84,
       ST_Y(ST_Transform(g.geomout, 4326)) AS lat_wgs84,
       ST_SRID(g.geomout) AS srid_nad83,
       ST_X(g.geomout) AS lon_nad83,
       ST_Y(g.geomout) AS lat_nad83,
       (addy).address AS stno,
       (addy).streetname AS street,
       (addy).streettypeabbrev AS styp,
       (addy).location AS city,
       (addy).stateabbrev AS st,
       (addy).zip
FROM geocode('1701 S ELM, Norman OK') AS g
ORDER BY g.rating;
Road Intersection:
SELECT pprint_addy(addy), ST_AsText(geomout), rating
FROM geocode_intersection('ANN ARBOR', 'BERRY RD', 'OK', 'Norman');

Friday, December 20, 2013

Curriculum Vitae

Yan-Ting (Vicky) Liau
yliau1@hotmail.com 480-2890869 https://www.linkedin.com/in/vicky-liau/ Richardson, Texas 75080
Ph.D. Candidate in Geospatial Information Sciences
Data Developer/Scientist with 10+ years of experience specializing in spatiotemporal data collection, pre-processing, management, analysis, and prediction. My greatest strength is data awareness, which has enabled advanced algorithm development for datasets with over 90% missing data in multiple applications. Seeking to apply my analysis and prediction abilities as a Data Scientist and contribute to the team.
CORE COMPETENCIES
·  Programming: Python, R, SQL, NetLogo with BehaviorSearch, Matlab, etc.
·  Statistical Modeling: Generalized Linear Models, Hierarchical Model, Multiple Imputation, Eigenvector Spatial Filtering Model, Kriging, Spatial Lag Model and Spatial Error Model, Geographically Weighted Regression, Bayesian Analysis, Time Series Analysis, Principal Component Analysis, Factor Analysis, Stepwise Regression/LASSO/Ridge/Elastic-Net, etc.
·  Image Processing: PCI Geomatica, Erdas Imagine, eCognition, ENVI, IDRISI Kilimanjaro, etc.
·  Geographical Information Systems: ESRI ArcGIS, ArcGIS Pro, QGIS (with PyQt)
·  Machine Learning Algorithms: Decision Tree, Random Forest, Neural Network Models, Support Vector Machines, etc.
·  Database: AWS DynamoDB, S3 (Simple Storage Service), RDS (Relational Database Service), Redshift
·  Data Analysis Tools: (1) AWS: Kinesis, Lambda, SageMaker, Forecast; (2) Python: TensorFlow, PyTorch, PyTorch Geometric, Theano, Scikit-Learn, Scikit-Image, Statsmodels, PySAL; (3) R: Glmnet, Spatstat, Tseries, Spdep, Gstat, etc.
EXPERIENCES
Graduate Teaching Assistant                                               September 2015 – Present
The University of Texas at Dallas, Richardson, Texas
·  Led and coordinated 2 lab members and won high appreciation from the National Geospatial-Intelligence Agency on GIS Day
·  Motivated students to learn new skills and cultivated self-learning ability in 5 of 18 students with below-average capability
·  Collected fine-resolution datasets by convincing 3 agencies, instead of using synthetic data, to conduct experiments
·  Critiqued the factors that introduce the most biased results when using spatial imputations
·  Developed pioneering assessments to evaluate the influence of spatially imputed values on regression analysis
·  Identified the limitations of using imputations in a response variable, in covariates, and in both types of variables with the least risk
·  Proposed algorithms that improve spatial imputation methodology to deal with over 90% missing spatial data
National Water Center Research Fellow                                   June 2016 – July 2016
NOAA: National Oceanic & Atmospheric Administration, Tuscaloosa, Alabama Area
·  Formulated a new moisture index for detecting flooding extents from images with a 5% improvement
·  Applied the moisture indices and filters to improve segmentation and classification of flooding extents by 10%
Graduate Research Associate for National Institute of Justice Project    April 2014 – September 2014
Center for Spatial Analysis, University of Oklahoma, Norman, Oklahoma
·  Devised a new geocoding method with a 20% higher match rate and 80% less time than the most widely used ESRI geocoder
·  Investigated similarity measures for detecting crime patterns in real-time
EDUCATION
Ph.D. in Geospatial Information Sciences                     August 2015 – May 2020 (expected)
The University of Texas at Dallas, Richardson, Texas
Title: Use of Spatially Imputed Variables: Three Papers Addressing Implications of Imputation-based Measurement Error in Spatial Regression Analysis
Master of Arts in Geography                                   August 2011 – December 2013
Arizona State University, Tempe, Arizona
Title: Evaluation of Hierarchical Segmentation for Natural Vegetation: A Case Study of the Tehachapi Mountains, California
·  Identified the optimal segmentation for individual trees over the most widely used software, eCognition
·  Published independently in the peer-reviewed journal Remote Sensing (IF = 4.118 in 2018)
Master of Arts in Geography                                     September 2008 – June 2010
National Taiwan Normal University, Taipei, Taiwan
Title: Frontier Exploitation and Environmental Impacts: Case Study of Environment Changes in Teh-Chi Reservoir, Taiwan
·  Collected spatiotemporal data with over 400 citations across 50 years in order to handle predictions with large missing data
·  Pioneered insights into the drivers of environmental changes to support predictions of long-term landscape changes
Bachelor of Arts in Geography                                   September 2003 – June 2007
National Taiwan Normal University, Taipei, Taiwan