This repository contains all the scripts used for the Python class for JRFs at IITM.
Some tutorials and resources to follow:
- http://pure.iiasa.ac.at/id/eprint/14952/1/xarray-tutorial-egu2017-answers.pdf
- https://rabernat.github.io/research_computing/xarray.html
- Ocean Data Analysis https://currents.soest.hawaii.edu/ocn_data_analysis/exercise_data.html#id1
- Parallelization http://xarray.pydata.org/en/stable/dask.html
- Satellite Data Analysis https://github.com/nansencenter/nansat-lectures
- https://github.com/NCAR/CESM_postprocessing CESM Postprocessing
- https://github.com/NCAR/PyCect This repo is used to compare the results of a set of new CAM simulations against the accepted ensemble
- https://github.com/nichannah/ocean-regrid Regrid ocean reanalysis data from normal to tripolar grids
- https://github.com/jswhit/gfstonc Read GFS sigma and sfc files in python
- f2py
- Pandas
- https://github.com/tmiyachi/data2gfs Make python version of this using f2py
- Shallow water equation model using pyspharm https://github.com/jswhit/pyspharm and https://www.aosc.umd.edu/~dkleist/docs/shtns/doc/html/shallow_water_8py-example.html explaining the code
- Scientific Computing Lectures https://github.com/jrjohansson/scientific-python-lectures
- Geopandas satellite data analysis https://towardsdatascience.com/satellite-imagery-access-and-analysis-in-python-jupyter-notebooks-387971ece84b
- Rasterio https://medium.com/analytics-vidhya/satellite-imagery-analysis-with-python-3f8ccf8a7c32
- Eo-learn https://medium.com/dataseries/satellite-imagery-analysis-with-python-ii-8001e5c41a52
- Satpy
- Use of Landsat and Sentinel datasets
- Pyunicorn
- Keras, tensorflow, pytorch, django, theano, scikit-learn, bokeh, pandas, seaborn, plotly, scrapy
- Python tutorial https://carpentrieslab.github.io/python-aos-lesson/ plotting CMIP data - highlight
- Python for oceanography http://www.soest.hawaii.edu/oceanography/courses/OCN681/python.html
- Python tools for oceanography https://pyoceans.github.io/sea-py/
- Python Land Surface Modelling https://www.geosci-model-dev.net/12/2781/2019/
- Python hydrology tools https://github.com/raoulcollenteur/Python-Hydrology-Tools
- Docker
- Python and GIS https://automating-gis-processes.github.io/CSC18/lessons/L1/overview.html
- https://automating-gis-processes.github.io/2016/
- https://geohackweek.github.io/raster/
- https://github.com/pangeo-data/pangeo
- https://github.com/pangeo-data/awesome-open-climate-science
- https://uwescience.github.io/sat-image-analysis/resources.html
- Radar data analysis https://data.world/datasets/radar https://arm-doe.github.io/pyart/ https://docs.wradlib.org/
- https://www.earthdatascience.org/courses/use-data-open-source-python/multispectral-remote-sensing/landsat-in-Python/
- Deep Learning on Satellite Imagery https://github.com/robmarkcole/satellite-image-deep-learning
- Google Earth Engine https://sites.google.com/view/eeindia-advanced-summit/summit-resources
- https://geohackweek.github.io/GEE-Python-API/
- https://github.com/google/earthengine-api/tree/master/python/examples/ipynb
- http://www.jerico-ri.eu/download/summer%20school%20-%20the%20netherlands/Genna%20Donchyts%20-%20GEE%20Training.pdf
- https://www.earthdatascience.org/tutorials/intro-google-earth-engine-python-api/
- Installing Google Earth Engine and requesting access https://github.com/google/earthengine-api/issues/27
- https://github.com/giswqs/earthengine-py-notebooks
- Google Earth Engine image to numpy https://mygeoblog.com/2019/08/21/google-earth-engine-to-numpy/
- Stippling to show statistical significance bradyrx/esmtools#13
- Resampling from swath to grid https://github.com/TerraFusion/pytaf
- Making a docker container for data science https://towardsdatascience.com/docker-for-data-scientists-5732501f0ba4
- Docker commands:
Run interactively: docker run -it manmeet3591/dl_iitm:latest
Install the necessary libraries
Open a new terminal, run docker images to find the image id, and then run the following commands
$ docker tag id_ manmeet3591/dl_iitm:v2
$ docker push manmeet3591/dl_iitm:v2
Projects for the class
https://docs.google.com/spreadsheets/d/1m2ZIJ_To8IbE18Teb70a7BVZg0o29sOM6rlgFkE2b3E/edit#gid=0
https://docs.google.com/document/d/12h9bcIdBPJUFc_fJssJe8hVBzledq2Dtk5-9OpKHbfg/edit
Homogeneous regions India shape files: https://github.com/Cassimsannan/Shapefiles
Download CMIP6 data: https://github.com/TaufiqHassan/acccmip6
Download MSWEP data from Google Drive:
Set up rclone: https://www.youtube.com/watch?v=vPs9K_VC-lg
$ rclone sync -v --exclude 3hourly/ --drive-shared-with-me GoogleDrive:/MSWEP_V280 /lus/dal/cccr_rnd/manmeet/AI_IITM/WeatherBench/data/dataserv.ub.tum.de/mswep/.
- Run jupyter notebook from docker container
docker run --rm -it --entrypoint bash -p 8891:8891 manmeet3591/tensortrade
Inside the container jupyter-notebook --ip 0.0.0.0 --port=8891 --no-browser --allow-root &
In the browser http://localhost:8891/
- Create any number of subplots matplotlib
fig, ax = plt.subplots(ncols=2, nrows=4, figsize=(11.69, 8.27), subplot_kw={'projection': ccrs.PlateCarree()})
Google Earth Engine timelapse gif generator: https://9611d0317f71.ngrok.io/voila/render/timelapse.ipynb
Handling expver dimension in a netcdf file downloaded as ERA5 data
ds.reduce(np.nansum, 'expver') (solution from Marco Venturini: https://confluence.ecmwf.int/pages/viewpage.action?pageId=173385064)
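A minimal sketch of why the nansum reduction works, assuming the two expver slices hold values on complementary time steps and NaN elsewhere. The variable name `t2m`, the expver labels and the values are synthetic and only for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a download with two expver slices:
# each time step is filled in exactly one slice and NaN in the other.
t2m = np.array(
    [
        [280.0, 281.0, np.nan, np.nan],
        [np.nan, np.nan, 282.0, 283.0],
    ]
)
ds = xr.Dataset(
    {"t2m": (("expver", "time"), t2m)},
    coords={"expver": [1, 5], "time": np.arange(4)},
)

# Collapse the expver dimension, ignoring NaNs in either slice.
merged = ds.reduce(np.nansum, "expver")
print(merged.t2m.values)
```

Note that np.nansum returns 0 where every slice is NaN, so points that are missing in both slices come out as 0 rather than NaN.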
GeoTIFF to netcdf and exporting data from Google Earth Engine https://medium.com/@wenzhao.li1989/nco-translate-geotiff-files-exported-from-gee-to-a-netcdf-file-with-correct-time-dimension-ce97a8f3043f
Make pipeline to avoid test data leaking into train https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html
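A short sketch of the idea, assuming a scaler followed by a regressor. The synthetic data, the step names `scale` and `model`, and the choice of Ridge are illustrative assumptions, not part of the class material.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge(alpha=1.0))])

# fit() learns the scaling statistics from the training split only.
pipe.fit(X_train, y_train)

# score() only transforms the test split with those statistics; nothing is refitted.
score = pipe.score(X_test, y_test)
print(score)
```

Because the scaler lives inside the pipeline, the same object can be passed to cross-validation utilities and the scaling is refitted on each training fold.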
Potential evapotranspiration (PET) from netcdf file https://climate-indices.readthedocs.io/en/latest/#
t-distributed Stochastic Neighbourhood Embedding (tSNE) versus PCA https://stats.stackexchange.com/questions/238538/are-there-cases-where-pca-is-more-suitable-than-t-sne
Stationarity of time series: https://towardsdatascience.com/stationarity-in-time-series-analysis-90c94f27322
SARIMAX model: https://towardsdatascience.com/end-to-end-time-series-analysis-and-forecasting-a-trio-of-sarimax-lstm-and-prophet-part-1-306367e57db8
Prevent kaggle from disconnecting https://stackoverflow.com/questions/57113226/how-to-prevent-google-colab-from-disconnecting
function ClickConnect(){console.log("Working"); document.querySelector("colab-toolbar-button#connect").click() } setInterval(ClickConnect,60000)
Solving NVIDIA driver installation issues https://stackoverflow.com/questions/42984743/nvidia-smi-has-failed-because-it-couldnt-communicate-with-the-nvidia-driver/51113428#51113428
Installing NVIDIA drivers https://www.itzgeek.com/post/how-to-install-nvidia-drivers-on-ubuntu-20-04-ubuntu-18-04.html
Install cuda https://www.tensorflow.org/install/gpu
bashrc commands for cuda
export CUDA_HOME=/usr/local/cuda-11.0
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:/usr/local/cuda-11.0/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.0/bin:$PATH
sudo ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11 # https://stackoverflow.com/questions/63199164/how-to-install-libcusolver-so-11
echo | sudo -S ln -s /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.10 /usr/local/cuda-11.0/targets/x86_64-linux/lib/libcusolver.so.11
Test time augmentation https://towardsdatascience.com/test-time-augmentation-tta-and-how-to-perform-it-with-keras-4ac19b67fb4d
Geometric Deep Learning https://geometricdeeplearning.com/lectures/
DeepSphere Spherical convolutions using Graph convolutions https://github.com/deepsphere/deepsphere-pytorch
Interactively logging into Pratyush GPU
$ qsub -I -l select=1:ncpus=1:naccelerators=1:accelerator_model="Tesla_P100-PCIE-12GB" -q gpu
$ source activate py36
$ module load cudatoolkit
$ aprun -n 1 jupyter-notebook --no-browser --ip=0.0.0.0 --port=8890 >> NOTEBOOK_LOGFILE1 2>&1
$ tail -f NOTEBOOK_LOGFILE1
$ Ctrl+C
$ ssh -N -f -L localhost:8888:node:8890 cccr_rnd@nid00019 (Here nid should be the one as seen from NOTEBOOK_LOGFILE1)
$ source activate py36
$ module load cudatoolkit
$ firefox&
To check if a port is being used
$ netstat -antp | grep -i port_id
The above statement will only work in the interactive login node.
Now start the notebook by noting the link from NOTEBOOK_LOGFILE1
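The port check can also be done from Python when netstat is not available. This is a rough sketch using only the standard library; the helper name `port_in_use` is hypothetical, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8890 is the notebook port used in the steps above.
    print(port_in_use(8890))
```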
Markdown tool https://dillinger.io/
LRP for explainable AI (XAI): https://github.com/albermax/innvestigate Application paper: https://arxiv.org/abs/2103.10005
Lower tropospheric stability (LTS = θ700hPa − θ1000hPa; Kelvin) is defined as the difference in potential temperature (θ) between the 700-hPa level and the surface
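A minimal sketch of the LTS calculation, assuming the standard dry-air potential temperature θ = T (p0/p)^κ with p0 = 1000 hPa and κ ≈ 0.2857. It uses the 1000-hPa form from the formula above; the function names and the sample temperatures are illustrative.

```python
KAPPA = 0.2857  # R_d / c_p for dry air (approximately 2/7)
P0 = 1000.0  # reference pressure in hPa


def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    return temp_k * (P0 / pressure_hpa) ** KAPPA


def lower_tropospheric_stability(t700_k, t1000_k):
    """LTS (K) = theta at 700 hPa minus theta at 1000 hPa."""
    return potential_temperature(t700_k, 700.0) - potential_temperature(t1000_k, 1000.0)


# Made-up temperatures at 700 hPa and 1000 hPa, in Kelvin.
print(lower_tropospheric_stability(280.0, 295.0))
```

If the surface pressure differs from 1000 hPa, the second term would use the actual surface pressure and temperature instead.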
Shallow copy and deep copy in python https://stackoverflow.com/questions/41125834/trying-to-do-a-shallow-copy-on-list-in-python
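A small example of the difference, using the standard `copy` module on a nested list:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)    # new outer list, inner lists are shared
deep = copy.deepcopy(nested)   # inner lists are copied as well

nested[0].append(99)   # mutate an inner list in place
nested.append([5, 6])  # change the outer list only

print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The shallow copy sees the change to the shared inner list but not the new outer element; the deep copy sees neither.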
Climate indices https://climate-indices.readthedocs.io/en/latest/
Copying to Box using lftp: mirror -R folder
Transforming argparse python code to jupyter notebook: https://stackoverflow.com/questions/37534440/passing-command-line-arguments-to-argv-in-jupyter-ipython-notebook
Simply add the following lines:
import sys
sys.argv = ['']
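A short sketch of how this fits with an argparse script pasted into a notebook. The options `--epochs` and `--lr` are hypothetical; the point is that `sys.argv` is overwritten before `parse_args()` runs, so argparse never sees the kernel's own arguments.

```python
import argparse
import sys

# In a notebook, sys.argv holds the kernel's arguments, which argparse rejects.
# Overwrite it with the values the script should see (first entry is the program name).
sys.argv = ["notebook", "--epochs", "5"]

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=10)  # hypothetical option
parser.add_argument("--lr", type=float, default=1e-3)  # hypothetical option
args = parser.parse_args()

print(args.epochs, args.lr)
```

An alternative that leaves `sys.argv` untouched is to pass the list directly, for example `parser.parse_args(["--epochs", "5"])`.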
- Install pytorch with cuda
- Running docker containers on NVIDIA DGX A100
$ docker pull nvcr.io/nvidia/tensorflow:20.10-tf2-py3
$ docker run --gpus all -it -v /home/cccr_rnd:/apollo nvcr.io/nvidia/tensorflow:20.10-tf2-py3
Troubleshooting
Continue in outer loop using multi-loops https://stackoverflow.com/questions/14829640/how-to-continue-in-nested-loops-in-python
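One way to do this without flags is the `for ... else` construct: the `else` branch runs only when the inner loop finishes without `break`. A minimal sketch with made-up data:

```python
rows = [[1, 2, 3], [4, -1, 6], [7, 8, 9]]
clean_rows = []

for row in rows:
    for value in row:
        if value < 0:
            break  # bad value found: stop scanning this row
    else:
        # Runs only when the inner loop did not break.
        clean_rows.append(row)
    # After a break, control falls through to the next outer iteration.

print(clean_rows)  # [[1, 2, 3], [7, 8, 9]]
```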
Numbering the subplots https://matplotlib.org/3.1.1/gallery/axes_grid1/simple_anchored_artists.html
Fortran compilation errors can sometimes be resolved by running the command ulimit -s unlimited
There are visualization problems in cartopy if the lon is from 0 to 360 and not from -180 to 180
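A common workaround is to wrap the longitudes into the -180 to 180 range before plotting. A minimal sketch of the arithmetic; the helper name `wrap_longitude` is hypothetical.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees from [0, 360) to [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


lons = [0.0, 90.0, 180.0, 270.0, 359.0]
print([wrap_longitude(lon) for lon in lons])  # [0.0, 90.0, -180.0, -90.0, -1.0]
```

For an xarray dataset with a coordinate named `lon` (an assumption about the data), the same formula is often applied as `ds.assign_coords(lon=((ds.lon + 180) % 360) - 180).sortby("lon")`, where the sort keeps the longitudes monotonic.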
Run docker as a non-root user https://docs.docker.com/engine/install/linux-postinstall/
The first time you push an image, Docker Hub may deny the push: https://stackoverflow.com/questions/41984399/denied-requested-access-to-the-resource-is-denied-docker
Numpy to xarray:
foo = xr.DataArray(data, coords=[times, locs], dims=["time", "space"])
data = ds_merra2_jjas.DUSCATAU.sel(time='2002').values[0,:,:]
lats_ = ds_merra2_jjas.DUSCATAU.sel(time='2002').lat.values
lons_ = ds_merra2_jjas.DUSCATAU.sel(time='2002').lon.values
ds_merra2_jjas_new = xr.DataArray(data, coords=[lats_, lons_], dims=["lat", "lon"])
Using matplotlib to make map plots:
plt.contourf(ds_merra2_jjas.DUSCATAU.sel(time='2002').lon.values,
             ds_merra2_jjas.DUSCATAU.sel(time='2002').lat.values,
             ds_merra2_jjas.DUSCATAU.sel(time='2002').values[0,:,:],
             cmap='bwr')
plt.colorbar()
Sometimes an xarray plot might show up blank; selecting the area of interest first should resolve that.
Pattern correlation formula: https://www.mdpi.com/2073-4441/10/1/28 (weights may be used for the pattern correlation as well)
For the weights, see: https://stackoverflow.com/questions/58881607/calculating-the-cosine-of-latitude-as-weights-for-gridded-data
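A rough sketch of a cosine-latitude-weighted pattern correlation in plain Python. It uses one common centred definition (weighted covariance over the product of weighted standard deviations), which may differ in detail from the formula in the linked paper; the function names and sample fields are illustrative.

```python
import math


def cos_lat_weights(lats_deg):
    """Area weights proportional to the cosine of latitude (degrees in)."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]


def weighted_pattern_correlation(field_a, field_b, lats_deg):
    """Centred, area-weighted correlation of two lat x lon fields.

    Each field is a list of rows, one row per latitude in lats_deg.
    """
    weights = cos_lat_weights(lats_deg)
    # One (weight, a, b) triple per grid point; a whole row shares one weight.
    points = [
        (w, a, b)
        for w, row_a, row_b in zip(weights, field_a, field_b)
        for a, b in zip(row_a, row_b)
    ]
    total = sum(w for w, _, _ in points)
    mean_a = sum(w * a for w, a, _ in points) / total
    mean_b = sum(w * b for w, _, b in points) / total
    cov = sum(w * (a - mean_a) * (b - mean_b) for w, a, b in points)
    var_a = sum(w * (a - mean_a) ** 2 for w, a, _ in points)
    var_b = sum(w * (b - mean_b) ** 2 for w, _, b in points)
    return cov / math.sqrt(var_a * var_b)


lats = [-30.0, 0.0, 30.0]
obs = [[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [0.0, 1.0, 5.0]]
# A linear transform of obs, so the correlation is 1 by construction.
model = [[2.0 * v + 1.0 for v in row] for row in obs]
print(weighted_pattern_correlation(obs, model, lats))
```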
When a package is otherwise difficult to install (for example ESMF), we can set compiler environment variables such as CC and FC so that the build run by conda uses that particular compiler. This saves a lot of time and effort. https://stackoverflow.com/questions/59284298/conda-install-c-anaconda-gcc-linux-64-not-being-used Many build tools such as make and CMake search by default for a compiler named simply gcc, so these environment variables point them to the correct compiler.
When using the isin function with sel, we can at present use it only once in a call. To apply it a second time, assign the result to a new variable first.
Installing PyRQA (Runs only with python 2.7) https://github.com/szhan/pyrqa
conda install https://anaconda.org/conda-forge/pytools/2017.2/download/linux-64/pytools-2017.2-py27_0.tar.bz2
conda install https://anaconda.org/conda-forge/pyopencl/2018.1.1/download/linux-64/pyopencl-2018.1.1-py27_1.tar.bz2
conda install -c conda-forge pocl
pip install Mako
pip install PyRQA
Even after all this, PyRQA did not run smoothly. However, these steps ensured that the environment for running PyRQA was set up correctly. Next, clone the GitHub repository; inside the main repository pyrqa there is a folder named pyrqa. Copy that folder to your desired location, rename it (say, to PYRQA), and use the library as PYRQA.
Logging in to a remote server without a password https://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/
Create xarray dataset from dataarrays
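A minimal sketch of two ways to do this, assuming the DataArrays share the same coordinates. The variable names `temp` and `precip` and the grid values are made up for illustration.

```python
import numpy as np
import xarray as xr

lats = np.array([10.0, 20.0])
lons = np.array([70.0, 80.0, 90.0])

temp = xr.DataArray(
    np.zeros((2, 3)), coords=[lats, lons], dims=["lat", "lon"], name="temp"
)
precip = xr.DataArray(
    np.ones((2, 3)), coords=[lats, lons], dims=["lat", "lon"], name="precip"
)

# Option 1: pass a dict mapping variable names to DataArrays.
ds = xr.Dataset({"temp": temp, "precip": precip})

# Option 2: merge named DataArrays; each name becomes a data variable.
ds_merged = xr.merge([temp, precip])

print(ds)
```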
Use tmux to run a process in the background: https://medium.com/@praveendhawan/tmux-run-commands-in-the-background-bad007810318
Indentation error when transferring from jupyter notebook to python script https://stackoverflow.com/questions/1024435/how-to-fix-python-indentation
Correcting aspect ratio in cartopy: ax[i,j].set_aspect('auto')