Verified commit bc5ef9dc, authored by Benoit SEIGNOVERT

Update Welcome notebook with planning

parent 902c58ee
%% Cell type:markdown id:0ef87507-4419-480b-9e05-2e1444af60d1 tags:
<center><img src="https://s3.glicid.fr/nuts/workshop-banner.svg"/></center>
# Welcome to the NuTS workshop practicals
The practicals were prepared and will be presented by [Léonard Seydoux](https://leonard-seydoux.github.io/) with the help of [Benoît Seignovert](https://benoit.seignovert.fr).
> This Jupyter environment **is already preconfigured** for you: you just have to click on the links below to open the notebooks and edit them.
> If you want to, you can download the notebooks individually with the file explorer (on the left).
> All the notebooks are also available in the [NuTS Gitlab repo](https://gitlab.univ-nantes.fr/nuts/tp/machine-learning).
## Thursday, June 1<sup>st</sup>
### Atelier 1: First steps with data preparation (8h-12h)
Here we will investigate the basic statistical properties of the data and get a feeling for it. Do we have enough training points? Is the data stationary? Which features will be relevant for the task at hand? Can we avoid overfitting? A minimal illustration of these checks is sketched after the notebook list below.
- [Basics of programming with Python and scientific libraries](machine-learning/1_inspection/1_check_basics.ipynb)
- [River load sensor calibration](machine-learning/1_inspection/2_calibration.ipynb)
- [Classification of Iris dataset using various classifiers](machine-learning/2_iris/iris_classification.ipynb)
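As a first illustration of these checks (a minimal sketch only, using the Iris dataset and two classifiers chosen arbitrarily; the actual notebooks use their own data and pipelines), one can hold out a test set and compare train and test scores to get a first feeling for overfitting:
``` python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Keep a held-out test set: a large train/test gap hints at overfitting.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Compare two simple classifiers on the same split.
for clf in (LogisticRegression(max_iter=1000), RandomForestClassifier(n_estimators=100)):
    clf.fit(X_train, y_train)
    print(type(clf).__name__,
          '- train:', round(clf.score(X_train, y_train), 3),
          '- test:', round(clf.score(X_test, y_test), 3))
```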
🍴 Lunch break and posters (12h-14h)
### Atelier 2: Machine learning approaches (14h-18h)
While deep learning is powerful enough to solve a vast majority of problems, machine-learning approaches, when successful, provide more insight into the physics at play. We will investigate and criticize several supervised and unsupervised approaches to classify clouds of Lidar points (scenes) and infer the type of structure visible therein; a small comparison of the two paradigms is sketched after the notebook list below.
- [Label a subset of a lidar point cloud](machine-learning/3_lidar/label.ipynb)
- [Three-dimensional lidar data classification of complex natural scenes with multi-scale features](machine-learning/3_lidar/lidar.ipynb)
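As a toy contrast between the two paradigms (an illustrative sketch on synthetic 3-D points, not the multi-scale Lidar pipeline of the notebooks; the point clouds and parameter choices here are assumptions), a supervised classifier uses the labels while a clustering algorithm has to recover the structure on its own:
``` python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Two synthetic point clouds: a thin, ground-like layer and a thicker, canopy-like one.
rng = np.random.default_rng(0)
ground = rng.normal([0, 0, 0], [5, 5, 0.1], size=(500, 3))
canopy = rng.normal([0, 0, 3], [5, 5, 1.0], size=(500, 3))
X = np.vstack([ground, canopy])
y = np.repeat([0, 1], 500)

# Supervised: learn the two classes from labelled points.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier().fit(X_train, y_train)
print('Random forest test accuracy:', round(clf.score(X_test, y_test), 3))

# Unsupervised: group the same points without ever seeing the labels.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print('K-means cluster sizes:', np.bincount(clusters))
```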
## Friday, June 2<sup>nd</sup>
### Atelier 3: Deep Learning with PyTorch (9h-12h)
When the previous attempts to solve the problem are unsuccessful, or when defining features is a challenge because of the dimensionality of the input data, we can also learn the features that best solve the task. Here we will revisit the Lidar problem with deep-learning approaches and see how to efficiently design a neural network for the classification task; a minimal network definition is sketched after the notebook list below.
- [MNIST classification with a fully connected neural network](deep-learning/session_1a_fcnn.ipynb)
- [MNIST classification with a convolutional neural network](deep-learning/session_1b_cnn.ipynb)
- [Understanding and training PhaseNet on a local dataset](deep-learning/session_2_phasnet.ipynb)
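As a minimal sketch of what designing such a network looks like in PyTorch (the input shape assumes MNIST-like 28x28 grayscale images, and the layer sizes are arbitrary choices, not those of the workshop notebooks):
``` python
import torch
from torch import nn

# A small fully connected classifier for 28x28 grayscale images and 10 classes.
model = nn.Sequential(
    nn.Flatten(),            # (batch, 1, 28, 28) -> (batch, 784)
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),      # one logit per class
)

# One dummy forward/backward pass to check shapes and gradients.
images = torch.randn(8, 1, 28, 28)
labels = torch.randint(0, 10, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
print('Batch loss:', float(loss))
```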
🍴 Lunch break and posters (12h-14h)
### Atelier 4: Deeper dive into artificial intelligence (14h-15h30)
This last session will be a free, hands-on class on various problems. We invite participants to bring datasets and tasks of interest so that we can sit down together and try machine-learning solutions. We will also dive deeper into the algorithms, try to improve performance, and discuss good practices in machine and deep learning.
<center><img src="https://s3.glicid.fr/nuts/workshop-footer.svg"/></center>
%% Cell type:markdown id:096bcfe7-b208-4567-b9ee-8290b8dc325e tags:
## Explore your environment configuration
### Hardware config
%% Cell type:code id:c0246502-79e3-4f50-9f95-4e4ff0045547 tags:
``` python
from os import environ
from psutil import cpu_count, virtual_memory
```
%% Cell type:code id:f07d76ac-e6d0-4fe5-8130-0fdd36ac815c tags:
``` python
print('Username:   ', environ.get('USER'))
print('SLURM node: ', environ.get('SLURMD_NODENAME'))

print('CPU core:     ', cpu_count(logical=False))
print('CPU threads:  ', cpu_count(logical=True))
print('RAM total:    ', round(virtual_memory().total / 1024 ** 3, 1), 'GB')
print('RAM available:', round(virtual_memory().available / 1024 ** 3, 1), 'GB')
```
%% Output
Username: user-id@domain.fr
SLURM node: budbud020
CPU core: 32
CPU threads: 64
RAM total: 251.5 GB
RAM available: 243.1 GB
%% Cell type:markdown id:ac43551d-8768-4f9b-9d19-41b6ff9f2660 tags:
### Python packages
%% Cell type:code id:47a04f49-75b4-45ca-977e-a27f4a1af324 tags:
``` python
from platform import python_version
import numpy, matplotlib, seaborn, pandas, torch, sklearn, obspy, seisbench
```
%% Cell type:code id:fd3e0810-01cc-4017-a6e4-25542a63bb84 tags:
``` python
print('Python: ', python_version())
print('Numpy: ', numpy.__version__)
print('Matplotlib: ', matplotlib.__version__)
print('Seaborn: ', seaborn.__version__)
print('Pandas: ', pandas.__version__)
print('PyTorch: ', torch.__version__)
print('Scikit Learn: ', sklearn.__version__)
print('Obspy: ', obspy.__version__)
print('Seisbench: ', seisbench.__version__)
```
%% Output
Python: 3.9.16
Numpy: 1.24.3
Matplotlib: 3.7.1
Seaborn: 0.12.2
Pandas: 1.5.3
PyTorch: 1.13.1
Scikit Learn: 1.2.2
Obspy: 1.4.0
Seisbench: 0.4.0
%% Cell type:markdown id:adc1930d-503c-4caf-b79a-c226b39ddf7d tags:
### PyTorch GPU configuration
%% Cell type:code id:7bc5a46e-caac-4ac6-94c8-efd317a9b8d2 tags:
``` python
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

if device.type == 'cuda':
    print('Cuda version:', torch.version.cuda)

    for i in range(torch.cuda.device_count()):
        print()
        print(torch.cuda.get_device_name(i))
        print('Memory Usage:')
        print('Allocated:', round(torch.cuda.memory_allocated(i) / 1024 ** 3, 1), 'GB')
        print('Cached:   ', round(torch.cuda.memory_reserved(i) / 1024 ** 3, 1), 'GB')
```
%% Output
Using device: cuda
Cuda version: 11.7

NVIDIA A100-PCIE-40GB
Memory Usage:
Allocated: 0.0 GB
Cached: 0.0 GB

NVIDIA A100-PCIE-40GB
Memory Usage:
Allocated: 0.0 GB
Cached: 0.0 GB
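As a quick usage check of the detected device (a minimal sketch; the tensor and its size are arbitrary), a tensor can be allocated directly on it:
``` python
# Allocate a small random tensor on the detected device (GPU if available).
x = torch.randn(3, 3, device=device)
print(x.device)
```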
%% Cell type:markdown id:55e02b91-907b-469a-b7e8-6de4d2c16599 tags:
### Seisbench datasets
%% Cell type:code id:e2c6e92d-c58c-473d-bdd0-5dddc4580c0f tags:
``` python
from seisbench.data import Iquique, ETHZ
```
%% Cell type:code id:af776464-eb23-4308-bf43-8244928d5d6b tags:
``` python
# data = Iquique()
data = ETHZ()
print(data)
data.metadata.head()
```
%% Output
2023-05-30 12:00:44,151 | seisbench | WARNING | Check available storage and memory before downloading and general use of ETHZ dataset. Dataset size: waveforms.hdf5 ~22Gb, metadata.csv ~13Mb
2023-05-30 12:00:44,897 | seisbench | WARNING | Data set contains mixed sampling rate, but no sampling rate was specified for the dataset. get_waveforms will return mixed sampling rate waveforms.
ETHZ - 36743 traces
   index   source_id           source_origin_time  \
0      0  2020zmwrjy  2020-12-27T02:46:42.620452Z
1      1  2020zmwrjy  2020-12-27T02:46:42.620452Z
2      2  2020zmwrjy  2020-12-27T02:46:42.620452Z
3      3  2020zmwrjy  2020-12-27T02:46:42.620452Z
4      4  2020zmwrjy  2020-12-27T02:46:42.620452Z
   source_origin_uncertainty_sec  source_latitude_deg  \
0                            NaN             47.147641
1                            NaN             47.147641
2                            NaN             47.147641
3                            NaN             47.147641
4                            NaN             47.147641
   source_latitude_uncertainty_km  source_longitude_deg  \
0                        0.620493               6.371343
1                        0.620493               6.371343
2                        0.620493               6.371343
3                        0.620493               6.371343
4                        0.620493               6.371343
   source_longitude_uncertainty_km  source_depth_km  \
0                          0.927755        10.965625
1                          0.927755        10.965625
2                          0.927755        10.965625
3                          0.927755        10.965625
4                          0.927755        10.965625
   source_depth_uncertainty_km  ...  trace_Pn_status  trace_Pn_polarity  \
0                      2.116958  ...              NaN                NaN
1                      2.116958  ...              NaN                NaN
2                      2.116958  ...              NaN                NaN
3                      2.116958  ...              NaN                NaN
4                      2.116958  ...              NaN                NaN
   trace_P_arrival_sample  trace_P_status  trace_P_polarity  \
0                     NaN             NaN               NaN
1                     NaN             NaN               NaN
2                     NaN             NaN               NaN
3                     NaN             NaN               NaN
4                     NaN             NaN               NaN
   trace_Sn_arrival_sample  trace_Sn_status  trace_Sn_polarity  trace_chunk  \
0                      NaN              NaN                NaN
1                      NaN              NaN                NaN
2                      NaN              NaN                NaN
3                      NaN              NaN                NaN
4                      NaN              NaN                NaN
   trace_component_order
0                    ZNE
1                    ZNE
2                    ZNE
3                    ZNE
4                    ZNE
[5 rows x 58 columns]
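Individual traces can then be pulled out of the dataset with `get_waveforms`, the method mentioned in the warning above (a minimal sketch; trace index 0 is an arbitrary choice, and the first component is Z given the ZNE order shown in the metadata):
``` python
import matplotlib.pyplot as plt

# Fetch the first trace as a (channels, samples) NumPy array and plot its vertical component.
waveform = data.get_waveforms(0)
print('Waveform shape:', waveform.shape)

plt.plot(waveform[0])
plt.xlabel('Sample')
plt.ylabel('Amplitude')
plt.show()
```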
%% Cell type:markdown id:bbd05dfd-14a5-4d13-bd86-bf74b3b7f009 tags:
### ObsPy API / HTTP proxy check
%% Cell type:code id:bafe58a4-396c-42da-9754-e60d31f382fa tags:
``` python
from obspy.clients.fdsn import Client
```
%% Cell type:code id:d4ff92bf-8137-4389-af95-48ff8a958e1f tags:
``` python
Client('ETH')
```
%% Output
FDSN Webservice Client (base url: http://eida.ethz.ch)
Available Services: 'dataselect' (v1.1.1), 'event' (v1.2.4), 'station' (v1.1.4), 'available_event_catalogs', 'available_event_contributors', 'eida-auth'
Use e.g. client.help('dataselect') for the
parameter description of the individual services
or client.help() for parameter description of
all webservices.
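If the client resolves as above, the proxy configuration can be exercised further with a small data request (a minimal sketch; the station `DAVOX`, channel `HHZ`, and time window are arbitrary examples, not part of the original notebook):
``` python
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

# Request one minute of vertical-component data from the ETH FDSN web service.
client = Client('ETH')
t0 = UTCDateTime('2020-12-27T02:46:00')
stream = client.get_waveforms(network='CH', station='DAVOX', location='*',
                              channel='HHZ', starttime=t0, endtime=t0 + 60)
print(stream)
```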
%% Cell type:markdown id:b4f41f45-c6b5-4ae1-9506-c6981f91343d tags:
<center><img src="https://s3.glicid.fr/nuts/workshop-footer.svg"/></center>