{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"nbsphinx": "hidden"
},
"source": [
"[prev: Aperçu de l'écosystème](intro.ipynb) | [home](../index.ipynb) | [next: Scipy](scipy.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Numpy\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## La structure de base : le *array*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"La contribution majeure de Numpy est de proposer une implémentation performante de tableaux uniformes multi-dimensionnels : le `array`"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# on importe le package numpy.\n",
"# il est très fréquent d'abréger son nom en 'np'\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 0., 3.],\n",
" [0., 1., 5.]])"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Le array est un conteneur qui peut être initialisé\n",
"# avec une liste, une liste de listes, une liste de listes de listes, ...\n",
"# le niveau d'imbrication décrit le nombre de dimensions du array.\n",
"x = np.array([[1, 0.0, 3], [0, 1, 5]])\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"2"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 'ndim' est le nombre de dimensions du array\n",
"x.ndim"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(2, 3)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 'shape' informe sur la taille de chaque dimension\n",
"# Dans l'exemple, x contient 2 listes à 3 éléments.\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"dtype('float64')"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Contrairement aux conteneurs 'classiques', tous les éléments d'un array dovient être du même type.\n",
"# Dans l'exemple, des flottants.\n",
"x.dtype"
]
},
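  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The single-type rule can be seen at work when the input mixes integers and floats. The snippet below is only an illustrative sketch, kept as text so that it does not touch the notebook's variables; the names `ints`, `mixed` and `forced` are made up for the example.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "ints = np.array([1, 2, 3])                   # only integers: an integer dtype is chosen\n",
    "mixed = np.array([1, 2.5, 3])                # one float promotes every element to float64\n",
    "forced = np.array([1, 2, 3], dtype=float)    # explicit request for floats\n",
    "\n",
    "print(ints.dtype.kind)   # 'i' (the exact integer width depends on the platform)\n",
    "print(mixed.dtype)       # float64\n",
    "print(forced)            # [1. 2. 3.]\n",
    "```\n",
    "\n",
    "Passing `dtype=` explicitly is the safer choice whenever the element type matters for later computations."
   ]
  },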
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(nan, inf)"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Numpy dispose de types pour gérer des valeurs non-numériques spécifiques : \"Not A Number\", et \"Infinity\".\n",
"np.NAN, np.Inf"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"float64\n"
]
},
{
"data": {
"text/plain": [
"array([nan, 2.])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Ces types peuvent cohabiter avec des valeurs numériques\n",
"y = np.array([np.NaN, 2], dtype=float)\n",
"print(y.dtype)\n",
"y"
]
},
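  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because NaN never compares equal to anything, including itself, special values are normally detected with dedicated functions rather than with `==`. A minimal sketch, where the array `y` is redefined locally just for the illustration:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "y = np.array([np.nan, 2.0, np.inf])\n",
    "\n",
    "print(np.nan == np.nan)   # False: NaN is not equal to itself\n",
    "print(np.isnan(y))        # [ True False False]\n",
    "print(np.isfinite(y))     # [False  True False]\n",
    "print(np.nansum(y[:2]))   # 2.0 (the nan* reductions skip NaN)\n",
    "```"
   ]
  },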
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 0, 0],\n",
" [0, 0, 0]])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Des fonctions existent pour créer des array aux remplissages particuliers.\n",
"# Un array de 0\n",
"np.zeros((2, 3), dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1., 1., 1., 1., 1.])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Un array de 1\n",
"np.ones(5)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-1, -1, 0, 0],\n",
" [ 0, 0, 0, 0],\n",
" [ 0, 0, 0, 0],\n",
" [ 0, 0, 0, 0],\n",
" [ 0, 0, 0, 0]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Un array avec un contenu non prédéfini, à remplir par la suite\n",
"# (le contenu initial du array sera conditionné par ce qu'il y a en mémoire, mais n'épiloguons pas sur le sujet)\n",
"np.empty((5,4), dtype=int)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Au delà des array, numpy dispose de plusieurs fonctions pratiques\n",
"# L'équivalent du range() de Python, mais qui retourne un array\n",
"np.arange(6)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0. , 1.5, 3. , 4.5, 6. ])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Le pendant de np.arange, pour lequel on ne précise pas le pas mais le nombre de valeurs\n",
"np.linspace(0,6,5)"
]
},
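  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the difference concrete, here is a small side-by-side sketch of the two functions over the same interval. The names `by_step` and `by_count` are illustrative only.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "by_step = np.arange(0, 6, 1.5)     # step of 1.5, the stop value 6 is excluded\n",
    "by_count = np.linspace(0, 6, 5)    # 5 values, the stop value 6 is included\n",
    "\n",
    "print(by_step)    # [0.  1.5 3.  4.5]\n",
    "print(by_count)   # [0.  1.5 3.  4.5 6. ]\n",
    "```\n",
    "\n",
    "With non-integer steps, `np.linspace` is usually the more predictable choice because the number of elements does not depend on floating-point rounding."
   ]
  },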
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# Et bien plus encore..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Indexation\n",
"\n",
"L'accès aux éléments d'un `array` est plus souple que dans le cas des conteneurs de base."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"int32\n"
]
},
{
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12]])"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# On créer un array d'entiers\n",
"x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
"print(x.dtype)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3, 4])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Le premier indice permet d'accéder aux lignes, ...\n",
"x[0]"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# ... et le deuxième indice aux colonnes (etc pour les array de dimensions supérieures)\n",
"x[0][2] # marche mais peut mieux faire"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# ... et le deuxième indice aux colonnes (etc pour les array de dimensions supérieures)\n",
"x[0, 2] # voilà, là c'est plus propre"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 5, 9])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# On peut accéder aux colonnes en utilisant un slice sur le première indice.\n",
"x[:, 0]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12]])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Le contenu d'un array peut être modifié\n",
"x[0, 0] = -1\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0, 2, 3, 4],\n",
" [ 0, 6, 7, 8],\n",
" [ 0, 10, 11, 12]])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Il est possible de remplacer plusieurs éléments par la même valeur d'un seul coup.\n",
"x[:, 0] = 0\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0, 3],\n",
" [ 0, 7],\n",
" [ 0, 11]])"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Toutes les fonctionnalités des slices sont disponibles: arr[start:stop:step]\n",
"x[:, ::2]"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0.10193638, 0.31366494, 0.58893349, 0.62615365],\n",
" [0.69863562, 0.22738459, 0.85689817, 0.20049676],\n",
" [0.73153055, 0.7271929 , 0.74053103, 0.70424826],\n",
" [0.07807063, 0.90004515, 0.83373539, 0.57301106],\n",
" [0.99646386, 0.19844358, 0.83802383, 0.63492711]])"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Un accès *très* utile : l'indexation par tableau de booléens\n",
"a = np.random.random((5,4))\n",
"a"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ True, True, False, False],\n",
" [False, True, False, True],\n",
" [False, False, False, False],\n",
" [ True, False, False, False],\n",
" [False, True, False, False]])"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Admettons : on veut tronquer les valeurs inférieures à 0.5.\n",
"# On commence par se créer un \"masque\"\n",
"small = a < 0.5\n",
"small"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0. , 0. , 0.58893349, 0.62615365],\n",
" [0.69863562, 0. , 0.85689817, 0. ],\n",
" [0.73153055, 0.7271929 , 0.74053103, 0.70424826],\n",
" [0. , 0.90004515, 0.83373539, 0.57301106],\n",
" [0.99646386, 0. , 0.83802383, 0.63492711]])"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# On accède au array par le \"masque\"...\n",
"a[small] = 0\n",
"# ... et le tour est joué!\n",
"a"
]
},
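  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mask can also be used without modifying the original array. The sketch below is one possible variant, not part of the original walkthrough: it assumes a seeded generator from `np.random.default_rng` for reproducibility, and the name `truncated` is illustrative.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)        # seeded generator, so the run is reproducible\n",
    "a = rng.random((5, 4))\n",
    "\n",
    "small = a < 0.5\n",
    "truncated = np.where(small, 0.0, a)   # new array; 'a' itself is left untouched\n",
    "\n",
    "print(small.sum())                    # number of values below 0.5\n",
    "print(a[small])                       # 1D array holding only the selected values\n",
    "```\n",
    "\n",
    "Reading through a mask (`a[small]`) always returns a flat copy of the selected elements, whereas assigning through it (`a[small] = 0`) writes into the array itself."
   ]
  },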
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Arithmétique\n",
"\n",
"Les opérations arithmétiques sur `array` suivent la convention de l'algèbre linéaire (et sont donc plus intuitive)."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Créons un array\n",
"x = np.arange(5)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0. , 2.5, 5. , 7.5, 10. ])"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Les opérations entre un array et un nombre sont effectuées sur tous les éléments du array\n",
"# Exemple de la multiplication :\n",
"# (pour rappel l'opération float * list dans Python duplique la liste)\n",
"2.5 * x"
]
},
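  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For comparison, here is how the same multiplications behave on a plain Python list. This is a short illustrative sketch; the name `values` is not used elsewhere in the notebook.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "values = [0, 1, 2]\n",
    "\n",
    "print(2 * values)              # [0, 1, 2, 0, 1, 2]: the list is repeated\n",
    "print(2 * np.array(values))    # [0 2 4]: each element is doubled\n",
    "\n",
    "try:\n",
    "    2.5 * values\n",
    "except TypeError as err:\n",
    "    print('TypeError:', err)   # a list cannot be multiplied by a float\n",
    "```"
   ]
  },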
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# Les opérations entre array de même taille s'effectuent élément par élément.\n",
"y = np.array([10, 11, 12, 13, 14])"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([10, 12, 14, 16, 18])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Les opérations entre array de même taille s'effectuent élément par élément.\n",
"x + y"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 11, 24, 39, 56])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Les opérations entre array de même taille s'effectuent élément par élément.\n",
"x * y"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3.16227766, 3.31662479, 3.46410162, 3.60555128, 3.74165739])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Numpy dispose de nombreuses fonctions mathématiques : trigo, log, exp, ...\n",
"# Les fonctions de Numpy peuvent être appelées sur des array, auquel cas l'opération est appliquée sur tous les éléments.\n",
"np.sqrt(y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Toutes les fonctions disponibles : http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 0, 10, 20, 30, 40],\n",
" [ 0, 11, 22, 33, 44],\n",
" [ 0, 12, 24, 36, 48],\n",
" [ 0, 13, 26, 39, 52],\n",
" [ 0, 14, 28, 42, 56]])"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Une dernière remarque : numpy peut traiter les opérations arithmétiques entre array de dimensions différentes,\n",
"# on parle de \"broadcasting\".\n",
"# Exemple d'application au produit tensoriel :\n",
"# On redimensionne x pour avoir un vecteur ligne.\n",
"x = x.reshape((1,5))\n",
"# On redimensionne y pour avoir un vecteur colonne.\n",
"y = y.reshape((5,1))\n",
"# Leur produit donne un array de dimensions (5,5).\n",
"x*y"
]
},
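  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check on the broadcasting example, the result can be compared with NumPy's dedicated `np.outer` function. This is a sketch on locally redefined arrays; the names `row`, `col` and `product` are illustrative.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "x = np.arange(5)                      # shape (5,)\n",
    "y = np.array([10, 11, 12, 13, 14])    # shape (5,)\n",
    "\n",
    "row = x.reshape((1, 5))\n",
    "col = y.reshape((5, 1))\n",
    "product = row * col                   # shapes (1, 5) and (5, 1) broadcast to (5, 5)\n",
    "\n",
    "print(product.shape)                             # (5, 5)\n",
    "print(np.array_equal(product, np.outer(y, x)))   # True\n",
    "```\n",
    "\n",
    "Broadcasting stretches every dimension of size 1 to match the other operand, which is why a row times a column fills the whole grid."
   ]
  },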
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
"
],
"text/plain": [
""
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from IPython.display import Image\n",
"Image(width=600, url='https://scipy-lectures.github.io/_images/numpy_broadcasting.png')\n",
"# source: http://scipy-lectures.github.io"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Changer la forme\n",
"\n",
"Il est possible de changer la forme (*shape*) d'un array sans faire de copie (mais pas toujours)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5])"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.arange(6)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1, 2],\n",
" [3, 4, 5]])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# on peut voir le contenu de x sous la forme d'un array 2d\n",
"y = x.reshape((2, 3))\n",
"y"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-1 1 2 3 4 5]\n",
"[[-1 1 2]\n",
" [ 3 4 5]]\n"
]
}
],
"source": [
"# l'information est partagée, pas copiée, on parle de différentes 'views' sur la même donnée.\n",
"# modifier le contenu de x a un effet sur y\n",
"x[0] = -1\n",
"print(x)\n",
"print(y)"
]
},
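  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Whether two arrays share their data can be checked with `np.shares_memory`. The sketch below also shows one case where `reshape` cannot return a view and has to copy. The names `view`, `independent` and `flat` are illustrative only.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "x = np.arange(6)\n",
    "view = x.reshape((2, 3))          # shares its data with x\n",
    "independent = view.copy()         # owns a separate copy of the data\n",
    "flat = view.T.reshape(6)          # transposed data is not contiguous: reshape has to copy\n",
    "\n",
    "x[0] = -1\n",
    "\n",
    "print(np.shares_memory(x, view))          # True\n",
    "print(np.shares_memory(x, independent))   # False\n",
    "print(np.shares_memory(x, flat))          # False\n",
    "print(view[0, 0], independent[0, 0])      # -1 0\n",
    "```\n",
    "\n",
    "Calling `.copy()` explicitly is the simplest way to make sure later modifications do not leak into another array."
   ]
  },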
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-1],\n",
" [ 1],\n",
" [ 2],\n",
" [ 3],\n",
" [ 4],\n",
" [ 5]])"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# on peut aussi utiliser l'indexation pour ajouter des dimensions\n",
"x[:, np.newaxis]"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[0, 1, 2, 3, 4],\n",
" [1, 2, 3, 4, 5],\n",
" [2, 3, 4, 5, 6]])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# ce comportement se combine bien avec le broadcasting\n",
"a = np.arange(3)\n",
"b = np.arange(5)\n",
"a[:, np.newaxis] + b[np.newaxis, :]"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[-1, 1],\n",
" [ 2, 3],\n",
" [ 4, 5]])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# on peut modifier directement la forme d'un array\n",
"x.shape = (3, 2)\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Opérations sur les arrays\n",
"\n",
"Quelques fonctions utiles parmi d'autres : np.where(), np.sum(), np.maximum(), np.minimum()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### np.where() : \"mélanger\" deux arrays suivant une condition"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.arange(10)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.where(x<5, 0, 1)"
]
},
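  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The two alternatives passed to `np.where` can be full arrays rather than scalars, which is where the \"mixing\" comes from. A minimal sketch, with the illustrative name `mixed`:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "x = np.arange(10)\n",
    "\n",
    "# Take the element from x where it is even, and from -x where it is odd.\n",
    "mixed = np.where(x % 2 == 0, x, -x)\n",
    "\n",
    "print(mixed)   # [ 0 -1  2 -3  4 -5  6 -7  8 -9]\n",
    "```"
   ]
  },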
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### np.sum() : sommer un array selon un axe"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 1, 2, 3, 4],\n",
" [ 5, 6, 7, 8],\n",
" [ 9, 10, 11, 12]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([15, 18, 21, 24])"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.sum(x, axis=0)"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([10, 26, 42])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.sum(x, axis=1)"
]
},
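  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Two further options of `np.sum` are worth knowing: omitting `axis` sums every element, and `keepdims=True` keeps the summed axis with size 1 so that the result broadcasts against the original array. A sketch, where `row_totals` is an illustrative name:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "x = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])\n",
    "\n",
    "print(np.sum(x))                                # 78: without 'axis', every element is summed\n",
    "row_totals = np.sum(x, axis=1, keepdims=True)   # shape (3, 1) instead of (3,)\n",
    "print(row_totals.shape)                         # (3, 1)\n",
    "print(x / row_totals)                           # each row divided by its own total\n",
    "```"
   ]
  },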
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### np.maximum(a, b) : construit un array composé du maximum entre a et b (avec du broadcasting)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.arange(10)\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([4, 4, 4, 4, 4, 5, 6, 7, 8, 9])"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.maximum(x, 4)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 4, 4, 4, 4, 4])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"np.minimum(x, 4)"
]
},
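  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Combining the two functions bounds the values on both sides, which is what `np.clip` does in a single call. A short sketch with the illustrative names `clipped` and `same`:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "x = np.arange(10)\n",
    "\n",
    "clipped = np.clip(x, 3, 6)                  # values forced into the interval [3, 6]\n",
    "same = np.minimum(np.maximum(x, 3), 6)      # equivalent spelling with maximum/minimum\n",
    "\n",
    "print(clipped)                          # [3 3 3 3 4 5 6 6 6 6]\n",
    "print(np.array_equal(clipped, same))    # True\n",
    "```"
   ]
  },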
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercices"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}