Auto-update: 2025-10-23 19:23:57

divingeek 2025-10-23 19:23:57 +02:00
parent d2f061c9f9
commit a7b5fc79b4
2 changed files with 2915 additions and 6267 deletions

File diff suppressed because one or more lines are too long

@ -4,13 +4,11 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Classification des Micro-régions de Corse par K-NN\n", "# Classification k-NN Corse - Version ipyleaflet\n",
"\n", "\n",
"Ce notebook implémente un système de classification des micro-régions corses basé sur l'algorithme des k plus proches voisins (k-NN). Cliquez sur la carte pour identifier la micro-région correspondante.\n", "Cette version utilise **ipyleaflet** avec des interactions Python natives (pas de JavaScript inject\u00e9).\n",
"\n", "\n",
"**Fichiers nécessaires :**\n", "**Avantage :** Fonctionne parfaitement dans Jupyter sans probl\u00e8me d'iframe ou de JavaScript."
"- `communes-de-corse-en-corse-et-francais.csv` : Liste des communes avec coordonnées GPS\n",
"- `communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv` : Territoires de projet par commune"
] ]
}, },
{ {
@ -19,8 +17,8 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Installation des bibliothèques nécessaires\n", "# Installation\n",
"!pip install folium pandas numpy scikit-learn --quiet" "!pip install ipyleaflet ipywidgets pandas numpy scikit-learn --quiet"
] ]
}, },
{ {
@ -31,18 +29,17 @@
"source": [ "source": [
"import pandas as pd\n", "import pandas as pd\n",
"import numpy as np\n", "import numpy as np\n",
"import folium\n", "from ipyleaflet import Map, CircleMarker, Marker, Polyline, LayerGroup, WidgetControl, AwesomeIcon\n",
"from ipywidgets import HTML, VBox, HBox, Label, IntSlider, Output\n",
"from sklearn.neighbors import KNeighborsClassifier\n", "from sklearn.neighbors import KNeighborsClassifier\n",
"from IPython.display import display, HTML\n", "from IPython.display import display\n"
"import json\n",
"import re"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 1. Chargement et préparation des données" "## 1. Chargement des donn\u00e9es"
] ]
}, },
{ {
@ -51,39 +48,13 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Chargement du fichier avec les coordonnées GPS\n", "# Chargement\n",
"df_coords = pd.read_csv('communes-de-corse-en-corse-et-francais.csv', \n", "df_coords = pd.read_csv('communes-de-corse-en-corse-et-francais.csv', sep=';', encoding='utf-8')\n",
" sep=';', encoding='utf-8')\n",
"\n",
"print(f\"Fichier coordonnées: {len(df_coords)} communes\")\n",
"print(\"\\nPremières lignes:\")\n",
"display(df_coords.head())\n",
"print(\"\\nColonnes disponibles:\")\n",
"print(df_coords.columns.tolist())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Chargement du fichier avec les territoires de projet\n",
"df_territoires = pd.read_csv('communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv', \n", "df_territoires = pd.read_csv('communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv', \n",
" sep=';', encoding='utf-8')\n", " sep=';', encoding='utf-8')\n",
"\n", "\n",
"print(f\"Fichier territoires: {len(df_territoires)} communes\")\n", "print(f\"\u2705 {len(df_coords)} communes avec coordonn\u00e9es\")\n",
"print(\"\\nPremières lignes:\")\n", "print(f\"\u2705 {len(df_territoires)} communes avec territoires\")"
"display(df_territoires.head())\n",
"print(\"\\nTerritoires de projet (micro-régions):\")\n",
"print(sorted(df_territoires['Territoire de projet'].unique()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Extraction des coordonnées GPS"
] ]
}, },
{ {
@ -92,43 +63,23 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Extraction coordonn\u00e9es\n",
"def extract_coordinates(point_geo_str):\n", "def extract_coordinates(point_geo_str):\n",
" \"\"\"\n",
" Extrait latitude et longitude de la colonne Point_Geo\n",
" Format attendu: \"41.984099158, 8.798384636\"\n",
" \"\"\"\n",
" if pd.isna(point_geo_str):\n", " if pd.isna(point_geo_str):\n",
" return None, None\n", " return None, None\n",
" \n",
" try:\n", " try:\n",
" # Supprimer les espaces et split par virgule\n",
" coords = str(point_geo_str).strip().split(',')\n", " coords = str(point_geo_str).strip().split(',')\n",
" if len(coords) == 2:\n", " if len(coords) == 2:\n",
" lat = float(coords[0].strip())\n", " return float(coords[0].strip()), float(coords[1].strip())\n",
" lon = float(coords[1].strip())\n",
" return lat, lon\n",
" except:\n", " except:\n",
" pass\n", " pass\n",
" \n",
" return None, None\n", " return None, None\n",
"\n", "\n",
"# Extraction des coordonnées\n",
"df_coords[['Latitude', 'Longitude']] = df_coords['Point_Geo'].apply(\n", "df_coords[['Latitude', 'Longitude']] = df_coords['Point_Geo'].apply(\n",
" lambda x: pd.Series(extract_coordinates(x))\n", " lambda x: pd.Series(extract_coordinates(x))\n",
")\n", ")\n",
"\n", "\n",
"# Vérification\n", "print(f\"\u2705 {df_coords['Latitude'].notna().sum()} coordonn\u00e9es extraites\")"
"print(\"Extraction des coordonnées:\")\n",
"print(f\"Communes avec coordonnées: {df_coords['Latitude'].notna().sum()}/{len(df_coords)}\")\n",
"print(\"\\nExemple:\")\n",
"display(df_coords[['Nom français', 'Latitude', 'Longitude']].head())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Fusion des deux fichiers"
] ]
}, },
{ {
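Review note on `extract_coordinates`: the bare `except:` kept in both versions also hides unrelated errors. Below is a minimal sketch of a narrower parser, assuming the `"lat, lon"` format of `Point_Geo` described in the removed docstring. The name `parse_point_geo` is hypothetical and not part of the notebook; it uses only the standard library, so it treats `None` as missing instead of calling `pd.isna`.

```python
def parse_point_geo(value):
    """Return (lat, lon) as floats, or None when the value cannot be parsed."""
    if value is None:
        return None
    parts = str(value).strip().split(",")
    if len(parts) != 2:
        return None  # not exactly "lat, lon"
    try:
        # float() tolerates surrounding whitespace, so no extra strip is needed
        lat, lon = float(parts[0]), float(parts[1])
    except ValueError:
        return None  # only malformed numbers are swallowed, nothing else
    return lat, lon
```

Returning a single `None` rather than `(None, None)` is a design choice of this sketch; the notebook's `apply(... pd.Series(...))` call would need the tuple form.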
@ -137,54 +88,30 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Normalisation des noms de communes pour la jointure\n", "# Fusion\n",
"def normalize_commune_name(name):\n", "def normalize(name):\n",
" \"\"\"\n", " return str(name).upper().strip() if not pd.isna(name) else ''\n",
" Normalise le nom d'une commune pour faciliter la jointure\n",
" \"\"\"\n",
" if pd.isna(name):\n",
" return ''\n",
" # Convertir en majuscules et supprimer les espaces multiples\n",
" return str(name).upper().strip()\n",
"\n", "\n",
"df_coords['Commune_norm'] = df_coords['Nom français'].apply(normalize_commune_name)\n", "df_coords['Commune_norm'] = df_coords['Nom fran\u00e7ais'].apply(normalize)\n",
"df_territoires['Commune_norm'] = df_territoires['Commune'].apply(normalize_commune_name)\n", "df_territoires['Commune_norm'] = df_territoires['Commune'].apply(normalize)\n",
"\n", "\n",
"# Fusion des deux dataframes\n", "df = pd.merge(df_coords, df_territoires[['Commune_norm', 'Territoire de projet']], \n",
"df = pd.merge(\n", " on='Commune_norm', how='inner')\n",
" df_coords,\n", "df['Commune'] = df['Nom fran\u00e7ais']\n",
" df_territoires[['Commune_norm', 'Territoire de projet']],\n",
" on='Commune_norm',\n",
" how='inner'\n",
")\n",
"\n",
"# Renommer pour cohérence\n",
"df['Commune'] = df['Nom français']\n",
"\n",
"print(f\"Fusion réussie: {len(df)} communes avec coordonnées ET territoire de projet\")\n",
"print(\"\\nAperçu des données fusionnées:\")\n",
"display(df[['Commune', 'Latitude', 'Longitude', 'Territoire de projet']].head(10))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Nettoyage: supprimer les lignes sans coordonnées\n",
"df_clean = df.dropna(subset=['Latitude', 'Longitude', 'Territoire de projet']).copy()\n", "df_clean = df.dropna(subset=['Latitude', 'Longitude', 'Territoire de projet']).copy()\n",
"\n", "\n",
"print(f\"\\n✅ Données finales: {len(df_clean)} communes prêtes pour la classification\")\n", "print(f\"\u2705 {len(df_clean)} communes fusionn\u00e9es\")\n",
"print(f\"\\nRépartition par micro-région:\")\n", "print(f\"\\nMicro-r\u00e9gions: {len(df_clean['Territoire de projet'].unique())}\")\n",
"print(df_clean['Territoire de projet'].value_counts().sort_index())" "for region in sorted(df_clean['Territoire de projet'].unique()):\n",
" count = (df_clean['Territoire de projet'] == region).sum()\n",
" print(f\" \u2022 {region}: {count} communes\")"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 4. Entraînement du modèle k-NN" "## 2. Entra\u00eenement k-NN"
] ]
}, },
{ {
@ -193,33 +120,24 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Préparation des données pour k-NN\n", "# Mod\u00e8le k-NN\n",
"X = df_clean[['Latitude', 'Longitude']].values\n", "X = df_clean[['Latitude', 'Longitude']].values\n",
"y = df_clean['Territoire de projet'].values\n", "y = df_clean['Territoire de projet'].values\n",
"\n", "\n",
"# Création du modèle k-NN avec k=5 (ajustable)\n", "knn = KNeighborsClassifier(n_neighbors=5, weights='distance', metric='haversine')\n",
"k = 5\n",
"knn = KNeighborsClassifier(n_neighbors=k, weights='distance', metric='haversine')\n",
"\n",
"# Conversion des coordonnées en radians pour la distance haversine\n",
"X_rad = np.radians(X)\n", "X_rad = np.radians(X)\n",
"\n",
"# Entraînement du modèle\n",
"knn.fit(X_rad, y)\n", "knn.fit(X_rad, y)\n",
"\n", "\n",
"print(f\"✅ Modèle k-NN entraîné avec k={k} voisins\")\n", "print(f\"\u2705 Mod\u00e8le k-NN entra\u00een\u00e9\")\n",
"print(f\"📊 Nombre de micro-régions: {len(np.unique(y))}\")\n", "print(f\"\u2705 {len(df_clean)} communes\")\n",
"print(f\"\\n🗺 Micro-régions identifiées:\")\n", "print(f\"\u2705 {len(np.unique(y))} micro-r\u00e9gions\")"
"for i, region in enumerate(sorted(np.unique(y)), 1):\n",
" count = (y == region).sum()\n",
" print(f\" {i:2d}. {region} ({count} communes)\")"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 5. Création de la carte interactive avec Folium" "## 3. Configuration des couleurs"
] ]
}, },
{ {
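Review note on the model cell: `weights='distance'` with `metric='haversine'` means each of the k neighbours votes with weight 1/d, where d is the great-circle distance. The sketch below re-implements that idea with the standard library only, to make the behaviour visible; it is not scikit-learn's code. `haversine_km` and `knn_predict` are hypothetical names, and the exact-match shortcut is a simplification.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # same constant the notebook uses to convert radians to km

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def knn_predict(points, lat, lon, k=5):
    """points: list of (lat, lon, label). Returns the label with the largest 1/d vote."""
    nearest = sorted(
        (haversine_km(lat, lon, p_lat, p_lon), label) for p_lat, p_lon, label in points
    )[:k]
    votes = defaultdict(float)
    for dist, label in nearest:
        if dist == 0.0:
            return label  # simplification: an exact match wins outright
        votes[label] += 1.0 / dist
    return max(votes, key=votes.get)

# Toy points (not taken from the CSV): two near the query, three far away.
toy = [(41.9192, 8.7386, "west"), (41.93, 8.75, "west"),
       (42.6973, 9.4509, "north"), (42.70, 9.45, "north"), (42.69, 9.44, "north")]
print(knn_predict(toy, 41.92, 8.74, k=5))
```

With k=5 an unweighted majority would pick the three distant points; the 1/d weighting lets the two close neighbours dominate, which is why the weighting matters near region borders.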
@ -228,17 +146,24 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Couleurs pour chaque micro-région\n", "# Couleurs par micro-r\u00e9gion\n",
"microregions = sorted(df_clean['Territoire de projet'].unique())\n", "microregions = sorted(df_clean['Territoire de projet'].unique())\n",
"colors = ['red', 'blue', 'green', 'purple', 'orange', 'darkred', \n", "colors = ['red', 'blue', 'green', 'purple', 'orange', 'darkred', \n",
" 'lightred', 'beige', 'darkblue', 'darkgreen', 'cadetblue', \n", " 'lightcoral', 'beige', 'darkblue', 'darkgreen', 'cadetblue', \n",
" 'darkpurple', 'pink', 'lightblue', 'lightgreen', 'gray', 'black', 'lightgray']\n", " 'darkviolet', 'pink', 'lightblue', 'lightgreen', 'gray']\n",
"\n", "\n",
"color_map = {region: colors[i % len(colors)] for i, region in enumerate(microregions)}\n", "color_map = {region: colors[i % len(colors)] for i, region in enumerate(microregions)}\n",
"\n", "\n",
"print(\"🎨 Carte des couleurs par micro-région:\")\n", "print(\"\u2705 Couleurs configur\u00e9es:\")\n",
"for region, color in sorted(color_map.items()):\n", "for region, color in sorted(color_map.items()):\n",
" print(f\" • {region}: {color}\")" " print(f\" \u2022 {region}: {color}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Cr\u00e9ation de la carte interactive"
] ]
}, },
{ {
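Review note on the palette change: `lightcoral` and `darkviolet` are valid CSS names for `CircleMarker`, but the same `color_map` values are later passed to `AwesomeIcon(marker_color=...)`. To my knowledge ipyleaflet restricts `marker_color` to the fixed Leaflet.awesome-markers palette, so those two names may be rejected. The set below is written from memory and should be checked against the installed ipyleaflet version; `ASSUMED_ICON_COLORS`, `icon_color_for`, and the gray fallback are hypothetical.

```python
# Assumed AwesomeIcon palette (from memory; verify against your ipyleaflet version).
ASSUMED_ICON_COLORS = {
    "white", "red", "darkred", "lightred", "orange", "beige", "green",
    "darkgreen", "lightgreen", "blue", "darkblue", "lightblue", "purple",
    "darkpurple", "pink", "cadetblue", "gray", "lightgray", "black",
}

# Palette from the new version of the notebook.
colors = ["red", "blue", "green", "purple", "orange", "darkred",
          "lightcoral", "beige", "darkblue", "darkgreen", "cadetblue",
          "darkviolet", "pink", "lightblue", "lightgreen", "gray"]

def icon_color_for(css_color, fallback="gray"):
    """Map a CSS color name to one the icon is assumed to accept."""
    return css_color if css_color in ASSUMED_ICON_COLORS else fallback

# Names that would need a fallback under the assumption above.
unsupported = [c for c in colors if c not in ASSUMED_ICON_COLORS]
print(unsupported)
```

If the assumption holds, keeping one map for circle markers and a second, icon-safe map for the prediction marker would avoid a validation error on click.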
@ -247,249 +172,42 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Coordonnées du centre de la Corse\n", "# Carte ipyleaflet\n",
"center_lat = df_clean['Latitude'].mean()\n", "center_lat = df_clean['Latitude'].mean()\n",
"center_lon = df_clean['Longitude'].mean()\n", "center_lon = df_clean['Longitude'].mean()\n",
"\n", "\n",
"print(f\"Centre de la carte: {center_lat:.4f}°N, {center_lon:.4f}°E\")" "m = Map(\n",
] " center=(center_lat, center_lon),\n",
}, " zoom=9,\n",
{ " scroll_wheel_zoom=True\n",
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Carte interactive avec prédiction au clic"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Création de la carte interactive\n",
"m_interactive = folium.Map(\n",
" location=[center_lat, center_lon],\n",
" zoom_start=9,\n",
" tiles='OpenStreetMap'\n",
")\n", ")\n",
"\n", "\n",
"# Ajout des marqueurs pour chaque commune\n", "# Ajouter les communes\n",
"commune_layer = LayerGroup(name='Communes')\n",
"\n",
"for idx, row in df_clean.iterrows():\n", "for idx, row in df_clean.iterrows():\n",
" folium.CircleMarker(\n", " marker = CircleMarker(\n",
" location=[row['Latitude'], row['Longitude']],\n", " location=(row['Latitude'], row['Longitude']),\n",
" radius=3,\n", " radius=3,\n",
" popup=f\"<b>{row['Commune']}</b><br>{row['Territoire de projet']}<br><small>({row['Latitude']:.4f}, {row['Longitude']:.4f})</small>\",\n",
" tooltip=row['Commune'],\n",
" color=color_map[row['Territoire de projet']],\n", " color=color_map[row['Territoire de projet']],\n",
" fill=True,\n", " fill_color=color_map[row['Territoire de projet']],\n",
" fillColor=color_map[row['Territoire de projet']],\n", " fill_opacity=0.7,\n",
" fillOpacity=0.7\n", " weight=1\n",
" ).add_to(m_interactive)\n", " )\n",
" # Popup avec info\n",
" marker.popup = HTML(f\"<b>{row['Commune']}</b><br>{row['Territoire de projet']}\")\n",
" commune_layer.add_layer(marker)\n",
"\n", "\n",
"# Préparer les données des communes pour JavaScript\n", "m.add_layer(commune_layer)\n",
"communes_data = df_clean[['Latitude', 'Longitude', 'Commune', 'Territoire de projet']].to_dict('records')\n",
"\n", "\n",
"# JavaScript pour la prédiction k-NN au clic\n", "print(f\"\u2705 {len(df_clean)} communes ajout\u00e9es \u00e0 la carte\")"
"click_js = f\"\"\"\n",
"<script>\n",
"// Données des communes\n",
"var communesData = {json.dumps(communes_data)};\n",
"\n",
"// Carte des couleurs\n",
"var colorMap = {json.dumps(color_map)};\n",
"\n",
"// Fonction pour calculer la distance haversine\n",
"function haversineDistance(lat1, lon1, lat2, lon2) {{\n",
" const R = 6371; // Rayon de la Terre en km\n",
" const dLat = (lat2 - lat1) * Math.PI / 180;\n",
" const dLon = (lon2 - lon1) * Math.PI / 180;\n",
" const a = Math.sin(dLat/2) * Math.sin(dLat/2) +\n",
" Math.cos(lat1 * Math.PI / 180) * Math.cos(lat2 * Math.PI / 180) *\n",
" Math.sin(dLon/2) * Math.sin(dLon/2);\n",
" const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));\n",
" return R * c;\n",
"}}\n",
"\n",
"// Fonction k-NN\n",
"function predictRegion(lat, lon, k) {{\n",
" // Calculer les distances\n",
" var distances = communesData.map(function(commune) {{\n",
" return {{\n",
" distance: haversineDistance(lat, lon, commune.Latitude, commune.Longitude),\n",
" region: commune['Territoire de projet'],\n",
" commune: commune.Commune\n",
" }};\n",
" }});\n",
" \n",
" // Trier par distance\n",
" distances.sort((a, b) => a.distance - b.distance);\n",
" \n",
" // Prendre les k plus proches\n",
" var kNearest = distances.slice(0, k);\n",
" \n",
" // Vote pondéré par l'inverse de la distance\n",
" var votes = {{}};\n",
" kNearest.forEach(function(neighbor) {{\n",
" var weight = 1 / (neighbor.distance + 0.001); // +0.001 pour éviter division par 0\n",
" if (votes[neighbor.region]) {{\n",
" votes[neighbor.region] += weight;\n",
" }} else {{\n",
" votes[neighbor.region] = weight;\n",
" }}\n",
" }});\n",
" \n",
" // Trouver la région gagnante\n",
" var maxVote = 0;\n",
" var predictedRegion = '';\n",
" for (var region in votes) {{\n",
" if (votes[region] > maxVote) {{\n",
" maxVote = votes[region];\n",
" predictedRegion = region;\n",
" }}\n",
" }}\n",
" \n",
" return {{\n",
" region: predictedRegion,\n",
" neighbors: kNearest\n",
" }};\n",
"}}\n",
"\n",
"// Variable pour stocker le marqueur de prédiction\n",
"var predictionMarker = null;\n",
"var neighborLines = [];\n",
"\n",
"// Attendre que la carte soit chargée\n",
"setTimeout(function() {{\n",
" var maps = document.querySelectorAll('.folium-map');\n",
" if (maps.length > 0) {{\n",
" var mapElement = maps[maps.length - 1];\n",
" var leafletMap = mapElement._leaflet_map;\n",
" \n",
" if (leafletMap) {{\n",
" leafletMap.on('click', function(e) {{\n",
" var lat = e.latlng.lat;\n",
" var lon = e.latlng.lng;\n",
" \n",
" // Prédiction avec k={k}\n",
" var result = predictRegion(lat, lon, {k});\n",
" \n",
" // Supprimer l'ancien marqueur et lignes\n",
" if (predictionMarker) {{\n",
" leafletMap.removeLayer(predictionMarker);\n",
" }}\n",
" neighborLines.forEach(function(line) {{\n",
" leafletMap.removeLayer(line);\n",
" }});\n",
" neighborLines = [];\n",
" \n",
" // Créer le popup avec informations détaillées\n",
" var popupContent = '<div style=\"min-width: 200px;\">' +\n",
" '<h4 style=\"margin: 5px 0; color: ' + colorMap[result.region] + ';\">🎯 ' + result.region + '</h4>' +\n",
" '<p style=\"margin: 5px 0; font-size: 11px;\"><b>Coordonnées cliquées:</b><br>' + \n",
" 'Lat: ' + lat.toFixed(5) + '°<br>Lon: ' + lon.toFixed(5) + '°</p>' +\n",
" '<hr style=\"margin: 5px 0;\">' +\n",
" '<p style=\"margin: 5px 0; font-size: 11px;\"><b>{k} plus proches communes:</b></p>' +\n",
" '<ul style=\"margin: 5px 0; padding-left: 20px; font-size: 10px;\">';\n",
" \n",
" result.neighbors.forEach(function(neighbor, i) {{\n",
" popupContent += '<li><b>' + neighbor.commune + '</b> (' + neighbor.distance.toFixed(2) + ' km)</li>';\n",
" }});\n",
" \n",
" popupContent += '</ul></div>';\n",
" \n",
" // Ajouter le nouveau marqueur\n",
" predictionMarker = L.marker([lat, lon], {{\n",
" icon: L.divIcon({{\n",
" className: 'prediction-marker',\n",
" html: '<div style=\"background-color: ' + colorMap[result.region] + \n",
" '; width: 20px; height: 20px; border-radius: 50%; ' +\n",
" 'border: 3px solid white; box-shadow: 0 0 10px rgba(0,0,0,0.5);\"></div>',\n",
" iconSize: [20, 20]\n",
" }})\n",
" }}).addTo(leafletMap);\n",
" \n",
" predictionMarker.bindPopup(popupContent, {{maxWidth: 300}}).openPopup();\n",
" \n",
" // Ajouter des lignes vers les k plus proches voisins\n",
" result.neighbors.forEach(function(neighbor) {{\n",
" var commune = communesData.find(c => c.Commune === neighbor.commune);\n",
" if (commune) {{\n",
" var line = L.polyline(\n",
" [[lat, lon], [commune.Latitude, commune.Longitude]],\n",
" {{\n",
" color: 'gray',\n",
" weight: 1,\n",
" opacity: 0.5,\n",
" dashArray: '5, 5'\n",
" }}\n",
" ).addTo(leafletMap);\n",
" neighborLines.push(line);\n",
" }}\n",
" }});\n",
" }});\n",
" \n",
" console.log('✅ Gestionnaire de clic k-NN activé');\n",
" console.log('📊 ' + communesData.length + ' communes chargées');\n",
" }}\n",
" }}\n",
"}}, 1000);\n",
"</script>\n",
"\"\"\"\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(click_js))\n",
"\n",
"# Ajout de la légende\n",
"legend_html = '''\n",
"<div style=\"position: fixed; \n",
" top: 10px; right: 10px; width: 250px; max-height: 85vh; overflow-y: auto;\n",
" background-color: white; border:2px solid grey; z-index:9999; border-radius: 5px;\n",
" font-size:12px; padding: 10px; box-shadow: 0 0 15px rgba(0,0,0,0.2);\">\n",
"<p style=\"margin-bottom: 8px; font-weight: bold; font-size: 14px;\">🗺️ Micro-régions de Corse</p>\n",
"'''\n",
"\n",
"for region, color in sorted(color_map.items()):\n",
" legend_html += f'<p style=\"margin: 3px 0;\"><i class=\"fa fa-circle\" style=\"color:{color}\"></i> {region}</p>'\n",
"\n",
"legend_html += f'<hr style=\"margin: 8px 0;\"><p style=\"margin: 3px 0; font-size: 11px; color: #666;\">k = {k} voisins<br>Distance: Haversine</p></div>'\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(legend_html))\n",
"\n",
"# Ajout d'instructions\n",
"instructions_html = '''\n",
"<div style=\"position: fixed; \n",
" bottom: 10px; left: 10px; width: 320px; \n",
" background-color: white; border:2px solid grey; z-index:9999; border-radius: 5px;\n",
" font-size:13px; padding: 12px; box-shadow: 0 0 15px rgba(0,0,0,0.2);\">\n",
"<p style=\"margin: 0 0 8px 0; font-weight: bold;\">🖱️ Mode d'emploi</p>\n",
"<p style=\"margin: 5px 0; line-height: 1.4;\"><b>Cliquez</b> n'importe où sur la carte pour prédire la micro-région.</p>\n",
"<p style=\"margin: 5px 0; line-height: 1.4; font-size: 11px;\">• Un marqueur coloré apparaît au point cliqué<br>\n",
"• Les lignes pointillées montrent les k communes les plus proches<br>\n",
"• Le popup affiche la prédiction détaillée</p>\n",
"</div>\n",
"'''\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(instructions_html))\n",
"\n",
"print(\"\\n✅ Carte interactive créée avec succès!\")\n",
"print(f\"\\n🖱 Cliquez sur n'importe quel point de la carte pour prédire sa micro-région avec k={k} voisins.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Affichage de la carte interactive\n",
"m_interactive"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 7. Sauvegarde de la carte" "## 5. Interface interactive avec widgets"
] ]
}, },
{ {
@ -498,17 +216,33 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Sauvegarder la carte interactive\n", "# Widgets\n",
"m_interactive.save('carte_corse_knn_interactive.html')\n", "k_slider = IntSlider(\n",
"print(\"✅ Carte sauvegardée dans 'carte_corse_knn_interactive.html'\")\n", " value=5,\n",
"print(\"📁 Vous pouvez ouvrir ce fichier dans un navigateur pour une utilisation autonome.\")" " min=1,\n",
] " max=15,\n",
}, " step=1,\n",
{ " description='k voisins:',\n",
"cell_type": "markdown", " continuous_update=False\n",
"metadata": {}, ")\n",
"source": [ "\n",
"## 8. Test de la prédiction (optionnel)" "info_html = HTML(\n",
" value=\"<div style='background:#e3f2fd;padding:10px;border-radius:5px;'>\"\n",
" \"<b>\ud83d\uddb1\ufe0f Cliquez sur la carte</b> pour pr\u00e9dire la micro-r\u00e9gion.<br>\"\n",
" \"Ajustez <b>k</b> pour changer le nombre de voisins.\"\n",
" \"</div>\"\n",
")\n",
"\n",
"result_output = Output()\n",
"\n",
"# Stocker les coordonn\u00e9es courantes\n",
"current_coords = {'lat': None, 'lon': None}\n",
"\n",
"# Layer pour la pr\u00e9diction\n",
"prediction_layer = LayerGroup(name='Pr\u00e9diction')\n",
"m.add_layer(prediction_layer)\n",
"\n",
"print(\"\u2705 Widgets cr\u00e9\u00e9s\")"
] ]
}, },
{ {
@ -517,60 +251,83 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Fonction pour tester la prédiction sur des coordonnées spécifiques\n", "# Fonction de pr\u00e9diction\n",
"def predict_region(lat, lon, k_value=5):\n", "def update_classification(lat, lon, k_value):\n",
" \"\"\"\n", " \"\"\"Met \u00e0 jour la classification pour un point donn\u00e9.\"\"\"\n",
" Prédit la micro-région pour des coordonnées données\n", " # Pr\u00e9diction\n",
" \"\"\"\n",
" # Conversion en radians\n",
" coords_rad = np.radians([[lat, lon]])\n", " coords_rad = np.radians([[lat, lon]])\n",
" \n", " knn.n_neighbors = k_value\n",
" # Prédiction\n", " knn.fit(X_rad, y) # R\u00e9entra\u00eener avec nouveau k\n",
" prediction = knn.predict(coords_rad)[0]\n", " predicted_region = knn.predict(coords_rad)[0]\n",
" \n", " \n",
" # Trouver les k plus proches voisins\n", " # Trouver les k plus proches voisins\n",
" distances, indices = knn.kneighbors(coords_rad)\n", " distances, indices = knn.kneighbors(coords_rad)\n",
" distances_km = distances[0] * 6371 # Conversion en km\n",
" \n", " \n",
" # Convertir les distances de radians en km\n", " # Nettoyer la couche de pr\u00e9diction\n",
" distances_km = distances[0] * 6371 # Rayon de la Terre en km\n", " prediction_layer.clear_layers()\n",
" \n", " \n",
" print(f\"\\n📍 Coordonnées: {lat:.5f}°N, {lon:.5f}°E\")\n", " # Cr\u00e9er une ic\u00f4ne personnalis\u00e9e avec AwesomeIcon\n",
" print(f\"🎯 Micro-région prédite: {prediction}\")\n", " custom_icon = AwesomeIcon(\n",
" print(f\"\\n{k_value} plus proches communes:\")\n", " name='star',\n",
" marker_color=color_map[predicted_region],\n",
" icon_color='white',\n",
" spin=False\n",
" )\n",
" \n",
" # Ajouter le marqueur de pr\u00e9diction\n",
" prediction_marker = Marker(\n",
" location=(lat, lon),\n",
" draggable=False,\n",
" icon=custom_icon\n",
" )\n",
" \n",
" # Popup d\u00e9taill\u00e9\n",
" popup_html = f\"\"\"<div style='min-width:220px;'>\n",
" <h4 style='margin:5px 0;color:{color_map[predicted_region]};'>\ud83c\udfaf {predicted_region}</h4>\n",
" <p style='margin:5px 0;font-size:11px;'>\n",
" <b>Coordonn\u00e9es:</b><br>\n",
" Lat: {lat:.5f}\u00b0<br>\n",
" Lon: {lon:.5f}\u00b0\n",
" </p>\n",
" <hr style='margin:5px 0;'>\n",
" <p style='margin:5px 0;font-size:11px;'><b>{k_value} plus proches communes:</b></p>\n",
" <ul style='margin:5px 0;padding-left:20px;font-size:10px;'>\"\"\"\n",
" \n", " \n",
" for i, idx in enumerate(indices[0]):\n", " for i, idx in enumerate(indices[0]):\n",
" commune_info = df_clean.iloc[idx]\n", " commune_info = df_clean.iloc[idx]\n",
" print(f\" {i+1}. {commune_info['Commune']:30s} ({commune_info['Territoire de projet']:30s}) - {distances_km[i]:6.2f} km\")\n", " popup_html += f\"<li><b>{commune_info['Commune']}</b> ({distances_km[i]:.2f} km)</li>\"\n",
" \n", " \n",
" return prediction\n", " popup_html += \"</ul></div>\"\n",
" prediction_marker.popup = HTML(popup_html)\n",
" prediction_layer.add_layer(prediction_marker)\n",
" \n", " \n",
"# Exemples de test\n", " # Ajouter les lignes vers les k plus proches voisins\n",
"print(\"=\" * 100)\n", " for i, idx in enumerate(indices[0]):\n",
"print(\"TESTS DE PRÉDICTION k-NN\")\n", " commune_info = df_clean.iloc[idx]\n",
"print(\"=\" * 100)\n", " line = Polyline(\n",
" locations=[\n",
" (lat, lon),\n",
" (commune_info['Latitude'], commune_info['Longitude'])\n",
" ],\n",
" color='gray',\n",
" weight=2,\n",
" opacity=0.6,\n",
" dash_array='8, 8'\n",
" )\n",
" prediction_layer.add_layer(line)\n",
" \n", " \n",
"# Test 1: Centre approximatif de la Corse (vers Corte)\n", " # Afficher le r\u00e9sultat\n",
"print(\"\\n🔍 Test 1: Centre de la Corse\")\n", " with result_output:\n",
"predict_region(42.15, 9.15, k)\n", " result_output.clear_output()\n",
" print(f\"\\n\ud83c\udfaf Pr\u00e9diction: {predicted_region}\")\n",
" print(f\"\ud83d\udccd Coordonn\u00e9es: ({lat:.5f}, {lon:.5f})\")\n",
" print(f\"\\n{k_value} plus proches communes:\")\n",
" for i, idx in enumerate(indices[0]):\n",
" commune_info = df_clean.iloc[idx]\n",
" print(f\" {i+1}. {commune_info['Commune']:30s} - {distances_km[i]:6.2f} km\")\n",
"\n", "\n",
"# Test 2: Nord de la Corse (Balagne/Bastia)\n", "print(\"\u2705 Fonction de pr\u00e9diction d\u00e9finie\")\n"
"print(\"\\n🔍 Test 2: Nord de la Corse\")\n",
"predict_region(42.55, 8.85, k)\n",
"\n",
"# Test 3: Sud de la Corse (vers Porto-Vecchio)\n",
"print(\"\\n🔍 Test 3: Sud de la Corse\")\n",
"predict_region(41.65, 9.15, k)\n",
"\n",
"# Test 4: Ouest (vers Ajaccio)\n",
"print(\"\\n🔍 Test 4: Ouest de la Corse (Ajaccio)\")\n",
"predict_region(41.93, 8.74, k)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Analyse de performance (optionnel)"
] ]
}, },
{ {
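Review note on the popup in `update_classification`: commune names are interpolated into HTML unescaped. That is probably harmless for this dataset, but a sketch of an escaped list builder using the standard library's `html.escape` is shown below. `build_neighbor_list` is a hypothetical helper that takes `(name, distance_km)` pairs; the inline style is copied from the diff.

```python
from html import escape

def build_neighbor_list(neighbors):
    """Render (name, distance_km) pairs as an HTML list with escaped names."""
    items = "".join(
        f"<li><b>{escape(name)}</b> ({dist_km:.2f} km)</li>"
        for name, dist_km in neighbors
    )
    return f"<ul style='margin:5px 0;padding-left:20px;font-size:10px;'>{items}</ul>"
```

Building the list in one helper would also let the popup and the text output share the same neighbour data instead of looping over `indices[0]` three times.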
@ -579,37 +336,33 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Évaluation de la cohérence du modèle (cross-validation)\n", "# Gestionnaires d'\u00e9v\u00e9nements\n",
"from sklearn.model_selection import cross_val_score\n", "def handle_click(**kwargs):\n",
" \"\"\"Gestionnaire de clic sur la carte.\"\"\"\n",
" if kwargs.get('type') == 'click':\n",
" coords = kwargs.get('coordinates')\n",
" lat, lon = coords\n",
" current_coords['lat'] = lat\n",
" current_coords['lon'] = lon\n",
" update_classification(lat, lon, k_slider.value)\n",
"\n", "\n",
"# Test avec différentes valeurs de k\n", "def on_k_change(change):\n",
"k_values = [3, 5, 7, 9, 11, 15]\n", " \"\"\"Gestionnaire de changement de k.\"\"\"\n",
"scores = []\n", " if current_coords['lat'] is not None:\n",
" update_classification(current_coords['lat'], current_coords['lon'], change['new'])\n",
"\n", "\n",
"print(\"📊 Évaluation de la précision pour différentes valeurs de k:\\n\")\n", "# Connecter les \u00e9v\u00e9nements\n",
"print(f\"{'k':<5} {'Précision moyenne':<20} {'Écart-type':<15}\")\n", "m.on_interaction(handle_click)\n",
"print(\"-\" * 50)\n", "k_slider.observe(on_k_change, names='value')\n",
"\n", "\n",
"for k_val in k_values:\n", "print(\"\u2705 Gestionnaires d'\u00e9v\u00e9nements connect\u00e9s\")"
" knn_temp = KNeighborsClassifier(n_neighbors=k_val, weights='distance', metric='haversine')\n",
" cv_scores = cross_val_score(knn_temp, X_rad, y, cv=5)\n",
" mean_score = cv_scores.mean()\n",
" std_score = cv_scores.std()\n",
" scores.append(mean_score)\n",
" print(f\"{k_val:<5} {mean_score:.4f} ({mean_score*100:5.2f}%) ± {std_score:.4f}\")\n",
"\n",
"best_k = k_values[scores.index(max(scores))]\n",
"best_score = max(scores)\n",
"print(\"\\n\" + \"=\" * 50)\n",
"print(f\"✨ Meilleure valeur de k: {best_k} (précision: {best_score:.4f} / {best_score*100:.2f}%)\")\n",
"print(\"=\" * 50)"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"## 10. Statistiques par micro-région" "## 6. Affichage de l'interface compl\u00e8te"
] ]
}, },
{ {
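Review note on `handle_click`: it unpacks `kwargs.get('coordinates')` without checking that the key is present. A defensive sketch follows, assuming the event arrives as keyword arguments with `type` and `coordinates` keys, as the handler in this diff already assumes. `make_click_handler` and `on_point` are hypothetical names, and `state` plays the role of `current_coords`.

```python
def make_click_handler(on_point, state):
    """Build a handler that forwards valid clicks to on_point(lat, lon)."""
    def handle(**kwargs):
        if kwargs.get("type") != "click":
            return  # ignore mousemove, mouseover, and other interactions
        coords = kwargs.get("coordinates")
        if not coords or len(coords) != 2:
            return  # nothing usable to classify
        lat, lon = coords
        state["lat"], state["lon"] = lat, lon  # remembered for later k changes
        on_point(lat, lon)
    return handle
```

In the notebook this could be wired as `m.on_interaction(make_click_handler(lambda lat, lon: update_classification(lat, lon, k_slider.value), current_coords))`, which also makes the handler testable without a live map.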
@ -618,20 +371,58 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"# Statistiques descriptives par micro-région\n", "# L\u00e9gende HTML\n",
"print(\"📈 STATISTIQUES PAR MICRO-RÉGION\\n\")\n", "legend_html = \"<div style='background:white;padding:10px;border-radius:5px;max-height:300px;overflow-y:auto;'>\"\n",
"print(f\"{'Micro-région':<35} {'Nb communes':<15} {'% du total'}\")\n", "legend_html += \"<h4 style='margin-top:0;'>\ud83d\uddfa\ufe0f Micro-r\u00e9gions</h4>\"\n",
"print(\"=\" * 65)\n", "for region, color in sorted(color_map.items()):\n",
" legend_html += f\"<div style='margin:3px 0;'><span style='display:inline-block;width:12px;height:12px;background:{color};border-radius:50%;margin-right:5px;'></span>{region}</div>\"\n",
"legend_html += \"</div>\"\n",
"\n", "\n",
"total_communes = len(df_clean)\n", "legend_widget = HTML(legend_html)\n",
"stats = df_clean['Territoire de projet'].value_counts().sort_index()\n", "legend_control = WidgetControl(widget=legend_widget, position='topright')\n",
"m.add_control(legend_control)\n",
"\n", "\n",
"for region, count in stats.items():\n", "# Afficher l'interface compl\u00e8te\n",
" pct = (count / total_communes) * 100\n", "display(VBox([\n",
" print(f\"{region:<35} {count:<15} {pct:>5.1f}%\")\n", " info_html,\n",
" HBox([Label(''), k_slider]),\n",
" m,\n",
" result_output\n",
"]))\n",
"\n", "\n",
"print(\"=\" * 65)\n", "print(\"\\n\" + \"=\"*60)\n",
"print(f\"{'TOTAL':<35} {total_communes:<15} 100.0%\")" "print(\"\u2705 CARTE INTERACTIVE PR\u00caTE!\")\n",
"print(\"=\"*60)\n",
"print(\"\\n\ud83d\uddb1\ufe0f Cliquez n'importe o\u00f9 sur la carte pour pr\u00e9dire la micro-r\u00e9gion.\")\n",
"print(\"\ud83c\udf9a\ufe0f Utilisez le slider pour changer k (nombre de voisins).\")\n",
"print(\"\\n\u2b50 Cette version utilise ipyleaflet avec interactions Python natives.\")\n",
"print(\" Pas de JavaScript inject\u00e9 = Fonctionne parfaitement dans Jupyter!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Fonction de test (optionnel)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test manuel\n",
"def test_prediction(lat, lon, k_val=5):\n",
" \"\"\"Tester une pr\u00e9diction avec des coordonn\u00e9es sp\u00e9cifiques.\"\"\"\n",
" print(f\"\\nTest de pr\u00e9diction pour ({lat}, {lon}) avec k={k_val}\")\n",
" print(\"=\"*60)\n",
" update_classification(lat, lon, k_val)\n",
"\n",
"# Exemples\n",
"# test_prediction(42.15, 9.15, 5) # Centre Corse\n",
"# test_prediction(42.55, 8.85, 5) # Nord\n",
"# test_prediction(41.65, 9.15, 5) # Sud"
] ]
}, },
{ {
@ -640,27 +431,21 @@
"source": [ "source": [
"## Conclusion\n", "## Conclusion\n",
"\n", "\n",
"✅ **Notebook k-NN Corse - Résumé**\n", "\u2705 **Carte interactive fonctionnelle avec ipyleaflet**\n",
"\n", "\n",
"Ce notebook implémente un classificateur k-NN pour les micro-régions de Corse avec:\n", "**Avantages par rapport \u00e0 Folium :**\n",
"1. ✅ Chargement des données depuis 2 fichiers CSV (coordonnées + territoires)\n", "- \u2705 Interactions Python natives (pas de JavaScript inject\u00e9)\n",
"2. ✅ Extraction automatique des coordonnées GPS depuis la colonne Point_Geo\n", "- \u2705 Fonctionne parfaitement dans Jupyter\n",
"3. ✅ Fusion intelligente des deux sources de données\n", "- \u2705 Pas de probl\u00e8me d'iframe ou de s\u00e9curit\u00e9\n",
"4. ✅ Entraînement d'un modèle k-NN avec distance haversine\n", "- \u2705 Slider interactif pour changer k en temps r\u00e9el\n",
"5. ✅ Carte interactive Folium avec prédiction au clic\n", "- \u2705 R\u00e9sultats affich\u00e9s sous la carte\n",
"6. ✅ Visualisation des k plus proches voisins\n",
"7. ✅ Tests de performance et validation\n",
"8. ✅ Export HTML pour utilisation autonome\n",
"\n", "\n",
"**🖱️ Utilisation:**\n", "**Utilisation :**\n",
"- Cliquez n'importe où sur la carte\n", "1. Cliquez sur la carte \u2192 Pr\u00e9diction s'affiche avec marqueur et lignes\n",
"- Un marqueur coloré apparaît avec la micro-région prédite\n", "2. Changez k avec le slider \u2192 Pr\u00e9diction se met \u00e0 jour automatiquement\n",
"- Des lignes pointillées montrent les k communes les plus proches\n", "3. Les r\u00e9sultats d\u00e9taill\u00e9s s'affichent sous la carte\n",
"- Un popup détaille la prédiction et les voisins\n",
"\n", "\n",
"**📁 Fichier exporté:** `carte_corse_knn_interactive.html`\n", "**Note :** Cette version ne g\u00e9n\u00e8re pas de fichier HTML standalone car ipyleaflet n\u00e9cessite un serveur Jupyter pour les interactions Python. Pour partager, utilisez Jupyter nbviewer ou Binder."
"\n",
"La carte HTML peut être ouverte dans n'importe quel navigateur pour une utilisation autonome!"
] ]
} }
], ],