Auto-update: 2025-10-23 19:23:57

divingeek 2025-10-23 19:23:57 +02:00
parent d2f061c9f9
commit a7b5fc79b4
2 changed files with 2915 additions and 6267 deletions

File diff suppressed because one or more lines are too long

@ -4,13 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Classification of Corsican Micro-regions with k-NN\n",
"# k-NN Classification for Corsica - ipyleaflet version\n",
"\n",
"This notebook implements a classifier for Corsican micro-regions based on the k-nearest-neighbors (k-NN) algorithm. Click on the map to identify the corresponding micro-region.\n",
"This version uses **ipyleaflet** with native Python interactions (no injected JavaScript).\n",
"\n",
"**Required files:**\n",
"- `communes-de-corse-en-corse-et-francais.csv`: list of communes with GPS coordinates\n",
"- `communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv`: project territory for each commune"
"**Advantage:** Runs inside Jupyter without iframe or JavaScript issues."
]
},
{
@ -19,8 +17,8 @@
"metadata": {},
"outputs": [],
"source": [
"# Installation des bibliothèques nécessaires\n",
"!pip install folium pandas numpy scikit-learn --quiet"
"# Installation\n",
"!pip install ipyleaflet ipywidgets pandas numpy scikit-learn --quiet"
]
},
{
@ -31,18 +29,17 @@
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"import folium\n",
"from ipyleaflet import Map, CircleMarker, Marker, Polyline, LayerGroup, WidgetControl, AwesomeIcon\n",
"from ipywidgets import HTML, VBox, HBox, Label, IntSlider, Output\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from IPython.display import display, HTML\n",
"import json\n",
"import re"
"from IPython.display import display\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Loading and preparing the data"
"## 1. Loading the data"
]
},
{
@ -51,39 +48,13 @@
"metadata": {},
"outputs": [],
"source": [
"# Chargement du fichier avec les coordonnées GPS\n",
"df_coords = pd.read_csv('communes-de-corse-en-corse-et-francais.csv', \n",
"# Chargement\n",
"df_coords = pd.read_csv('communes-de-corse-en-corse-et-francais.csv', sep=';', encoding='utf-8')\n",
"df_territoires = pd.read_csv('communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv', \n",
" sep=';', encoding='utf-8')\n",
"\n",
"print(f\"Fichier coordonnées: {len(df_coords)} communes\")\n",
"print(\"\\nPremières lignes:\")\n",
"display(df_coords.head())\n",
"print(\"\\nColonnes disponibles:\")\n",
"print(df_coords.columns.tolist())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Chargement du fichier avec les territoires de projet\n",
"df_territoires = pd.read_csv('communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv',\n",
" sep=';', encoding='utf-8')\n",
"\n",
"print(f\"Fichier territoires: {len(df_territoires)} communes\")\n",
"print(\"\\nPremières lignes:\")\n",
"display(df_territoires.head())\n",
"print(\"\\nTerritoires de projet (micro-régions):\")\n",
"print(sorted(df_territoires['Territoire de projet'].unique()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Extracting the GPS coordinates"
"print(f\"\u2705 {len(df_coords)} communes avec coordonn\u00e9es\")\n",
"print(f\"\u2705 {len(df_territoires)} communes avec territoires\")"
]
},
{
@ -92,43 +63,23 @@
"metadata": {},
"outputs": [],
"source": [
"# Extraction coordonn\u00e9es\n",
"def extract_coordinates(point_geo_str):\n",
" \"\"\"\n",
" Extrait latitude et longitude de la colonne Point_Geo\n",
" Format attendu: \"41.984099158, 8.798384636\"\n",
" \"\"\"\n",
" if pd.isna(point_geo_str):\n",
" return None, None\n",
" \n",
" try:\n",
" # Supprimer les espaces et split par virgule\n",
" coords = str(point_geo_str).strip().split(',')\n",
" if len(coords) == 2:\n",
" lat = float(coords[0].strip())\n",
" lon = float(coords[1].strip())\n",
" return lat, lon\n",
" return float(coords[0].strip()), float(coords[1].strip())\n",
" except:\n",
" pass\n",
" \n",
" return None, None\n",
"\n",
"# Extraction des coordonnées\n",
"df_coords[['Latitude', 'Longitude']] = df_coords['Point_Geo'].apply(\n",
" lambda x: pd.Series(extract_coordinates(x))\n",
")\n",
"\n",
"# Vérification\n",
"print(\"Extraction des coordonnées:\")\n",
"print(f\"Communes avec coordonnées: {df_coords['Latitude'].notna().sum()}/{len(df_coords)}\")\n",
"print(\"\\nExemple:\")\n",
"display(df_coords[['Nom français', 'Latitude', 'Longitude']].head())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Merging the two files"
"print(f\"\u2705 {df_coords['Latitude'].notna().sum()} coordonn\u00e9es extraites\")"
]
},
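Review note on the extraction cell: the one-line `return` drops the length check, so the bare `except:` now appears to be the only guard against malformed values, and a bare `except:` also swallows unrelated errors. A minimal sketch of a stricter parser follows, assuming the `Point_Geo` format quoted in the docstring ("lat, lon"). The name `parse_point_geo` is hypothetical and not part of the notebook.

```python
import math
from typing import Optional, Tuple


def parse_point_geo(value: object) -> Tuple[Optional[float], Optional[float]]:
    """Parse a "lat, lon" string; return (None, None) if missing or malformed."""
    # Treat None and float NaN (how pandas represents empty cells) as missing.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None, None
    parts = str(value).split(",")
    if len(parts) != 2:
        return None, None
    try:
        return float(parts[0].strip()), float(parts[1].strip())
    except ValueError:
        # Non-numeric text in either field.
        return None, None
```

In the notebook this could stand in for `extract_coordinates` inside the existing `apply` call. Catching only `ValueError` means unexpected failures surface instead of becoming silent `None` values.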
{
@ -137,54 +88,30 @@
"metadata": {},
"outputs": [],
"source": [
"# Normalisation des noms de communes pour la jointure\n",
"def normalize_commune_name(name):\n",
" \"\"\"\n",
" Normalise le nom d'une commune pour faciliter la jointure\n",
" \"\"\"\n",
" if pd.isna(name):\n",
" return ''\n",
" # Convertir en majuscules et supprimer les espaces multiples\n",
" return str(name).upper().strip()\n",
"# Fusion\n",
"def normalize(name):\n",
" return str(name).upper().strip() if not pd.isna(name) else ''\n",
"\n",
"df_coords['Commune_norm'] = df_coords['Nom français'].apply(normalize_commune_name)\n",
"df_territoires['Commune_norm'] = df_territoires['Commune'].apply(normalize_commune_name)\n",
"df_coords['Commune_norm'] = df_coords['Nom fran\u00e7ais'].apply(normalize)\n",
"df_territoires['Commune_norm'] = df_territoires['Commune'].apply(normalize)\n",
"\n",
"# Fusion des deux dataframes\n",
"df = pd.merge(\n",
" df_coords,\n",
" df_territoires[['Commune_norm', 'Territoire de projet']],\n",
" on='Commune_norm',\n",
" how='inner'\n",
")\n",
"\n",
"# Renommer pour cohérence\n",
"df['Commune'] = df['Nom français']\n",
"\n",
"print(f\"Fusion réussie: {len(df)} communes avec coordonnées ET territoire de projet\")\n",
"print(\"\\nAperçu des données fusionnées:\")\n",
"display(df[['Commune', 'Latitude', 'Longitude', 'Territoire de projet']].head(10))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Nettoyage: supprimer les lignes sans coordonnées\n",
"df = pd.merge(df_coords, df_territoires[['Commune_norm', 'Territoire de projet']], \n",
" on='Commune_norm', how='inner')\n",
"df['Commune'] = df['Nom fran\u00e7ais']\n",
"df_clean = df.dropna(subset=['Latitude', 'Longitude', 'Territoire de projet']).copy()\n",
"\n",
"print(f\"\\n✅ Données finales: {len(df_clean)} communes prêtes pour la classification\")\n",
"print(f\"\\nRépartition par micro-région:\")\n",
"print(df_clean['Territoire de projet'].value_counts().sort_index())"
"print(f\"\u2705 {len(df_clean)} communes fusionn\u00e9es\")\n",
"print(f\"\\nMicro-r\u00e9gions: {len(df_clean['Territoire de projet'].unique())}\")\n",
"for region in sorted(df_clean['Territoire de projet'].unique()):\n",
" count = (df_clean['Territoire de projet'] == region).sum()\n",
" print(f\" \u2022 {region}: {count} communes\")"
]
},
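Review note on the merge cell: `how='inner'` silently drops every commune whose normalized name differs between the two files, for instance through accents or hyphens. Below is a small sketch for listing those names before merging, assuming plain lists of names as input. `unmatched_names` is a hypothetical helper, and the sample names in the usage line are only illustrative.

```python
from typing import Iterable, List, Tuple


def normalize(name: object) -> str:
    """Upper-case and trim a commune name; None becomes an empty string."""
    return "" if name is None else str(name).upper().strip()


def unmatched_names(
    left: Iterable[object], right: Iterable[object]
) -> Tuple[List[str], List[str]]:
    """Return the normalized names found on only one side of the join."""
    left_keys = {normalize(n) for n in left}
    right_keys = {normalize(n) for n in right}
    return sorted(left_keys - right_keys), sorted(right_keys - left_keys)


# Illustrative call with made-up inputs.
only_left, only_right = unmatched_names(
    ["Ajaccio", " Bastia ", "Corte"], ["AJACCIO", "BASTIA", "Calvi"]
)
```

With the notebook's data, the two inputs would be `df_coords['Nom français']` and `df_territoires['Commune']`. Two empty result lists would mean the inner join loses nothing.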
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Training the k-NN model"
"## 2. k-NN training"
]
},
{
@ -193,33 +120,24 @@
"metadata": {},
"outputs": [],
"source": [
"# Préparation des données pour k-NN\n",
"# Mod\u00e8le k-NN\n",
"X = df_clean[['Latitude', 'Longitude']].values\n",
"y = df_clean['Territoire de projet'].values\n",
"\n",
"# Création du modèle k-NN avec k=5 (ajustable)\n",
"k = 5\n",
"knn = KNeighborsClassifier(n_neighbors=k, weights='distance', metric='haversine')\n",
"\n",
"# Conversion des coordonnées en radians pour la distance haversine\n",
"knn = KNeighborsClassifier(n_neighbors=5, weights='distance', metric='haversine')\n",
"X_rad = np.radians(X)\n",
"\n",
"# Entraînement du modèle\n",
"knn.fit(X_rad, y)\n",
"\n",
"print(f\"✅ Modèle k-NN entraîné avec k={k} voisins\")\n",
"print(f\"📊 Nombre de micro-régions: {len(np.unique(y))}\")\n",
"print(f\"\\n🗺 Micro-régions identifiées:\")\n",
"for i, region in enumerate(sorted(np.unique(y)), 1):\n",
" count = (y == region).sum()\n",
" print(f\" {i:2d}. {region} ({count} communes)\")"
"print(f\"\u2705 Mod\u00e8le k-NN entra\u00een\u00e9\")\n",
"print(f\"\u2705 {len(df_clean)} communes\")\n",
"print(f\"\u2705 {len(np.unique(y))} micro-r\u00e9gions\")"
]
},
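Review note on the model cell: scikit-learn's `metric='haversine'` expects (latitude, longitude) pairs in radians and returns distances in radians. That is why the notebook calls `np.radians` before `fit` and multiplies by 6371 later. The sketch below illustrates that distance and an inverse-distance vote in plain Python. It is not scikit-learn's implementation, and the function names and sample points are made up for the example.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # same mean Earth radius the notebook multiplies by


def haversine_rad(lat1, lon1, lat2, lon2):
    """Great-circle distance in radians between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * math.asin(math.sqrt(a))


def weighted_knn(point, samples, k):
    """samples: list of (lat, lon, label). Returns (label, [(label, km), ...])."""
    ranked = sorted(
        (haversine_rad(point[0], point[1], lat, lon), label)
        for lat, lon, label in samples
    )[:k]
    votes = defaultdict(float)
    exact = [label for d, label in ranked if d == 0.0]
    if exact:
        # A zero distance would mean infinite weight, so exact hits vote alone.
        for label in exact:
            votes[label] += 1.0
    else:
        for d, label in ranked:
            votes[label] += 1.0 / d
    winner = max(votes, key=votes.get)
    return winner, [(label, d * EARTH_RADIUS_KM) for d, label in ranked]


# Made-up points: two labelled "A" in the north, two labelled "B" in the south.
samples = [(42.0, 9.0, "A"), (42.1, 9.0, "A"), (41.0, 9.0, "B"), (41.1, 9.0, "B")]
winner, neighbors = weighted_knn((41.95, 9.0), samples, k=3)
```

Weighting by inverse distance lets one very close commune outweigh several distant ones, which matters near the border between two micro-regions.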
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Creating the interactive map with Folium"
"## 3. Color configuration"
]
},
{
@ -228,17 +146,24 @@
"metadata": {},
"outputs": [],
"source": [
"# Couleurs pour chaque micro-région\n",
"# Couleurs par micro-r\u00e9gion\n",
"microregions = sorted(df_clean['Territoire de projet'].unique())\n",
"colors = ['red', 'blue', 'green', 'purple', 'orange', 'darkred', \n",
" 'lightred', 'beige', 'darkblue', 'darkgreen', 'cadetblue', \n",
" 'darkpurple', 'pink', 'lightblue', 'lightgreen', 'gray', 'black', 'lightgray']\n",
" 'lightcoral', 'beige', 'darkblue', 'darkgreen', 'cadetblue', \n",
" 'darkviolet', 'pink', 'lightblue', 'lightgreen', 'gray']\n",
"\n",
"color_map = {region: colors[i % len(colors)] for i, region in enumerate(microregions)}\n",
"\n",
"print(\"🎨 Carte des couleurs par micro-région:\")\n",
"print(\"\u2705 Couleurs configur\u00e9es:\")\n",
"for region, color in sorted(color_map.items()):\n",
" print(f\" • {region}: {color}\")"
" print(f\" \u2022 {region}: {color}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Creating the interactive map"
]
},
{
@ -247,249 +172,42 @@
"metadata": {},
"outputs": [],
"source": [
"# Coordonnées du centre de la Corse\n",
"# Carte ipyleaflet\n",
"center_lat = df_clean['Latitude'].mean()\n",
"center_lon = df_clean['Longitude'].mean()\n",
"\n",
"print(f\"Centre de la carte: {center_lat:.4f}°N, {center_lon:.4f}°E\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Interactive map with click-to-predict"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Création de la carte interactive\n",
"m_interactive = folium.Map(\n",
" location=[center_lat, center_lon],\n",
" zoom_start=9,\n",
" tiles='OpenStreetMap'\n",
"m = Map(\n",
" center=(center_lat, center_lon),\n",
" zoom=9,\n",
" scroll_wheel_zoom=True\n",
")\n",
"\n",
"# Ajout des marqueurs pour chaque commune\n",
"# Ajouter les communes\n",
"commune_layer = LayerGroup(name='Communes')\n",
"\n",
"for idx, row in df_clean.iterrows():\n",
" folium.CircleMarker(\n",
" location=[row['Latitude'], row['Longitude']],\n",
" marker = CircleMarker(\n",
" location=(row['Latitude'], row['Longitude']),\n",
" radius=3,\n",
" popup=f\"<b>{row['Commune']}</b><br>{row['Territoire de projet']}<br><small>({row['Latitude']:.4f}, {row['Longitude']:.4f})</small>\",\n",
" tooltip=row['Commune'],\n",
" color=color_map[row['Territoire de projet']],\n",
" fill=True,\n",
" fillColor=color_map[row['Territoire de projet']],\n",
" fillOpacity=0.7\n",
" ).add_to(m_interactive)\n",
" fill_color=color_map[row['Territoire de projet']],\n",
" fill_opacity=0.7,\n",
" weight=1\n",
" )\n",
" # Popup avec info\n",
" marker.popup = HTML(f\"<b>{row['Commune']}</b><br>{row['Territoire de projet']}\")\n",
" commune_layer.add_layer(marker)\n",
"\n",
"# Préparer les données des communes pour JavaScript\n",
"communes_data = df_clean[['Latitude', 'Longitude', 'Commune', 'Territoire de projet']].to_dict('records')\n",
"m.add_layer(commune_layer)\n",
"\n",
"# JavaScript pour la prédiction k-NN au clic\n",
"click_js = f\"\"\"\n",
"<script>\n",
"// Données des communes\n",
"var communesData = {json.dumps(communes_data)};\n",
"\n",
"// Carte des couleurs\n",
"var colorMap = {json.dumps(color_map)};\n",
"\n",
"// Fonction pour calculer la distance haversine\n",
"function haversineDistance(lat1, lon1, lat2, lon2) {{\n",
" const R = 6371; // Rayon de la Terre en km\n",
" const dLat = (lat2 - lat1) * Math.PI / 180;\n",
" const dLon = (lon2 - lon1) * Math.PI / 180;\n",
" const a = Math.sin(dLat/2) * Math.sin(dLat/2) +\n",
" Math.cos(lat1 * Math.PI / 180) * Math.cos(lat2 * Math.PI / 180) *\n",
" Math.sin(dLon/2) * Math.sin(dLon/2);\n",
" const c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));\n",
" return R * c;\n",
"}}\n",
"\n",
"// Fonction k-NN\n",
"function predictRegion(lat, lon, k) {{\n",
" // Calculer les distances\n",
" var distances = communesData.map(function(commune) {{\n",
" return {{\n",
" distance: haversineDistance(lat, lon, commune.Latitude, commune.Longitude),\n",
" region: commune['Territoire de projet'],\n",
" commune: commune.Commune\n",
" }};\n",
" }});\n",
" \n",
" // Trier par distance\n",
" distances.sort((a, b) => a.distance - b.distance);\n",
" \n",
" // Prendre les k plus proches\n",
" var kNearest = distances.slice(0, k);\n",
" \n",
" // Vote pondéré par l'inverse de la distance\n",
" var votes = {{}};\n",
" kNearest.forEach(function(neighbor) {{\n",
" var weight = 1 / (neighbor.distance + 0.001); // +0.001 pour éviter division par 0\n",
" if (votes[neighbor.region]) {{\n",
" votes[neighbor.region] += weight;\n",
" }} else {{\n",
" votes[neighbor.region] = weight;\n",
" }}\n",
" }});\n",
" \n",
" // Trouver la région gagnante\n",
" var maxVote = 0;\n",
" var predictedRegion = '';\n",
" for (var region in votes) {{\n",
" if (votes[region] > maxVote) {{\n",
" maxVote = votes[region];\n",
" predictedRegion = region;\n",
" }}\n",
" }}\n",
" \n",
" return {{\n",
" region: predictedRegion,\n",
" neighbors: kNearest\n",
" }};\n",
"}}\n",
"\n",
"// Variable pour stocker le marqueur de prédiction\n",
"var predictionMarker = null;\n",
"var neighborLines = [];\n",
"\n",
"// Attendre que la carte soit chargée\n",
"setTimeout(function() {{\n",
" var maps = document.querySelectorAll('.folium-map');\n",
" if (maps.length > 0) {{\n",
" var mapElement = maps[maps.length - 1];\n",
" var leafletMap = mapElement._leaflet_map;\n",
" \n",
" if (leafletMap) {{\n",
" leafletMap.on('click', function(e) {{\n",
" var lat = e.latlng.lat;\n",
" var lon = e.latlng.lng;\n",
" \n",
" // Prédiction avec k={k}\n",
" var result = predictRegion(lat, lon, {k});\n",
" \n",
" // Supprimer l'ancien marqueur et lignes\n",
" if (predictionMarker) {{\n",
" leafletMap.removeLayer(predictionMarker);\n",
" }}\n",
" neighborLines.forEach(function(line) {{\n",
" leafletMap.removeLayer(line);\n",
" }});\n",
" neighborLines = [];\n",
" \n",
" // Créer le popup avec informations détaillées\n",
" var popupContent = '<div style=\"min-width: 200px;\">' +\n",
" '<h4 style=\"margin: 5px 0; color: ' + colorMap[result.region] + ';\">🎯 ' + result.region + '</h4>' +\n",
" '<p style=\"margin: 5px 0; font-size: 11px;\"><b>Coordonnées cliquées:</b><br>' + \n",
" 'Lat: ' + lat.toFixed(5) + '°<br>Lon: ' + lon.toFixed(5) + '°</p>' +\n",
" '<hr style=\"margin: 5px 0;\">' +\n",
" '<p style=\"margin: 5px 0; font-size: 11px;\"><b>{k} plus proches communes:</b></p>' +\n",
" '<ul style=\"margin: 5px 0; padding-left: 20px; font-size: 10px;\">';\n",
" \n",
" result.neighbors.forEach(function(neighbor, i) {{\n",
" popupContent += '<li><b>' + neighbor.commune + '</b> (' + neighbor.distance.toFixed(2) + ' km)</li>';\n",
" }});\n",
" \n",
" popupContent += '</ul></div>';\n",
" \n",
" // Ajouter le nouveau marqueur\n",
" predictionMarker = L.marker([lat, lon], {{\n",
" icon: L.divIcon({{\n",
" className: 'prediction-marker',\n",
" html: '<div style=\"background-color: ' + colorMap[result.region] + \n",
" '; width: 20px; height: 20px; border-radius: 50%; ' +\n",
" 'border: 3px solid white; box-shadow: 0 0 10px rgba(0,0,0,0.5);\"></div>',\n",
" iconSize: [20, 20]\n",
" }})\n",
" }}).addTo(leafletMap);\n",
" \n",
" predictionMarker.bindPopup(popupContent, {{maxWidth: 300}}).openPopup();\n",
" \n",
" // Ajouter des lignes vers les k plus proches voisins\n",
" result.neighbors.forEach(function(neighbor) {{\n",
" var commune = communesData.find(c => c.Commune === neighbor.commune);\n",
" if (commune) {{\n",
" var line = L.polyline(\n",
" [[lat, lon], [commune.Latitude, commune.Longitude]],\n",
" {{\n",
" color: 'gray',\n",
" weight: 1,\n",
" opacity: 0.5,\n",
" dashArray: '5, 5'\n",
" }}\n",
" ).addTo(leafletMap);\n",
" neighborLines.push(line);\n",
" }}\n",
" }});\n",
" }});\n",
" \n",
" console.log('✅ Gestionnaire de clic k-NN activé');\n",
" console.log('📊 ' + communesData.length + ' communes chargées');\n",
" }}\n",
" }}\n",
"}}, 1000);\n",
"</script>\n",
"\"\"\"\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(click_js))\n",
"\n",
"# Ajout de la légende\n",
"legend_html = '''\n",
"<div style=\"position: fixed; \n",
" top: 10px; right: 10px; width: 250px; max-height: 85vh; overflow-y: auto;\n",
" background-color: white; border:2px solid grey; z-index:9999; border-radius: 5px;\n",
" font-size:12px; padding: 10px; box-shadow: 0 0 15px rgba(0,0,0,0.2);\">\n",
"<p style=\"margin-bottom: 8px; font-weight: bold; font-size: 14px;\">🗺️ Micro-régions de Corse</p>\n",
"'''\n",
"\n",
"for region, color in sorted(color_map.items()):\n",
" legend_html += f'<p style=\"margin: 3px 0;\"><i class=\"fa fa-circle\" style=\"color:{color}\"></i> {region}</p>'\n",
"\n",
"legend_html += f'<hr style=\"margin: 8px 0;\"><p style=\"margin: 3px 0; font-size: 11px; color: #666;\">k = {k} voisins<br>Distance: Haversine</p></div>'\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(legend_html))\n",
"\n",
"# Ajout d'instructions\n",
"instructions_html = '''\n",
"<div style=\"position: fixed; \n",
" bottom: 10px; left: 10px; width: 320px; \n",
" background-color: white; border:2px solid grey; z-index:9999; border-radius: 5px;\n",
" font-size:13px; padding: 12px; box-shadow: 0 0 15px rgba(0,0,0,0.2);\">\n",
"<p style=\"margin: 0 0 8px 0; font-weight: bold;\">🖱️ Mode d'emploi</p>\n",
"<p style=\"margin: 5px 0; line-height: 1.4;\"><b>Cliquez</b> n'importe où sur la carte pour prédire la micro-région.</p>\n",
"<p style=\"margin: 5px 0; line-height: 1.4; font-size: 11px;\">• Un marqueur coloré apparaît au point cliqué<br>\n",
"• Les lignes pointillées montrent les k communes les plus proches<br>\n",
"• Le popup affiche la prédiction détaillée</p>\n",
"</div>\n",
"'''\n",
"\n",
"m_interactive.get_root().html.add_child(folium.Element(instructions_html))\n",
"\n",
"print(\"\\n✅ Carte interactive créée avec succès!\")\n",
"print(f\"\\n🖱 Cliquez sur n'importe quel point de la carte pour prédire sa micro-région avec k={k} voisins.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Affichage de la carte interactive\n",
"m_interactive"
"print(f\"\u2705 {len(df_clean)} communes ajout\u00e9es \u00e0 la carte\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Saving the map"
"## 5. Interactive interface with widgets"
]
},
{
@ -498,17 +216,33 @@
"metadata": {},
"outputs": [],
"source": [
"# Sauvegarder la carte interactive\n",
"m_interactive.save('carte_corse_knn_interactive.html')\n",
"print(\"✅ Carte sauvegardée dans 'carte_corse_knn_interactive.html'\")\n",
"print(\"📁 Vous pouvez ouvrir ce fichier dans un navigateur pour une utilisation autonome.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 8. Testing the prediction (optional)"
"# Widgets\n",
"k_slider = IntSlider(\n",
" value=5,\n",
" min=1,\n",
" max=15,\n",
" step=1,\n",
" description='k voisins:',\n",
" continuous_update=False\n",
")\n",
"\n",
"info_html = HTML(\n",
" value=\"<div style='background:#e3f2fd;padding:10px;border-radius:5px;'>\"\n",
" \"<b>\ud83d\uddb1\ufe0f Cliquez sur la carte</b> pour pr\u00e9dire la micro-r\u00e9gion.<br>\"\n",
" \"Ajustez <b>k</b> pour changer le nombre de voisins.\"\n",
" \"</div>\"\n",
")\n",
"\n",
"result_output = Output()\n",
"\n",
"# Stocker les coordonn\u00e9es courantes\n",
"current_coords = {'lat': None, 'lon': None}\n",
"\n",
"# Layer pour la pr\u00e9diction\n",
"prediction_layer = LayerGroup(name='Pr\u00e9diction')\n",
"m.add_layer(prediction_layer)\n",
"\n",
"print(\"\u2705 Widgets cr\u00e9\u00e9s\")"
]
},
{
@ -517,60 +251,83 @@
"metadata": {},
"outputs": [],
"source": [
"# Fonction pour tester la prédiction sur des coordonnées spécifiques\n",
"def predict_region(lat, lon, k_value=5):\n",
" \"\"\"\n",
" Prédit la micro-région pour des coordonnées données\n",
" \"\"\"\n",
" # Conversion en radians\n",
"# Fonction de pr\u00e9diction\n",
"def update_classification(lat, lon, k_value):\n",
" \"\"\"Met \u00e0 jour la classification pour un point donn\u00e9.\"\"\"\n",
" # Pr\u00e9diction\n",
" coords_rad = np.radians([[lat, lon]])\n",
" \n",
" # Prédiction\n",
" prediction = knn.predict(coords_rad)[0]\n",
" knn.n_neighbors = k_value\n",
" knn.fit(X_rad, y) # R\u00e9entra\u00eener avec nouveau k\n",
" predicted_region = knn.predict(coords_rad)[0]\n",
" \n",
" # Trouver les k plus proches voisins\n",
" distances, indices = knn.kneighbors(coords_rad)\n",
" distances_km = distances[0] * 6371 # Conversion en km\n",
" \n",
" # Convertir les distances de radians en km\n",
" distances_km = distances[0] * 6371 # Rayon de la Terre en km\n",
" # Nettoyer la couche de pr\u00e9diction\n",
" prediction_layer.clear_layers()\n",
" \n",
" print(f\"\\n📍 Coordonnées: {lat:.5f}°N, {lon:.5f}°E\")\n",
" print(f\"🎯 Micro-région prédite: {prediction}\")\n",
" print(f\"\\n{k_value} plus proches communes:\")\n",
" # Cr\u00e9er une ic\u00f4ne personnalis\u00e9e avec AwesomeIcon\n",
" custom_icon = AwesomeIcon(\n",
" name='star',\n",
" marker_color=color_map[predicted_region],\n",
" icon_color='white',\n",
" spin=False\n",
" )\n",
" \n",
" # Ajouter le marqueur de pr\u00e9diction\n",
" prediction_marker = Marker(\n",
" location=(lat, lon),\n",
" draggable=False,\n",
" icon=custom_icon\n",
" )\n",
" \n",
" # Popup d\u00e9taill\u00e9\n",
" popup_html = f\"\"\"<div style='min-width:220px;'>\n",
" <h4 style='margin:5px 0;color:{color_map[predicted_region]};'>\ud83c\udfaf {predicted_region}</h4>\n",
" <p style='margin:5px 0;font-size:11px;'>\n",
" <b>Coordonn\u00e9es:</b><br>\n",
" Lat: {lat:.5f}\u00b0<br>\n",
" Lon: {lon:.5f}\u00b0\n",
" </p>\n",
" <hr style='margin:5px 0;'>\n",
" <p style='margin:5px 0;font-size:11px;'><b>{k_value} plus proches communes:</b></p>\n",
" <ul style='margin:5px 0;padding-left:20px;font-size:10px;'>\"\"\"\n",
" \n",
" for i, idx in enumerate(indices[0]):\n",
" commune_info = df_clean.iloc[idx]\n",
" print(f\" {i+1}. {commune_info['Commune']:30s} ({commune_info['Territoire de projet']:30s}) - {distances_km[i]:6.2f} km\")\n",
" popup_html += f\"<li><b>{commune_info['Commune']}</b> ({distances_km[i]:.2f} km)</li>\"\n",
" \n",
" return prediction\n",
" popup_html += \"</ul></div>\"\n",
" prediction_marker.popup = HTML(popup_html)\n",
" prediction_layer.add_layer(prediction_marker)\n",
" \n",
" # Ajouter les lignes vers les k plus proches voisins\n",
" for i, idx in enumerate(indices[0]):\n",
" commune_info = df_clean.iloc[idx]\n",
" line = Polyline(\n",
" locations=[\n",
" (lat, lon),\n",
" (commune_info['Latitude'], commune_info['Longitude'])\n",
" ],\n",
" color='gray',\n",
" weight=2,\n",
" opacity=0.6,\n",
" dash_array='8, 8'\n",
" )\n",
" prediction_layer.add_layer(line)\n",
" \n",
" # Afficher le r\u00e9sultat\n",
" with result_output:\n",
" result_output.clear_output()\n",
" print(f\"\\n\ud83c\udfaf Pr\u00e9diction: {predicted_region}\")\n",
" print(f\"\ud83d\udccd Coordonn\u00e9es: ({lat:.5f}, {lon:.5f})\")\n",
" print(f\"\\n{k_value} plus proches communes:\")\n",
" for i, idx in enumerate(indices[0]):\n",
" commune_info = df_clean.iloc[idx]\n",
" print(f\" {i+1}. {commune_info['Commune']:30s} - {distances_km[i]:6.2f} km\")\n",
"\n",
"# Exemples de test\n",
"print(\"=\" * 100)\n",
"print(\"TESTS DE PRÉDICTION k-NN\")\n",
"print(\"=\" * 100)\n",
"\n",
"# Test 1: Centre approximatif de la Corse (vers Corte)\n",
"print(\"\\n🔍 Test 1: Centre de la Corse\")\n",
"predict_region(42.15, 9.15, k)\n",
"\n",
"# Test 2: Nord de la Corse (Balagne/Bastia)\n",
"print(\"\\n🔍 Test 2: Nord de la Corse\")\n",
"predict_region(42.55, 8.85, k)\n",
"\n",
"# Test 3: Sud de la Corse (vers Porto-Vecchio)\n",
"print(\"\\n🔍 Test 3: Sud de la Corse\")\n",
"predict_region(41.65, 9.15, k)\n",
"\n",
"# Test 4: Ouest (vers Ajaccio)\n",
"print(\"\\n🔍 Test 4: Ouest de la Corse (Ajaccio)\")\n",
"predict_region(41.93, 8.74, k)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 9. Performance analysis (optional)"
"print(\"\u2705 Fonction de pr\u00e9diction d\u00e9finie\")\n"
]
},
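Review note on the prediction cell: `color_map` now holds CSS color names such as `lightcoral` and `darkviolet`, which suit `CircleMarker`. As far as I can tell, `AwesomeIcon` restricts `marker_color` to a fixed palette that does not include those two names, so a prediction for a region mapped to either one may raise a validation error. The sketch below shows one way to translate colors for the icon only. The allowed set is an assumption to check against the installed ipyleaflet version, and `marker_color_for` and the substitution table are hypothetical.

```python
# Names AwesomeIcon is assumed to accept for marker_color; verify against the
# ipyleaflet version in use before relying on this list.
AWESOME_MARKER_COLORS = {
    "white", "red", "darkred", "lightred", "orange", "beige", "green",
    "darkgreen", "lightgreen", "blue", "darkblue", "lightblue", "purple",
    "darkpurple", "pink", "cadetblue", "gray", "lightgray", "black",
}

# Hypothetical substitutions for the CSS names in the notebook's palette that
# are not in the set above.
CSS_TO_MARKER = {"lightcoral": "lightred", "darkviolet": "darkpurple"}


def marker_color_for(css_color: str, default: str = "blue") -> str:
    """Return a marker color close to css_color, or default if none is known."""
    if css_color in AWESOME_MARKER_COLORS:
        return css_color
    return CSS_TO_MARKER.get(css_color, default)
```

Inside `update_classification`, the icon could then receive `marker_color=marker_color_for(color_map[predicted_region])`, while the popup heading and circle markers keep the CSS name.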
{
@ -579,37 +336,33 @@
"metadata": {},
"outputs": [],
"source": [
"# Évaluation de la cohérence du modèle (cross-validation)\n",
"from sklearn.model_selection import cross_val_score\n",
"# Gestionnaires d'\u00e9v\u00e9nements\n",
"def handle_click(**kwargs):\n",
" \"\"\"Gestionnaire de clic sur la carte.\"\"\"\n",
" if kwargs.get('type') == 'click':\n",
" coords = kwargs.get('coordinates')\n",
" lat, lon = coords\n",
" current_coords['lat'] = lat\n",
" current_coords['lon'] = lon\n",
" update_classification(lat, lon, k_slider.value)\n",
"\n",
"# Test avec différentes valeurs de k\n",
"k_values = [3, 5, 7, 9, 11, 15]\n",
"scores = []\n",
"def on_k_change(change):\n",
" \"\"\"Gestionnaire de changement de k.\"\"\"\n",
" if current_coords['lat'] is not None:\n",
" update_classification(current_coords['lat'], current_coords['lon'], change['new'])\n",
"\n",
"print(\"📊 Évaluation de la précision pour différentes valeurs de k:\\n\")\n",
"print(f\"{'k':<5} {'Précision moyenne':<20} {'Écart-type':<15}\")\n",
"print(\"-\" * 50)\n",
"# Connecter les \u00e9v\u00e9nements\n",
"m.on_interaction(handle_click)\n",
"k_slider.observe(on_k_change, names='value')\n",
"\n",
"for k_val in k_values:\n",
" knn_temp = KNeighborsClassifier(n_neighbors=k_val, weights='distance', metric='haversine')\n",
" cv_scores = cross_val_score(knn_temp, X_rad, y, cv=5)\n",
" mean_score = cv_scores.mean()\n",
" std_score = cv_scores.std()\n",
" scores.append(mean_score)\n",
" print(f\"{k_val:<5} {mean_score:.4f} ({mean_score*100:5.2f}%) ± {std_score:.4f}\")\n",
"\n",
"best_k = k_values[scores.index(max(scores))]\n",
"best_score = max(scores)\n",
"print(\"\\n\" + \"=\" * 50)\n",
"print(f\"✨ Meilleure valeur de k: {best_k} (précision: {best_score:.4f} / {best_score*100:.2f}%)\")\n",
"print(\"=\" * 50)"
"print(\"\u2705 Gestionnaires d'\u00e9v\u00e9nements connect\u00e9s\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 10. Statistics by micro-region"
"## 6. Displaying the full interface"
]
},
{
@ -618,20 +371,58 @@
"metadata": {},
"outputs": [],
"source": [
"# Statistiques descriptives par micro-région\n",
"print(\"📈 STATISTIQUES PAR MICRO-RÉGION\\n\")\n",
"print(f\"{'Micro-région':<35} {'Nb communes':<15} {'% du total'}\")\n",
"print(\"=\" * 65)\n",
"# L\u00e9gende HTML\n",
"legend_html = \"<div style='background:white;padding:10px;border-radius:5px;max-height:300px;overflow-y:auto;'>\"\n",
"legend_html += \"<h4 style='margin-top:0;'>\ud83d\uddfa\ufe0f Micro-r\u00e9gions</h4>\"\n",
"for region, color in sorted(color_map.items()):\n",
" legend_html += f\"<div style='margin:3px 0;'><span style='display:inline-block;width:12px;height:12px;background:{color};border-radius:50%;margin-right:5px;'></span>{region}</div>\"\n",
"legend_html += \"</div>\"\n",
"\n",
"total_communes = len(df_clean)\n",
"stats = df_clean['Territoire de projet'].value_counts().sort_index()\n",
"legend_widget = HTML(legend_html)\n",
"legend_control = WidgetControl(widget=legend_widget, position='topright')\n",
"m.add_control(legend_control)\n",
"\n",
"for region, count in stats.items():\n",
" pct = (count / total_communes) * 100\n",
" print(f\"{region:<35} {count:<15} {pct:>5.1f}%\")\n",
"# Afficher l'interface compl\u00e8te\n",
"display(VBox([\n",
" info_html,\n",
" HBox([Label(''), k_slider]),\n",
" m,\n",
" result_output\n",
"]))\n",
"\n",
"print(\"=\" * 65)\n",
"print(f\"{'TOTAL':<35} {total_communes:<15} 100.0%\")"
"print(\"\\n\" + \"=\"*60)\n",
"print(\"\u2705 CARTE INTERACTIVE PR\u00caTE!\")\n",
"print(\"=\"*60)\n",
"print(\"\\n\ud83d\uddb1\ufe0f Cliquez n'importe o\u00f9 sur la carte pour pr\u00e9dire la micro-r\u00e9gion.\")\n",
"print(\"\ud83c\udf9a\ufe0f Utilisez le slider pour changer k (nombre de voisins).\")\n",
"print(\"\\n\u2b50 Cette version utilise ipyleaflet avec interactions Python natives.\")\n",
"print(\" Pas de JavaScript inject\u00e9 = Fonctionne parfaitement dans Jupyter!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Test function (optional)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test manuel\n",
"def test_prediction(lat, lon, k_val=5):\n",
" \"\"\"Tester une pr\u00e9diction avec des coordonn\u00e9es sp\u00e9cifiques.\"\"\"\n",
" print(f\"\\nTest de pr\u00e9diction pour ({lat}, {lon}) avec k={k_val}\")\n",
" print(\"=\"*60)\n",
" update_classification(lat, lon, k_val)\n",
"\n",
"# Exemples\n",
"# test_prediction(42.15, 9.15, 5) # Centre Corse\n",
"# test_prediction(42.55, 8.85, 5) # Nord\n",
"# test_prediction(41.65, 9.15, 5) # Sud"
]
},
{
@ -640,27 +431,21 @@
"source": [
"## Conclusion\n",
"\n",
"✅ **k-NN Corsica notebook - Summary**\n",
"\u2705 **Working interactive map with ipyleaflet**\n",
"\n",
"This notebook implements a k-NN classifier for the micro-regions of Corsica with:\n",
"1. ✅ Data loading from 2 CSV files (coordinates + territories)\n",
"2. ✅ Automatic extraction of GPS coordinates from the Point_Geo column\n",
"3. ✅ Merge of the two data sources on normalized commune names\n",
"4. ✅ Training of a k-NN model with haversine distance\n",
"5. ✅ Interactive Folium map with click-to-predict\n",
"6. ✅ Visualization of the k nearest neighbors\n",
"7. ✅ Performance tests and validation\n",
"8. ✅ HTML export for standalone use\n",
"**Advantages over Folium:**\n",
"- \u2705 Native Python interactions (no injected JavaScript)\n",
"- \u2705 Runs inside Jupyter\n",
"- \u2705 No iframe or security issues\n",
"- \u2705 Interactive slider to change k in real time\n",
"- \u2705 Results displayed below the map\n",
"\n",
"**🖱️ Usage:**\n",
"- Click anywhere on the map\n",
"- A colored marker appears with the predicted micro-region\n",
"- Dashed lines show the k nearest communes\n",
"- A popup details the prediction and the neighbors\n",
"**Usage:**\n",
"1. Click on the map \u2192 the prediction appears with a marker and lines\n",
"2. Change k with the slider \u2192 the prediction updates automatically\n",
"3. Detailed results are displayed below the map\n",
"\n",
"**📁 Exported file:** `carte_corse_knn_interactive.html`\n",
"\n",
"The HTML map can be opened in any browser for standalone use!"
"**Note:** This version does not generate a standalone HTML file because ipyleaflet needs a running Jupyter kernel for the Python interactions. To share it, use Binder, which provides a live kernel; Jupyter nbviewer only renders a static view, so the click and slider callbacks will not run there."
]
}
],