{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# k-NN Classification of Corsica - ipyleaflet Version\n",
"\n",
"This version uses **ipyleaflet** with native Python interactions (no injected JavaScript).\n",
"\n",
"**Advantage:** It runs inside Jupyter without iframe or JavaScript-injection issues."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Installation (%pip installs into the environment of the running kernel)\n",
"%pip install ipyleaflet ipywidgets pandas numpy scikit-learn --quiet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"from ipyleaflet import Map, CircleMarker, Marker, Polyline, LayerGroup, WidgetControl, AwesomeIcon\n",
"from ipywidgets import HTML, VBox, HBox, Label, IntSlider, Output\n",
"from sklearn.neighbors import KNeighborsClassifier\n",
"from IPython.display import display\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Loading the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load the two source files: commune coordinates and the commune-to-territory mapping\n",
"df_coords = pd.read_csv('communes-de-corse-en-corse-et-francais.csv', sep=';', encoding='utf-8')\n",
"df_territoires = pd.read_csv('communes-par-territoire-de-projet-de-la-collectivite-territoriale-de-corse0.csv',\n",
"                             sep=';', encoding='utf-8')\n",
"\n",
"print(f\"\u2705 {len(df_coords)} communes with coordinates\")\n",
"print(f\"\u2705 {len(df_territoires)} communes with territories\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Coordinate extraction\n",
"def extract_coordinates(point_geo_str):\n",
"    \"\"\"Parse a 'latitude,longitude' string into two floats, or (None, None).\"\"\"\n",
"    if pd.isna(point_geo_str):\n",
"        return None, None\n",
"    try:\n",
"        coords = str(point_geo_str).strip().split(',')\n",
"        if len(coords) == 2:\n",
"            return float(coords[0].strip()), float(coords[1].strip())\n",
"    except ValueError:\n",
"        # Non-numeric content: treat the coordinates as missing\n",
"        pass\n",
"    return None, None\n",
"\n",
"df_coords[['Latitude', 'Longitude']] = df_coords['Point_Geo'].apply(\n",
"    lambda x: pd.Series(extract_coordinates(x))\n",
")\n",
"\n",
"print(f\"\u2705 {df_coords['Latitude'].notna().sum()} coordinates extracted\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Merge the two tables on a normalized commune name\n",
"def normalize(name):\n",
"    \"\"\"Upper-case and strip a commune name so that both tables can be joined on it.\"\"\"\n",
"    return str(name).upper().strip() if not pd.isna(name) else ''\n",
"\n",
"df_coords['Commune_norm'] = df_coords['Nom fran\u00e7ais'].apply(normalize)\n",
"df_territoires['Commune_norm'] = df_territoires['Commune'].apply(normalize)\n",
"\n",
"df = pd.merge(df_coords, df_territoires[['Commune_norm', 'Territoire de projet']],\n",
"              on='Commune_norm', how='inner')\n",
"df['Commune'] = df['Nom fran\u00e7ais']\n",
"df_clean = df.dropna(subset=['Latitude', 'Longitude', 'Territoire de projet']).copy()\n",
"\n",
"print(f\"\u2705 {len(df_clean)} communes merged\")\n",
"print(f\"\\nMicro-regions: {len(df_clean['Territoire de projet'].unique())}\")\n",
"for region in sorted(df_clean['Territoire de projet'].unique()):\n",
"    count = (df_clean['Territoire de projet'] == region).sum()\n",
"    print(f\"  \u2022 {region}: {count} communes\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Training the k-NN model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# k-NN model\n",
"X = df_clean[['Latitude', 'Longitude']].values\n",
"y = df_clean['Territoire de projet'].values\n",
"\n",
"# The haversine metric expects [latitude, longitude] pairs in radians\n",
"knn = KNeighborsClassifier(n_neighbors=5, weights='distance', metric='haversine')\n",
"X_rad = np.radians(X)\n",
"knn.fit(X_rad, y)\n",
"\n",
"print(\"\u2705 k-NN model trained\")\n",
"print(f\"\u2705 {len(df_clean)} communes\")\n",
"print(f\"\u2705 {len(np.unique(y))} micro-regions\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Color configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# One color per micro-region (colors repeat if there are more regions than colors)\n",
"microregions = sorted(df_clean['Territoire de projet'].unique())\n",
"colors = ['red', 'blue', 'green', 'purple', 'orange', 'darkred',\n",
"          'lightcoral', 'beige', 'darkblue', 'darkgreen', 'cadetblue',\n",
"          'darkviolet', 'pink', 'lightblue', 'lightgreen', 'gray']\n",
"\n",
"color_map = {region: colors[i % len(colors)] for i, region in enumerate(microregions)}\n",
"\n",
"print(\"\u2705 Colors configured:\")\n",
"for region, color in sorted(color_map.items()):\n",
"    print(f\"  \u2022 {region}: {color}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Creating the interactive map"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# ipyleaflet map centered on the mean position of the communes\n",
"center_lat = df_clean['Latitude'].mean()\n",
"center_lon = df_clean['Longitude'].mean()\n",
"\n",
"m = Map(\n",
"    center=(center_lat, center_lon),\n",
"    zoom=9,\n",
"    scroll_wheel_zoom=True\n",
")\n",
"\n",
"# Add the communes\n",
"commune_layer = LayerGroup(name='Communes')\n",
"\n",
"for idx, row in df_clean.iterrows():\n",
"    marker = CircleMarker(\n",
"        location=(row['Latitude'], row['Longitude']),\n",
"        radius=3,\n",
"        color=color_map[row['Territoire de projet']],\n",
"        fill_color=color_map[row['Territoire de projet']],\n",
"        fill_opacity=0.7,\n",
"        weight=1\n",
"    )\n",
"    # Popup with the commune name and its micro-region\n",
"    marker.popup = HTML(f\"<b>{row['Commune']}</b><br>{row['Territoire de projet']}\")\n",
"    commune_layer.add_layer(marker)\n",
"\n",
"m.add_layer(commune_layer)\n",
"\n",
"print(f\"\u2705 {len(df_clean)} communes added to the map\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Interactive interface with widgets"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Widgets\n",
"k_slider = IntSlider(\n",
"    value=5,\n",
"    min=1,\n",
"    max=15,\n",
"    step=1,\n",
"    description='k neighbors:',\n",
"    continuous_update=False\n",
")\n",
"\n",
"info_html = HTML(\n",
"    value=\"<div style='background:#e3f2fd;padding:10px;border-radius:5px;'>\"\n",
"          \"<b>\ud83d\uddb1\ufe0f Click on the map</b> to predict the micro-region.<br>\"\n",
"          \"Adjust <b>k</b> to change the number of neighbors.\"\n",
"          \"</div>\"\n",
")\n",
"\n",
"result_output = Output()\n",
"\n",
"# Store the coordinates of the last click\n",
"current_coords = {'lat': None, 'lon': None}\n",
"\n",
"# Layer that holds the prediction marker and the neighbor lines\n",
"prediction_layer = LayerGroup(name='Prediction')\n",
"m.add_layer(prediction_layer)\n",
"\n",
"print(\"\u2705 Widgets created\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Prediction function\n",
"def update_classification(lat, lon, k_value):\n",
"    \"\"\"Update the classification for a given point.\"\"\"\n",
"    # Prediction (the haversine metric expects [lat, lon] in radians)\n",
"    coords_rad = np.radians([[lat, lon]])\n",
"    knn.n_neighbors = k_value\n",
"    knn.fit(X_rad, y)  # Refit with the new k\n",
"    predicted_region = knn.predict(coords_rad)[0]\n",
"\n",
"    # Find the k nearest neighbors\n",
"    distances, indices = knn.kneighbors(coords_rad)\n",
"    distances_km = distances[0] * 6371  # Radians to km (mean Earth radius)\n",
"\n",
"    # Clear the prediction layer\n",
"    prediction_layer.clear_layers()\n",
"\n",
"    # Create a custom icon with AwesomeIcon\n",
"    custom_icon = AwesomeIcon(\n",
"        name='star',\n",
"        marker_color=color_map[predicted_region],\n",
"        icon_color='white',\n",
"        spin=False\n",
"    )\n",
"\n",
"    # Add the prediction marker\n",
"    prediction_marker = Marker(\n",
"        location=(lat, lon),\n",
"        draggable=False,\n",
"        icon=custom_icon\n",
"    )\n",
"\n",
"    # Detailed popup\n",
"    popup_html = f\"\"\"<div style='min-width:220px;'>\n",
"    <h4 style='margin:5px 0;color:{color_map[predicted_region]};'>\ud83c\udfaf {predicted_region}</h4>\n",
"    <p style='margin:5px 0;font-size:11px;'>\n",
"    <b>Coordinates:</b><br>\n",
"    Lat: {lat:.5f}\u00b0<br>\n",
"    Lon: {lon:.5f}\u00b0\n",
"    </p>\n",
"    <hr style='margin:5px 0;'>\n",
"    <p style='margin:5px 0;font-size:11px;'><b>{k_value} nearest communes:</b></p>\n",
"    <ul style='margin:5px 0;padding-left:20px;font-size:10px;'>\"\"\"\n",
"\n",
"    for i, idx in enumerate(indices[0]):\n",
"        commune_info = df_clean.iloc[idx]\n",
"        popup_html += f\"<li><b>{commune_info['Commune']}</b> ({distances_km[i]:.2f} km)</li>\"\n",
"\n",
"    popup_html += \"</ul></div>\"\n",
"    prediction_marker.popup = HTML(popup_html)\n",
"    prediction_layer.add_layer(prediction_marker)\n",
"\n",
"    # Draw dashed lines to the k nearest neighbors\n",
"    for i, idx in enumerate(indices[0]):\n",
"        commune_info = df_clean.iloc[idx]\n",
"        line = Polyline(\n",
"            locations=[\n",
"                (lat, lon),\n",
"                (commune_info['Latitude'], commune_info['Longitude'])\n",
"            ],\n",
"            color='gray',\n",
"            weight=2,\n",
"            opacity=0.6,\n",
"            dash_array='8, 8'\n",
"        )\n",
"        prediction_layer.add_layer(line)\n",
"\n",
"    # Display the result below the map\n",
"    with result_output:\n",
"        result_output.clear_output()\n",
"        print(f\"\\n\ud83c\udfaf Prediction: {predicted_region}\")\n",
"        print(f\"\ud83d\udccd Coordinates: ({lat:.5f}, {lon:.5f})\")\n",
"        print(f\"\\n{k_value} nearest communes:\")\n",
"        for i, idx in enumerate(indices[0]):\n",
"            commune_info = df_clean.iloc[idx]\n",
"            print(f\"  {i+1}. {commune_info['Commune']:30s} - {distances_km[i]:6.2f} km\")\n",
"\n",
"print(\"\u2705 Prediction function defined\")\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Event handlers\n",
"def handle_click(**kwargs):\n",
"    \"\"\"Handle a click on the map.\"\"\"\n",
"    if kwargs.get('type') == 'click':\n",
"        coords = kwargs.get('coordinates')\n",
"        lat, lon = coords\n",
"        current_coords['lat'] = lat\n",
"        current_coords['lon'] = lon\n",
"        update_classification(lat, lon, k_slider.value)\n",
"\n",
"def on_k_change(change):\n",
"    \"\"\"Handle a change of k; does nothing until the map has been clicked once.\"\"\"\n",
"    if current_coords['lat'] is not None:\n",
"        update_classification(current_coords['lat'], current_coords['lon'], change['new'])\n",
"\n",
"# Connect the events\n",
"m.on_interaction(handle_click)\n",
"k_slider.observe(on_k_change, names='value')\n",
"\n",
"print(\"\u2705 Event handlers connected\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Displaying the complete interface"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# HTML legend\n",
"legend_html = \"<div style='background:white;padding:10px;border-radius:5px;max-height:300px;overflow-y:auto;'>\"\n",
"legend_html += \"<h4 style='margin-top:0;'>\ud83d\uddfa\ufe0f Micro-regions</h4>\"\n",
"for region, color in sorted(color_map.items()):\n",
"    legend_html += f\"<div style='margin:3px 0;'><span style='display:inline-block;width:12px;height:12px;background:{color};border-radius:50%;margin-right:5px;'></span>{region}</div>\"\n",
"legend_html += \"</div>\"\n",
"\n",
"legend_widget = HTML(legend_html)\n",
"legend_control = WidgetControl(widget=legend_widget, position='topright')\n",
"m.add_control(legend_control)\n",
"\n",
"# Display the complete interface\n",
"display(VBox([\n",
"    info_html,\n",
"    HBox([Label(''), k_slider]),\n",
"    m,\n",
"    result_output\n",
"]))\n",
"\n",
"print(\"\\n\" + \"=\"*60)\n",
"print(\"\u2705 INTERACTIVE MAP READY!\")\n",
"print(\"=\"*60)\n",
"print(\"\\n\ud83d\uddb1\ufe0f Click anywhere on the map to predict the micro-region.\")\n",
"print(\"\ud83c\udf9a\ufe0f Use the slider to change k (number of neighbors).\")\n",
"print(\"\\n\u2b50 This version uses ipyleaflet with native Python interactions.\")\n",
"print(\"   No injected JavaScript: it runs directly in Jupyter.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 7. Test function (optional)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Manual test\n",
"def test_prediction(lat, lon, k_val=5):\n",
"    \"\"\"Test a prediction at specific coordinates.\"\"\"\n",
"    print(f\"\\nPrediction test for ({lat}, {lon}) with k={k_val}\")\n",
"    print(\"=\"*60)\n",
"    update_classification(lat, lon, k_val)\n",
"\n",
"# Examples\n",
"# test_prediction(42.15, 9.15, 5)  # Central Corsica\n",
"# test_prediction(42.55, 8.85, 5)  # North\n",
"# test_prediction(41.65, 9.15, 5)  # South"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"\u2705 **Working interactive map with ipyleaflet**\n",
"\n",
"**Advantages over Folium:**\n",
"- \u2705 Native Python interactions (no injected JavaScript)\n",
"- \u2705 Runs directly in Jupyter\n",
"- \u2705 No iframe or security issues\n",
"- \u2705 Interactive slider to change k; the prediction refreshes when the slider is released\n",
"- \u2705 Results displayed below the map\n",
"\n",
"**Usage:**\n",
"1. Click on the map \u2192 the prediction appears with a marker and lines to the neighbors\n",
"2. Change k with the slider \u2192 the prediction updates automatically\n",
"3. The detailed results are displayed below the map\n",
"\n",
"**Note:** This version does not produce a standalone HTML file, because ipyleaflet needs a running Jupyter kernel for the Python interactions. To share it, use a service that provides a live kernel, such as Binder; a static render such as Jupyter nbviewer shows the notebook but cannot run the click and slider callbacks."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}