{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _lcurve:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "L-curve criterion\n", "===\n", "By P.C. Hansen" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Diffinition\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Practically, in order to solve inversion problems $Ax=b$, we need to consider the similar form of least-squeares equations as follows:\n", "\n", "$$\n", "\\begin{align}\n", " x_\\lambda &= \\arg\\min\\{||Ax - b||_2^2 + \\lambda\\cdot||L(x - x_0)||_2^2\\}\\\\\n", " & = (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} (A^\\mathsf{T}b + \\lambda L^\\mathsf{T}x_0).\n", "\\end{align}\n", "$$\n", "\n", "$\\lambda$ is a real regularization parameter that must be chosen by the user. The vector $b$ is the given data and the vector $x_0$ is a priori estimate of $x$ which is set to zero when no a priori information is available.\n", "$||Ax - b||_2$ is the residual term and $||L(x - x_0)||_2$ is the regularization term.\n", "$\\lambda$ is the parameter which decides the contribution to these terms.\n", "And L-curve criterion is the method to balance between these contribution and optmize the inversion solution." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "L-curve is precisely following points curve:\n", "$$\n", " (||Ax - b||_2, ||L(x - x_0)||_2)\n", "$$" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "L-curve criterion is based on the fact that the optimal reguralization paramter is acheived when the L-curve point:\n", "\n", "$$\n", " (\\log||Ax_\\lambda - b||_2,\\; \\log||L(x_\\lambda - x_0)||_2)\n", "$$\n", "\n", "lies on this corner." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In order to obtain the strict $\\lambda$ on the L-curve \"corner\" which is defined by the mathmatical method, we can calculate the L-curve carvature $\\kappa$ given by following equation:\n", "$$\n", "\\begin{align}\n", "\\kappa &= \\frac{f^{\\prime\\prime}(x)}{\\left[1 + f^{\\prime}(x)^2\\right]^{3/2}}=-2 \\eta\\rho\\frac{\\lambda^2 \\eta + \\lambda \\rho + \\rho\\eta/\\eta^\\prime}{(\\lambda^2 \\eta^2 + \\rho^2)^{3/2}},\\\\\n", "\\rho &\\equiv ||Ax_\\lambda - b||_2^2,\\\\\n", "\\eta &\\equiv ||L(x_\\lambda - x_0)||_2^2,\\\\\n", "\\eta^\\prime &\\equiv \\frac{d\\eta}{d\\lambda}\n", "\\end{align}\n", "$$" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Using the singular value decomposion (SVD) fomula, $x_\\lambda$ is described as follows:\n", "\n", "$$\n", "\\begin{align}\n", " x_\\lambda &= L^{-1}V(\\Sigma^\\mathsf{T}\\Sigma + \\lambda I)^{-1}\\Sigma^\\mathsf{T}\\ U^\\mathsf{T}b\\\\\n", " &= L^{-1}V\n", " \\begin{pmatrix}\n", " w_1(\\lambda) / \\sigma_1 \\\\\n", " w_2(\\lambda) / \\sigma_2 \\\\\n", " \\vdots\n", " \\end{pmatrix}\n", " \\circ U^\\mathsf{T}b\\\\\n", " &= \\sum_{i=1}^K w_i(\\lambda) \\frac{(u_i, b)}{\\sigma_i}L^{-1}v_i,\n", "\\end{align}\n", "$$" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "where,\n", "\n", "$$\n", "\\begin{align}\n", "w_i(\\lambda) &\\equiv \\frac{1}{1 + \\lambda/\\sigma_i^2},\\\\\n", "U\\Sigma V^\\mathsf{T} &\\equiv (u_1, u_2, ...) \\cdot \\text{diag}(\\sigma_1, \\sigma_2,...) \\cdot (v_1, v_2, ...)^\\mathsf{T} = AL^{-1}.\n", "\\end{align}\n", "$$\n", "\n", "Here, $K\\equiv\\min(M, N)$, which means the size of SVD matrices: $U=\\text{Mat}(M, M)$, $\\Sigma=\\text{Mat}(M, N)$, $V=\\text{Mat}(N, N)$, and a priori estimate $x_0$ is assumed to be 0. Additionally, the symbol $\\circ$\n", "means the Hadamard product." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Also $\\rho, \\eta,$ and $\\eta^\\prime$ is described by using SVD as follows:\n", "$$\n", "\\begin{align}\n", "\\rho &= \\sum_{i=1}^K (1 - w_i)^2(u_i, b)^2,\\\\\n", "\\eta &= \\sum_{i=1}^K w_i^2 \\frac{(u_i, b)^2}{\\sigma_i^2},\\\\\n", "\\eta^\\prime &= -\\frac{2}{\\lambda}\\sum_{i=1}^K (1 - w_i)w_i^2\\frac{(u_i, b)^2}{\\sigma_i^2}.\n", "\\end{align}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Example ill-posed linear operator equation and applying L-curve criterion into this problem.\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As a famouse ill-posed linear equation, Fredholm integral equation is often used:\n", "$$\n", "\\int_a^b K(s, t) x(t) dt = b(s), \\quad c\\leq s\\leq d.\n", "$$\n", "Here, we think the following situation as above equation form:\n", "$$\n", "\\begin{align}\n", "K(s, t) &\\equiv (\\cos(s) + \\cos(t))\\left(\\frac{\\sin(u)}{u}\\right)^2,\\\\\n", "u &\\equiv \\pi\\left(\\sin(s) + \\sin(t) \\right),\\\\\n", "[a, b] &= [c, d] = \\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2} \\right].\n", "\\end{align}\n", "$$\n", "And, the true solution $x_\\text{t}(t)$ is assumed as follows:\n", "\n", "$$\n", "x_\\text{t}(t) = 2.0 \\exp\\left[-6(t-0.8)^2 \\right] + \\exp\\left[-2(t+0.5)^2 \\right]\n", "$$\n", "\n", "let us define these function as follows:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "\n", "from cherab.phix.inversion import Lcurve\n", "\n", "plt.rcParams[\"figure.dpi\"] = 150\n", "\n", "\n", "def kernel(s: np.ndarray, t: np.ndarray):\n", " \"\"\"Kernel of Fredholm integral equation of the first kind.\"\"\"\n", " u = np.pi * (np.sin(s) + np.sin(t))\n", " if u == 0:\n", " return np.cos(s) + np.cos(t)\n", " else:\n", " return (np.cos(s) + np.cos(t)) * (np.sin(u) / u) ** 2\n", "\n", "\n", "# true solution\n", "def x_t_func(t):\n", " return 2.0 * np.exp(-6.0 * (t - 0.8) ** 2) + np.exp(-2.0 * (t + 0.5) ** 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's descritize the above integral equation. $s$ and $t$ are devided to 100 points at even intervals.\n", "$x$ is a 1D vector data $(100, )$ and $A$ is a $100\\times 100$ matrix.\n", "$A$ is defined using karnel function. When discretizing the integral, the trapezoidal approximation is applied." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set valiables\n", "s = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n", "t = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n", "x_t = x_t_func(t)\n", "\n", "# Operater matrix: A\n", "A = np.zeros((s.size, t.size))\n", "A = np.array([[kernel(i, j) for j in t] for i in s])\n", "\n", "# trapezoidal rule\n", "A[:, 0] *= 0.5\n", "A[:, -1] *= 0.5\n", "A *= t[1] - t[0]\n", "\n", "# simpson rule -- option\n", "# A[:, 1:-1:2] *= 4.0\n", "# A[:, 2:-2:2] *= 2.0\n", "# A *= (t[1] - t[0]) / 3.0\n", "\n", "print(f\"condition number of A is {np.linalg.cond(A)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then excute singular value decomposition of $A$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# SVD using the numpy module\n", "u, sigma, vh = np.linalg.svd(A, full_matrices=False)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "\n", "The measured data $b$ is added white noise $\\bar{b}$, and truly converted $b_0$ is computed from $Ax_\\text{t}$.\n", "\n", "Descritized linear equation is as follows:\n", "$$\n", "Ax = b_0 + \\bar{b} = b\n", "$$\n", "\n", "The noise variance is $1.0 \\times 10^{-4}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_0 = A.dot(x_t)\n", "rng = np.random.default_rng()\n", "b_noise = rng.normal(0, 1.0e-4, b_0.size)\n", "b = b_0 + b_noise" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "In term of regularization, let us think tikhonov regularization, that is, :math:`L = I` and :math:`x_0 = 0`.\n", "And as an optimization method of regulariation parameters, use L-curve method as described above.\n", ":obj:`.Lcurve` is defined in :py:mod:`cherab.phix.inversion` module." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lcurv = Lcurve(sigma, u, vh, b)\n", "lcurv.lambdas = 10 ** np.linspace(-20, 2, 100)\n", "lcurv.optimize(10) # iteration times to find the maximum carvature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot measured data w or w/o noise\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The noise level is so mute that there is no clear difference between those. However this causes to arise the ill-posed problem due to the kernel function." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "plt.plot(s, b_0)\n", "plt.plot(s, b)\n", "plt.legend([\"w/o noise\", \"w noise\"])\n", "plt.xlabel(\"s\")\n", "plt.ylabel(\"b(s)\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot L-curve\n", "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [], "source": [ "lcurv.plot_L_curve(scatter_plot=5, scatter_annotate=True);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare true solution $x_\\text{t}$ with estimated $x_\\lambda$\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Varing regularization parameters makes the interestingly difference between the true solution with estimated one.\\\n", "Here, let us compare the solutions varied from $\\lambda=10^{-11}, \\lambda_\\text{opt}, 1.0$. ($\\lambda_\\text{opt}$ is the optimized regularization parameter.)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "lambdas = [1.5e-10, lcurv.lambda_opt, 1.0]\n", "\n", "fig, axes = plt.subplots(1, 3, figsize=(10, 3))\n", "fig.tight_layout(pad=-2.0)\n", "labels = [f\"$\\\\lambda =$ {i:.2e}\" for i in lambdas]\n", "i = 0\n", "for ax, beta, label in zip(axes, lambdas, labels):\n", " ax.plot(t, x_t, \"--\", label=\"$x_{true}$\")\n", " ax.plot(t, lcurv.inverted_solution(beta=beta), label=\"$x_\\\\lambda$\")\n", "\n", " ax.set_xlim(t.min(), t.max())\n", " ax.set_ylim(0, x_t.max() * 1.1)\n", " ax.set_xlabel(\"$t$\")\n", " ax.set_title(label)\n", " ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n", " if i < 1:\n", " ax.legend()\n", " else:\n", " ax.set_yticklabels([])\n", " i += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "plot L-curve curvature\n", "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_, ax = lcurv.plot_curvature()\n", "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", lcurv.lambda_opt));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "check the error of solutions\n", "----" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The relative error is defined as follows:\n", "$$\n", "e(\\lambda) = \\frac{||x_\\text{t} - x_\\lambda||}{||x_\\text{t}||}.\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "def error(values, true):\n", " return np.linalg.norm(true - values) / np.linalg.norm(true)\n", "\n", "\n", "errors = np.asarray([error(lcurv.inverted_solution(beta), x_t) for beta in lcurv.lambdas])\n", "lambda_min = lcurv.lambdas[errors.argmin()]\n", "error_min = errors.min()\n", "error_opt = errors[np.where(lcurv.lambdas == lcurv.lambda_opt)[0][0]]\n", "\n", "# xlim, ylim\n", "xlim = 1.0e-13, 1.0e2\n", "ylim = 1.0e-02, 2.0e2\n", "\n", "fig, ax = plt.subplots()\n", "\n", "# relative errors\n", "ax.loglog(lcurv.lambdas, errors, color=\"C0\")\n", "ax.text(lcurv.lambdas[35], errors[33], \"$e(\\\\lambda)$\", color=\"C0\")\n", "\n", "# residual norms\n", "residuals = [lcurv.residual_norm(beta) for beta in lcurv.lambdas]\n", "ax.loglog(lcurv.lambdas, residuals, color=\"C1\")\n", "ax.text(\n", " lcurv.lambdas[-5], residuals[-1], \"$||Ax_\\\\lambda-b||$\", color=\"C1\", horizontalalignment=\"right\"\n", ")\n", "\n", "# regularization norms\n", "reguls = [lcurv.regularization_norm(beta) for beta in lcurv.lambdas]\n", "ax.loglog(lcurv.lambdas, reguls, color=\"C2\")\n", "ax.text(lcurv.lambdas[35], reguls[33], \"$||Lx_\\\\lambda||$\", color=\"C2\")\n", "\n", "# scattered point\n", "min_sol = ax.scatter(lambda_min, error_min, marker=\"x\", color=\"r\")\n", "ax.text(\n", " lambda_min,\n", " error_min * 0.8,\n", " \"min($e(\\\\lambda)$)\",\n", " color=\"r\",\n", " horizontalalignment=\"left\",\n", " verticalalignment=\"top\",\n", ")\n", "opt = ax.scatter(lcurv.lambda_opt, error_opt, marker=\"x\", color=\"g\")\n", "ax.text(\n", " lcurv.lambda_opt,\n", " error_opt * 0.8,\n", " \"$e(\\\\lambda_{opt})$\",\n", " color=\"g\",\n", " horizontalalignment=\"right\",\n", " verticalalignment=\"top\",\n", ")\n", "\n", "ax.set_xlim(*xlim)\n", "ax.set_ylim(*ylim)\n", "ax.set_xlabel(\"$\\\\lambda$\")\n", "ax.set_title(\n", " \"min($e(\\\\lambda)$) = {:.2%}, $e(\\\\lambda_{}) = ${:.2%}\".format(error_min, \"{opt}\", error_opt)\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compare minimum error solution with Lcurve-optimized $x_\\lambda$ one\n", "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "ax.plot(t, x_t, \"k--\", label=\"$x_{true}$\", linewidth=1.0)\n", "ax.plot(t, lcurv.inverted_solution(beta=lambda_min), label=\"min(e($\\\\lambda$))\")\n", "ax.plot(t, lcurv.inverted_solution(beta=lcurv.lambda_opt), label=\"$\\\\lambda_{opt}$\")\n", "ax.set_xlabel(\"$t$\")\n", "ax.set_xlim(t.min(), t.max())\n", "ax.set_ylim(0, x_t.max() * 1.1)\n", "ax.legend();" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "GCV criterion\n", "===" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Diffinition\n", "---" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Generalized Cross-Validation's idea is that the best modell for the measurements is the one that best predicts each measurement as a function of the others.\n", "The GCV estimate of $\\lambda$ is chosen as follows:\n", "$$\n", "\\begin{align}\n", "\\lambda_\\text{opt} &:= \\arg \\min_{\\lambda} GCV(\\lambda)\\\\\n", "GCV(\\lambda) &:= \\frac{\\rho(\\lambda)}{|1 - \\text{tr}{T(\\lambda)}|^2},\n", "\\end{align}\n", "$$\n", "\n", "where\n", "$$\n", "\\begin{align}\n", "\\rho(\\lambda) &:= ||Ax_\\lambda - b||^2\\\\\n", "T(\\lambda)b &:= Ax_\\lambda.\n", "\\end{align}\n", "$$" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Using SVD components, $GCV(\\lambda)$ is replaced as follows:\n", "$$\n", "GCV(\\lambda) = \\frac{\\rho(\\lambda)}{\\left[1 - \\sum_{i=1}^N w_i(\\lambda) \\right]^2}.\n", "$$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from cherab.phix.inversion import GCV\n", "\n", "gcv = GCV(sigma, u, vh, b)\n", "gcv.lambdas = 10 ** np.linspace(-20, 2, 100)\n", "gcv.optimize(6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot error function and GCV\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results of the GCV optimization shows that GCV does not work for this inversion problem." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = gcv.plot_gcv()\n", "\n", "gcvs = [gcv.gcv(beta) for beta in gcv.lambdas]\n", "\n", "# anotate minimum GCV value point\n", "ax.text(gcv.lambdas[0], gcv.gcv(gcv.lambdas[0]) * 10, \"min(gcv($\\\\lambda$))\", color=\"r\");" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3.10.8 ('cherab-phix-dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 15:55:03) \n[GCC 10.4.0]" }, "vscode": { "interpreter": { "hash": "2725905a4c02db19e04df9b8fdbbe5ec65a73ea52bebaf9474aa1cc98819834c" } } }, "nbformat": 4, "nbformat_minor": 4 }