{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _lcurve:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# L-curve criterion\n", "\n", "By P.C. Hansen\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Diffinition\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Practically, in order to solve inversion problems $Ax=b$ $(A\\in\\mathbb{R}^{m\\times n}, x\\in\\mathbb{R}^n, b\\in\\mathbb{R}^m)$, we need to consider the similar form of least-squeares equations as follows:\n", "\n", "$$\n", "\\begin{align}\n", " x_\\lambda &= \\arg\\min\\{||Ax - b||_2^2 + \\lambda\\cdot||L(x - x_0)||_2^2\\}\\\\\n", " & = (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} (A^\\mathsf{T}b + \\lambda L^\\mathsf{T}x_0).\n", "\\end{align}\n", "$$\n", "\n", "$\\lambda\\in\\mathbb{R}$ is a real regularization parameter that must be chosen by the user. The vector $b$ is the given data and the vector $x_0\\in\\mathbb{R}^n$ is a priori estimate of $x$ which is set to zero when no a priori information is available.\n", "$||Ax - b||_2$ is the residual term and $||L(x - x_0)||_2$ is the regularization term, where $L\\in\\mathbb{R}^{n \\times n}$ is an operator matrix like a laplacian one.\n", "$\\lambda$ is the parameter which decides the contribution to these terms.\n", "And L-curve criterion is the method to balance between these contribution and optmize the inversion solution.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "L-curve is precisely following points curve:\n", "\n", "$$\n", " (||Ax - b||_2, ||L(x - x_0)||_2).\n", "$$\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "L-curve criterion is based on the fact that the optimal reguralization paramter is acheived when the L-curve point:\n", "\n", "$$\n", " (\\log||Ax_\\lambda - b||_2,\\; \\log||L(x_\\lambda - x_0)||_2)\n", "$$\n", "\n", "lies on this corner.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "In order to obtain the strict $\\lambda$ on the L-curve \"corner\" which is defined by the mathmatical method, we can calculate the L-curve carvature $\\kappa$ given by following equation:\n", "\n", "$$\n", "\\begin{align}\n", "\\kappa &= \\frac{f^{\\prime\\prime}(x)}{\\left[1 + f^{\\prime}(x)^2\\right]^{3/2}}=-2 \\eta\\rho\\frac{\\lambda^2 \\eta + \\lambda \\rho + \\rho\\eta/\\eta^\\prime}{(\\lambda^2 \\eta^2 + \\rho^2)^{3/2}},\\\\\n", "\\rho &\\equiv ||Ax_\\lambda - b||_2^2,\\\\\n", "\\eta &\\equiv ||L(x_\\lambda - x_0)||_2^2,\\\\\n", "\\eta^\\prime &\\equiv \\frac{d\\eta}{d\\lambda}\n", "\\end{align}\n", "$$\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Using the singular value decomposion (SVD) fomula, $x_\\lambda$ is described as follows:\n", "\n", "$$\n", "\\begin{align*}\n", " x_\\lambda\n", " &= (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} A^\\mathsf{T}b\\\\\n", " &= L^{-1}((AL^{-1})^{\\mathsf{T}}AL^{-1} + \\lambda I_n)^{-1}(AL^{-1})^\\mathsf{T}b\\\\\n", " &= L^{-1}(V\\Sigma^{\\mathsf{T}} U^{\\mathsf{T}}U\\Sigma V^{\\mathsf{T}} + \\lambda I_n)^{-1} (U\\Sigma V^{\\mathsf{T}})^{\\mathsf{T}}b \\quad (\\because AL^{-1} = U\\Sigma V^{\\mathsf{T}}: \\text{using SVD})\\\\\n", " &= L^{-1}V^{\\mathsf{-T}}(\\Sigma^{\\mathsf{T}}\\Sigma + \\lambda I_r)^{-1}V^{\\mathsf{-1}}V\\Sigma^{\\mathsf{T}}U^\\mathsf{T}b\\\\\n", " &= L^{-1}V(\\Sigma^2 + \\lambda I_r)^{-1}\\Sigma U^\\mathsf{T}b \\quad (\\because V^{-\\mathsf{T}} = V, \\Sigma = \\Sigma^\\mathsf{T} \\in \\mathbb{R}^{r\\times r})\\\\\n", " &= \\tilde{V}(I_r + \\lambda \\Sigma^{-2})^{-1} \\Sigma^{-1} U^\\mathsf{T}b \\quad (\\because \\tilde{V} \\equiv L^{-1}V)\\\\\n", " &= \\tilde{V} W \\Sigma^{-1} U^\\mathsf{T} b\\\\\n", " &= \\tilde{V}\n", " \\begin{pmatrix}\n", " w_1(\\lambda)\\frac{1}{\\sigma_1} & & \\\\\n", " & \\ddots & \\\\\n", " & & w_r(\\lambda)\\frac{1}{\\sigma_r}\n", " \\end{pmatrix}\n", " \\begin{pmatrix}\n", " u_1^\\mathsf{T}b \\\\\n", " \\vdots \\\\\n", " u_r^\\mathsf{T}b\n", " \\end{pmatrix}\\\\\n", " &=\n", " \\begin{pmatrix}\n", " \\tilde{v}_1 & \\cdots & \\tilde{v}_r\n", " \\end{pmatrix}\n", " \\begin{pmatrix}\n", " w_1(\\lambda)\\frac{u_1^{\\mathsf{T}}b}{\\sigma_1}\\\\\n", " \\vdots\\\\\n", " w_r(\\lambda)\\frac{u_r^{\\mathsf{T}}b}{\\sigma_r}\n", " \\end{pmatrix}\\\\\n", " &= \\sum_{i=1}^r w_i(\\lambda) \\frac{u_i^\\mathsf{T}b}{\\sigma_i}\\tilde{v}_i,\n", "\\end{align*}\n", "$$\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "where,\n", "\n", "$$\n", "\\begin{align*}\n", " W &\\equiv \\text{diag}(w_1(\\lambda),..., w_r(\\lambda)),\\\\\n", " w_i(\\lambda) &\\equiv \\frac{1}{1 + \\lambda/\\sigma_i^2},\\\\\n", " U\\Sigma V^\\mathsf{T}\n", " &\\equiv\n", " \\begin{pmatrix}\n", " u_1 & \\cdots & u_r\n", " \\end{pmatrix}\n", " \\cdot\n", " \\text{diag}(\\sigma_1,..., \\sigma_r)\n", " \\cdot\n", " \\begin{pmatrix}\n", " v_1 & \\cdots & v_r\n", " \\end{pmatrix}^\\mathsf{T}\n", " = AL^{-1},\\\\\n", " \\tilde{V}\n", " &=\n", " \\begin{pmatrix}\n", " \\tilde{v}_1 & \\cdots & \\tilde{v}_r\n", " \\end{pmatrix}\n", " \\equiv L^{-1}V.\n", "\\end{align*}\n", "$$\n", "\n", "Here, $r\\leq\\min(m, n)$ is the numerical rank of $A$, so each SVD matrix has demensions like $U\\in \\mathbb{R}^{m\\times r}$, $\\Sigma\\in\\mathbb{R}^{r\\times r}$, $V\\in\\mathbb{R}^{n\\times r}$, respectively. A priori estimate $x_0$ is assumed to be 0.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Also $\\rho, \\eta,$ and $\\eta^\\prime$ is described by using SVD as follows:\n", "\n", "$$\n", "\\begin{align}\n", "\\rho &= \\sum_{i=1}^r (1 - w_i)^2(u_i^\\mathsf{T}b)^2,\\\\\n", "\\eta &= \\sum_{i=1}^r w_i^2 \\frac{(u_i^\\mathsf{T}b)^2}{\\sigma_i^2},\\\\\n", "\\eta^\\prime &= -\\frac{2}{\\lambda}\\sum_{i=1}^r (1 - w_i)w_i^2\\frac{(u_i^\\mathsf{T}b)^2}{\\sigma_i^2}.\n", "\\end{align}\n", "$$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As can be seen from the above, $v_i$ or $\\tilde{v}_i$ are not used in the calculation of $\\rho, \\eta,$ and $\\eta^\\prime$. They affect only the inversion solution $x_\\lambda$. Therefore, $\\tilde{V}$ is often called the \"inverted solution basis\" or \"reconstruction basis\".\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example ill-posed linear operator equation and applying L-curve criterion into this problem.\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "As a famouse ill-posed linear equation, Fredholm integral equation is often used:\n", "\n", "$$\n", "\\int_a^b K(s, t) x(t) dt = b(s), \\quad c\\leq s\\leq d.\n", "$$\n", "\n", "Here, we think the following situation as above equation form:\n", "\n", "$$\n", "\\begin{align}\n", "K(s, t) &\\equiv (\\cos(s) + \\cos(t))\\left(\\frac{\\sin(u)}{u}\\right)^2,\\\\\n", "u &\\equiv \\pi\\left(\\sin(s) + \\sin(t) \\right),\\\\\n", "[a, b] &= [c, d] = \\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2} \\right].\n", "\\end{align}\n", "$$\n", "\n", "And, the true solution $x_\\text{t}(t)$ is assumed as follows:\n", "\n", "$$\n", "x_\\text{t}(t) = 2.0 \\exp\\left[-6(t-0.8)^2 \\right] + \\exp\\left[-2(t+0.5)^2 \\right]\n", "$$\n", "\n", "let us define these function as follows:\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "import numpy as np\n", "from matplotlib import pyplot as plt\n", "\n", "from cherab.phix.inversion import Lcurve\n", "\n", "plt.rcParams[\"figure.dpi\"] = 150\n", "\n", "\n", "def kernel(s: np.ndarray, t: np.ndarray):\n", " \"\"\"Kernel of Fredholm integral equation of the first kind.\"\"\"\n", " u = np.pi * (np.sin(s) + np.sin(t))\n", " if u == 0:\n", " return np.cos(s) + np.cos(t)\n", " else:\n", " return (np.cos(s) + np.cos(t)) * (np.sin(u) / u) ** 2\n", "\n", "\n", "# true solution\n", "def x_t_func(t):\n", " return 2.0 * np.exp(-6.0 * (t - 0.8) ** 2) + np.exp(-2.0 * (t + 0.5) ** 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's descritize the above integral equation. $s$ and $t$ are devided to 100 points at even intervals.\n", "$x$ is a 1D vector data $(100, )$ and $A$ is a $100\\times 100$ matrix.\n", "$A$ is defined using karnel function. When discretizing the integral, the trapezoidal approximation is applied.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set valiables\n", "s = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n", "t = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n", "x_t = x_t_func(t)\n", "\n", "# Operater matrix: A\n", "A = np.zeros((s.size, t.size))\n", "A = np.array([[kernel(i, j) for j in t] for i in s])\n", "\n", "# trapezoidal rule\n", "A[:, 0] *= 0.5\n", "A[:, -1] *= 0.5\n", "A *= t[1] - t[0]\n", "\n", "# simpson rule -- option\n", "# A[:, 1:-1:2] *= 4.0\n", "# A[:, 2:-2:2] *= 2.0\n", "# A *= (t[1] - t[0]) / 3.0\n", "\n", "print(f\"condition number of A is {np.linalg.cond(A)}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then excute singular value decomposition of $A$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# SVD using the numpy module\n", "u, sigma, vh = np.linalg.svd(A, full_matrices=False)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The measured data $b$ contain both white noise $\\bar{b}$ and truly converted $b_0$ from $Ax_\\text{t}$.\n", "\n", "Descritized linear equation is as follows:\n", "\n", "$$\n", "Ax = b_0 + \\bar{b} = b\n", "$$\n", "\n", "The noise variance is $1.0 \\times 10^{-4}$.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b_0 = A.dot(x_t)\n", "rng = np.random.default_rng()\n", "b_noise = rng.normal(0, 1.0e-4, b_0.size)\n", "b = b_0 + b_noise" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "In term of regularization, let us think tikhonov regularization, that is, :math:`L = I` and :math:`x_0 = 0`.\n", "And as an optimization method, let us use L-curve method as described above.\n", ":obj:`.Lcurve` is defined in :py:mod:`cherab.phix.inversion` module." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "lcurve = Lcurve(sigma, u, vh.T, data=b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us solve the inverse problem." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bounds = (-20.0, 2.0) # bounds of log10 of regularization parameter\n", "sol, status = lcurve.solve(bounds=bounds, disp=False)\n", "print(status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot measured data w/ or w/o noise\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The noise level is so mute that there is no clear difference between those. However this causes to arise the ill-posed problem due to the kernel function.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "plt.plot(s, b_0)\n", "plt.plot(s, b)\n", "plt.legend([\"w/o noise\", \"w/ noise\"])\n", "plt.xlabel(\"s\")\n", "plt.ylabel(\"b(s)\");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot L-curve\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [], "source": [ "lcurve.plot_L_curve(bounds=bounds, n_beta=500, scatter_plot=5, scatter_annotate=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare true solution $x_\\text{t}$ with estimated $x_\\lambda$\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Varing regularization parameters makes the interestingly difference between the true solution with estimated one.\\\n", "Here, let us compare the solutions varied from $\\lambda=10^{-11}, \\lambda_\\text{opt}, 1.0$. ($\\lambda_\\text{opt}$ is the optimized regularization parameter.)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "lambdas = [1.0e-11, lcurve.lambda_opt, 1.0]\n", "\n", "fig, axes = plt.subplots(1, 3, figsize=(10, 3))\n", "fig.tight_layout(pad=-2.0)\n", "labels = [f\"$\\\\lambda =$ {i:.2e}\" for i in lambdas]\n", "i = 0\n", "for ax, beta, label in zip(axes, lambdas, labels):\n", " ax.plot(t, x_t, \"--\", label=\"$x_{true}$\")\n", " ax.plot(t, lcurve.inverted_solution(beta=beta), label=\"$x_\\\\lambda$\")\n", "\n", " ax.set_xlim(t.min(), t.max())\n", " ax.set_ylim(0, x_t.max() * 1.1)\n", " ax.set_xlabel(\"$t$\")\n", " ax.set_title(label)\n", " ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n", " if i < 1:\n", " ax.legend()\n", " else:\n", " ax.set_yticklabels([])\n", " i += 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## plot L-curve curvature\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_, ax = lcurve.plot_curvature(bounds=(-10, -6), n_beta=500)\n", "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", lcurve.lambda_opt));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot $\\|Ax_\\lambda - b\\|$, $\\|x_\\lambda\\|$, and curvature\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax1 = plt.subplots(dpi=150)\n", "fig.subplots_adjust(right=0.85)\n", "\n", "ax2 = ax1.twinx()\n", "ax3 = ax1.twinx()\n", "\n", "ax3.spines.right.set_position((\"axes\", 1.2))\n", "\n", "# calculation of the values\n", "lambdas = np.logspace(-15, 0, num=500)\n", "rhos = [lcurve.residual_norm(beta) for beta in lambdas]\n", "etas = [lcurve.regularization_norm(beta) for beta in lambdas]\n", "kappa = [lcurve.curvature(beta) for beta in lambdas]\n", "\n", "# plot lines\n", "(p1,) = ax1.loglog(lambdas, rhos, color=\"C0\")\n", "(p2,) = ax2.loglog(lambdas, etas, color=\"C1\")\n", "(p3,) = ax3.semilogx(lambdas, kappa, color=\"C2\")\n", "\n", "# set axes properties\n", "ax1.set(xlim=(lambdas[0], lambdas[-1]), xlabel=r\"$\\lambda$\", ylabel=r\"$||Ax_\\lambda - b||$\")\n", "ax2.set(ylabel=r\"$||x_\\lambda||$\")\n", "ax3.set(ylabel=\"curvature of L-curve\")\n", "\n", "ax1.yaxis.label.set_color(p1.get_color())\n", "ax2.yaxis.label.set_color(p2.get_color())\n", "ax3.yaxis.label.set_color(p3.get_color())\n", "\n", "ax1.tick_params(axis=\"y\", colors=p1.get_color())\n", "ax2.tick_params(axis=\"y\", colors=p2.get_color())\n", "ax3.tick_params(axis=\"y\", colors=p3.get_color())\n", "\n", "ax3.spines[\"left\"].set_color(p1.get_color())\n", "ax2.spines[\"right\"].set_color(p2.get_color())\n", "ax3.spines[\"right\"].set_color(p3.get_color())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## check the error of solutions\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The relative error is defined as follows:\n", "\n", "$$\n", "e(\\lambda) = \\frac{||x_\\text{t} - x_\\lambda||}{||x_\\text{t}||}.\n", "$$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "source_hidden": true } }, "outputs": [], "source": [ "def error(values, true):\n", " return np.linalg.norm(true - values) / np.linalg.norm(true)\n", "\n", "\n", "# set regularization parameters\n", "lambdas = np.logspace(-10, -6, num=500)\n", "\n", "# calculate errors\n", "errors = np.asarray([error(lcurve.inverted_solution(beta), x_t) for beta in lambdas])\n", "lambda_min = lambdas[errors.argmin()]\n", "error_min = errors.min()\n", "error_opt = error(sol, x_t)\n", "\n", "# calculate curvatures\n", "kappa = np.asarray([lcurve.curvature(beta) for beta in lambdas])\n", "\n", "# create figure\n", "fig, ax1 = plt.subplots(dpi=150)\n", "ax2 = ax1.twinx()\n", "\n", "# plot errors and curvatures\n", "(p1,) = ax1.loglog(lambdas, errors, color=\"C0\")\n", "(p2,) = ax2.semilogx(lambdas, kappa, color=\"C1\")\n", "\n", "# plot minimum error vertical line and point\n", "ax1.vlines(lambda_min, 0.01, 1, color=\"r\", linestyle=\"--\", linewidth=0.75)\n", "ax1.scatter(lambda_min, error_min, color=\"r\", marker=\"o\", s=10, zorder=2)\n", "ax1.text(\n", " lambda_min,\n", " 1.5e-2,\n", " \"$\\\\lambda_{min}$\",\n", " color=\"r\",\n", " horizontalalignment=\"left\",\n", " verticalalignment=\"center\",\n", ")\n", "\n", "# plot maximum curvature vertical line and point\n", "ax1.vlines(lcurve.lambda_opt, 0.01, 1, color=\"g\", linestyle=\"--\", linewidth=0.75)\n", "ax2.scatter(lcurve.lambda_opt, lcurve.curvature(lcurve.lambda_opt), color=\"g\", marker=\"o\", s=10, zorder=2)\n", "ax1.text(\n", " lcurve.lambda_opt,\n", " 1.5e-2,\n", " \"$\\\\lambda_{opt}$\",\n", " color=\"g\",\n", " horizontalalignment=\"left\",\n", " verticalalignment=\"center\",\n", ")\n", "\n", "# set axes\n", "ax1.set(\n", " xlim=(lambdas[0], lambdas[-1]), ylim=(0.01, 1), xlabel=r\"$\\lambda$\", ylabel=\"Relative errors\"\n", ")\n", "ax2.set(ylabel=\"curvature of L-curve\")\n", "\n", "ax1.yaxis.label.set_color(p1.get_color())\n", "ax2.yaxis.label.set_color(p2.get_color())\n", "\n", "ax1.tick_params(axis=\"y\", colors=p1.get_color())\n", "ax2.tick_params(axis=\"y\", colors=p2.get_color())\n", "\n", "ax2.spines[\"left\"].set_color(p1.get_color())\n", "ax2.spines[\"right\"].set_color(p2.get_color())\n", "\n", "ax1.set_title(\n", " \"$e(\\\\lambda_{})$ = {:.2%}, $e(\\\\lambda_{}) = ${:.2%}\".format(\n", " \"{opt}\", error_opt, \"{min}\", error_min\n", " )\n", ");" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Compare minimum error solution with Lcurve-optimized $x_\\lambda$ one\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig, ax = plt.subplots()\n", "ax.plot(t, x_t, \"k--\", label=\"$x_{true}$\", linewidth=1.0)\n", "ax.plot(t, lcurve.inverted_solution(lambda_min), label=\"$\\\\lambda_{min}$\")\n", "ax.plot(t, sol, label=\"$\\\\lambda_{opt}$\")\n", "ax.set_xlabel(\"$t$\")\n", "ax.set_xlim(t.min(), t.max())\n", "ax.set_ylim(0, x_t.max() * 1.1)\n", "ax.legend()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "\n", "# GCV criterion\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Diffinition\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Generalized Cross-Validation's idea is that the best modell for the measurements is the one that best predicts each measurement as a function of the others.\n", "The GCV estimate of $\\lambda$ is chosen as follows:\n", "\n", "$$\n", "\\begin{align}\n", "\\lambda_\\text{opt} &:= \\arg \\min_{\\lambda} GCV(\\lambda)\\\\\n", "GCV(\\lambda) &:= \\frac{\\rho(\\lambda)}{|1 - \\text{tr}{T(\\lambda)}|^2},\n", "\\end{align}\n", "$$\n", "\n", "where\n", "\n", "$$\n", "\\begin{align}\n", "\\rho(\\lambda) &:= ||Ax_\\lambda - b||^2\\\\\n", "T(\\lambda)b &:= Ax_\\lambda.\n", "\\end{align}\n", "$$\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Using SVD components, $GCV(\\lambda)$ is replaced as follows:\n", "\n", "$$\n", "GCV(\\lambda) = \\frac{\\rho(\\lambda)}{\\left[1 - \\sum_{i=1}^r w_i(\\lambda) \\right]^2}.\n", "$$\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from cherab.phix.inversion import GCV\n", "\n", "gcv = GCV(sigma, u, vh.T, data=b)\n", "\n", "bounds = (-20.0, 2.0) # bounds of log10 of regularization parameter\n", "sol, status = gcv.solve(bounds=bounds, disp=False)\n", "print(status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Plot error function and GCV\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The results of the GCV optimization shows that GCV does not work for this inversion problem.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_, ax = gcv.plot_gcv(bounds=bounds, n_beta=500)\n", "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", gcv.lambda_opt));" ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3.10.8 ('cherab-phix-dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" }, "vscode": { "interpreter": { "hash": "2725905a4c02db19e04df9b8fdbbe5ec65a73ea52bebaf9474aa1cc98819834c" } } }, "nbformat": 4, "nbformat_minor": 4 }