{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _lcurve:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# L-curve criterion\n",
    "\n",
    "By P.C. Hansen\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diffinition\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Practically, in order to solve inversion problems $Ax=b$ $(A\\in\\mathbb{R}^{m\\times n}, x\\in\\mathbb{R}^n, b\\in\\mathbb{R}^m)$, we need to consider the similar form of least-squeares equations as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "    x_\\lambda &= \\arg\\min\\{||Ax - b||_2^2 + \\lambda\\cdot||L(x - x_0)||_2^2\\}\\\\\n",
    "    & = (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} (A^\\mathsf{T}b + \\lambda L^\\mathsf{T}x_0).\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "$\\lambda\\in\\mathbb{R}$ is a real regularization parameter that must be chosen by the user. The vector $b$ is the given data and the vector $x_0\\in\\mathbb{R}^n$ is a priori estimate of $x$ which is set to zero when no a priori information is available.\n",
    "$||Ax - b||_2$ is the residual term and $||L(x - x_0)||_2$ is the regularization term, where $L\\in\\mathbb{R}^{n \\times n}$ is an operator matrix like a laplacian one.\n",
    "$\\lambda$ is the parameter which decides the contribution to these terms.\n",
    "And L-curve criterion is the method to balance between these contribution and optmize the inversion solution.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "L-curve is precisely following points curve:\n",
    "\n",
    "$$\n",
    "    (||Ax - b||_2, ||L(x - x_0)||_2).\n",
    "$$\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "L-curve criterion is based on the fact that the optimal reguralization paramter is acheived when the L-curve point:\n",
    "\n",
    "$$\n",
    "    (\\log||Ax_\\lambda - b||_2,\\; \\log||L(x_\\lambda - x_0)||_2)\n",
    "$$\n",
    "\n",
    "lies on this corner.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to obtain the strict $\\lambda$ on the L-curve \"corner\" which is defined by the mathmatical method, we can calculate the L-curve carvature $\\kappa$ given by following equation:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\kappa &= \\frac{f^{\\prime\\prime}(x)}{\\left[1 + f^{\\prime}(x)^2\\right]^{3/2}}=-2 \\eta\\rho\\frac{\\lambda^2 \\eta + \\lambda \\rho + \\rho\\eta/\\eta^\\prime}{(\\lambda^2 \\eta^2 + \\rho^2)^{3/2}},\\\\\n",
    "\\rho &\\equiv ||Ax_\\lambda - b||_2^2,\\\\\n",
    "\\eta &\\equiv ||L(x_\\lambda - x_0)||_2^2,\\\\\n",
    "\\eta^\\prime &\\equiv \\frac{d\\eta}{d\\lambda}\n",
    "\\end{align}\n",
    "$$\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the singular value decomposion (SVD) fomula, $x_\\lambda$ is described as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align*}\n",
    "    x_\\lambda\n",
    "    &= (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} A^\\mathsf{T}b\\\\\n",
    "    &= L^{-1}((AL^{-1})^{\\mathsf{T}}AL^{-1} + \\lambda I_n)^{-1}(AL^{-1})^\\mathsf{T}b\\\\\n",
    "    &= L^{-1}(V\\Sigma^{\\mathsf{T}} U^{\\mathsf{T}}U\\Sigma V^{\\mathsf{T}} + \\lambda I_n)^{-1} (U\\Sigma V^{\\mathsf{T}})^{\\mathsf{T}}b \\quad (\\because AL^{-1} = U\\Sigma V^{\\mathsf{T}}: \\text{using SVD})\\\\\n",
    "    &= L^{-1}V^{\\mathsf{-T}}(\\Sigma^{\\mathsf{T}}\\Sigma + \\lambda I_r)^{-1}V^{\\mathsf{-1}}V\\Sigma^{\\mathsf{T}}U^\\mathsf{T}b\\\\\n",
    "    &= L^{-1}V(\\Sigma^2 + \\lambda I_r)^{-1}\\Sigma U^\\mathsf{T}b \\quad (\\because V^{-\\mathsf{T}} = V, \\Sigma = \\Sigma^\\mathsf{T} \\in \\mathbb{R}^{r\\times r})\\\\\n",
    "    &= \\tilde{V}(I_r + \\lambda \\Sigma^{-2})^{-1} \\Sigma^{-1} U^\\mathsf{T}b \\quad (\\because \\tilde{V} \\equiv L^{-1}V)\\\\\n",
    "    &= \\tilde{V} W \\Sigma^{-1} U^\\mathsf{T} b\\\\\n",
    "    &= \\tilde{V}\n",
    "    \\begin{pmatrix}\n",
    "        w_1(\\lambda)\\frac{1}{\\sigma_1} & & \\\\\n",
    "        & \\ddots & \\\\\n",
    "        & & w_r(\\lambda)\\frac{1}{\\sigma_r}\n",
    "    \\end{pmatrix}\n",
    "    \\begin{pmatrix}\n",
    "        u_1^\\mathsf{T}b \\\\\n",
    "        \\vdots \\\\\n",
    "        u_r^\\mathsf{T}b\n",
    "    \\end{pmatrix}\\\\\n",
    "    &=\n",
    "    \\begin{pmatrix}\n",
    "        \\tilde{v}_1 & \\cdots & \\tilde{v}_r\n",
    "    \\end{pmatrix}\n",
    "    \\begin{pmatrix}\n",
    "        w_1(\\lambda)\\frac{u_1^{\\mathsf{T}}b}{\\sigma_1}\\\\\n",
    "        \\vdots\\\\\n",
    "        w_r(\\lambda)\\frac{u_r^{\\mathsf{T}}b}{\\sigma_r}\n",
    "    \\end{pmatrix}\\\\\n",
    "    &= \\sum_{i=1}^r w_i(\\lambda) \\frac{u_i^\\mathsf{T}b}{\\sigma_i}\\tilde{v}_i,\n",
    "\\end{align*}\n",
    "$$\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "where,\n",
    "\n",
    "$$\n",
    "\\begin{align*}\n",
    "    W &\\equiv \\text{diag}(w_1(\\lambda),..., w_r(\\lambda)),\\\\\n",
    "    w_i(\\lambda) &\\equiv \\frac{1}{1 + \\lambda/\\sigma_i^2},\\\\\n",
    "    U\\Sigma V^\\mathsf{T}\n",
    "    &\\equiv\n",
    "    \\begin{pmatrix}\n",
    "        u_1 & \\cdots & u_r\n",
    "    \\end{pmatrix}\n",
    "    \\cdot\n",
    "    \\text{diag}(\\sigma_1,..., \\sigma_r)\n",
    "    \\cdot\n",
    "    \\begin{pmatrix}\n",
    "        v_1 & \\cdots & v_r\n",
    "    \\end{pmatrix}^\\mathsf{T}\n",
    "    = AL^{-1},\\\\\n",
    "    \\tilde{V}\n",
    "    &=\n",
    "    \\begin{pmatrix}\n",
    "        \\tilde{v}_1 & \\cdots & \\tilde{v}_r\n",
    "    \\end{pmatrix}\n",
    "    \\equiv L^{-1}V.\n",
    "\\end{align*}\n",
    "$$\n",
    "\n",
    "Here, $r\\leq\\min(m, n)$ is the numerical rank of $A$, so each SVD matrix has demensions like $U\\in \\mathbb{R}^{m\\times r}$, $\\Sigma\\in\\mathbb{R}^{r\\times r}$, $V\\in\\mathbb{R}^{n\\times r}$, respectively. A priori estimate $x_0$ is assumed to be 0.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also $\\rho, \\eta,$ and $\\eta^\\prime$ is described by using SVD as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\rho &= \\sum_{i=1}^r (1 - w_i)^2(u_i^\\mathsf{T}b)^2,\\\\\n",
    "\\eta &= \\sum_{i=1}^r w_i^2 \\frac{(u_i^\\mathsf{T}b)^2}{\\sigma_i^2},\\\\\n",
    "\\eta^\\prime &= -\\frac{2}{\\lambda}\\sum_{i=1}^r (1 - w_i)w_i^2\\frac{(u_i^\\mathsf{T}b)^2}{\\sigma_i^2}.\n",
    "\\end{align}\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As can be seen from the above, $v_i$ or $\\tilde{v}_i$ are not used in the calculation of $\\rho, \\eta,$ and $\\eta^\\prime$. They affect only the inversion solution $x_\\lambda$. Therefore, $\\tilde{V}$ is often called the \"inverted solution basis\" or \"reconstruction basis\".\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example ill-posed linear operator equation and applying L-curve criterion into this problem.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a famouse ill-posed linear equation, Fredholm integral equation is often used:\n",
    "\n",
    "$$\n",
    "\\int_a^b K(s, t) x(t) dt = b(s), \\quad c\\leq s\\leq d.\n",
    "$$\n",
    "\n",
    "Here, we think the following situation as above equation form:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "K(s, t) &\\equiv (\\cos(s) + \\cos(t))\\left(\\frac{\\sin(u)}{u}\\right)^2,\\\\\n",
    "u &\\equiv \\pi\\left(\\sin(s) + \\sin(t) \\right),\\\\\n",
    "[a, b] &= [c, d] = \\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2} \\right].\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "And, the true solution $x_\\text{t}(t)$ is assumed as follows:\n",
    "\n",
    "$$\n",
    "x_\\text{t}(t) = 2.0 \\exp\\left[-6(t-0.8)^2 \\right] + \\exp\\left[-2(t+0.5)^2 \\right]\n",
    "$$\n",
    "\n",
    "let us define these function as follows:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "from cherab.phix.inversion import Lcurve\n",
    "\n",
    "plt.rcParams[\"figure.dpi\"] = 150\n",
    "\n",
    "\n",
    "def kernel(s: np.ndarray, t: np.ndarray):\n",
    "    \"\"\"Kernel of Fredholm integral equation of the first kind.\"\"\"\n",
    "    u = np.pi * (np.sin(s) + np.sin(t))\n",
    "    if u == 0:\n",
    "        return np.cos(s) + np.cos(t)\n",
    "    else:\n",
    "        return (np.cos(s) + np.cos(t)) * (np.sin(u) / u) ** 2\n",
    "\n",
    "\n",
    "# true solution\n",
    "def x_t_func(t):\n",
    "    return 2.0 * np.exp(-6.0 * (t - 0.8) ** 2) + np.exp(-2.0 * (t + 0.5) ** 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's descritize the above integral equation. $s$ and $t$ are devided to 100 points at even intervals.\n",
    "$x$ is a 1D vector data $(100, )$ and $A$ is a $100\\times 100$ matrix.\n",
    "$A$ is defined using karnel function. When discretizing the integral, the trapezoidal approximation is applied.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set valiables\n",
    "s = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n",
    "t = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n",
    "x_t = x_t_func(t)\n",
    "\n",
    "# Operater matrix: A\n",
    "A = np.zeros((s.size, t.size))\n",
    "A = np.array([[kernel(i, j) for j in t] for i in s])\n",
    "\n",
    "# trapezoidal rule\n",
    "A[:, 0] *= 0.5\n",
    "A[:, -1] *= 0.5\n",
    "A *= t[1] - t[0]\n",
    "\n",
    "# simpson rule  -- option\n",
    "# A[:, 1:-1:2] *= 4.0\n",
    "# A[:, 2:-2:2] *= 2.0\n",
    "# A *= (t[1] - t[0]) / 3.0\n",
    "\n",
    "print(f\"condition number of A is {np.linalg.cond(A)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then excute singular value decomposition of $A$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# SVD using the numpy module\n",
    "u, sigma, vh = np.linalg.svd(A, full_matrices=False)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The measured data $b$ contain both white noise $\\bar{b}$ and truly converted $b_0$ from $Ax_\\text{t}$.\n",
    "\n",
    "Descritized linear equation is as follows:\n",
    "\n",
    "$$\n",
    "Ax = b_0 + \\bar{b} = b\n",
    "$$\n",
    "\n",
    "The noise variance is $1.0 \\times 10^{-4}$.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "b_0 = A.dot(x_t)\n",
    "rng = np.random.default_rng()\n",
    "b_noise = rng.normal(0, 1.0e-4, b_0.size)\n",
    "b = b_0 + b_noise"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "In term of regularization, let us think tikhonov regularization, that is, :math:`L = I` and :math:`x_0 = 0`.\n",
    "And as an optimization method, let us use L-curve method as described above.\n",
    ":obj:`.Lcurve` is defined in :py:mod:`cherab.phix.inversion` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lcurve = Lcurve(sigma, u, vh.T, data=b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let us solve the inverse problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bounds = (-20.0, 2.0)  # bounds of log10 of regularization parameter\n",
    "sol, status = lcurve.solve(bounds=bounds, disp=False)\n",
    "print(status)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot measured data w/ or w/o noise\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The noise level is so mute that there is no clear difference between those. However this causes to arise the ill-posed problem due to the kernel function.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "plt.plot(s, b_0)\n",
    "plt.plot(s, b)\n",
    "plt.legend([\"w/o noise\", \"w/ noise\"])\n",
    "plt.xlabel(\"s\")\n",
    "plt.ylabel(\"b(s)\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot L-curve\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "nbsphinx-thumbnail"
    ]
   },
   "outputs": [],
   "source": [
    "lcurve.plot_L_curve(bounds=bounds, n_beta=500, scatter_plot=5, scatter_annotate=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compare true solution $x_\\text{t}$ with estimated $x_\\lambda$\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Varing regularization parameters makes the interestingly difference between the true solution with estimated one.\\\n",
    "Here, let us compare the solutions varied from $\\lambda=10^{-11}, \\lambda_\\text{opt}, 1.0$. ($\\lambda_\\text{opt}$ is the optimized regularization parameter.)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "lambdas = [1.0e-11, lcurve.lambda_opt, 1.0]\n",
    "\n",
    "fig, axes = plt.subplots(1, 3, figsize=(10, 3))\n",
    "fig.tight_layout(pad=-2.0)\n",
    "labels = [f\"$\\\\lambda =$ {i:.2e}\" for i in lambdas]\n",
    "i = 0\n",
    "for ax, beta, label in zip(axes, lambdas, labels):\n",
    "    ax.plot(t, x_t, \"--\", label=\"$x_{true}$\")\n",
    "    ax.plot(t, lcurve.inverted_solution(beta=beta), label=\"$x_\\\\lambda$\")\n",
    "\n",
    "    ax.set_xlim(t.min(), t.max())\n",
    "    ax.set_ylim(0, x_t.max() * 1.1)\n",
    "    ax.set_xlabel(\"$t$\")\n",
    "    ax.set_title(label)\n",
    "    ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n",
    "    if i < 1:\n",
    "        ax.legend()\n",
    "    else:\n",
    "        ax.set_yticklabels([])\n",
    "    i += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## plot L-curve curvature\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "_, ax = lcurve.plot_curvature(bounds=(-10, -6), n_beta=500)\n",
    "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", lcurve.lambda_opt));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot $\\|Ax_\\lambda - b\\|$, $\\|x_\\lambda\\|$, and curvature\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax1 = plt.subplots(dpi=150)\n",
    "fig.subplots_adjust(right=0.85)\n",
    "\n",
    "ax2 = ax1.twinx()\n",
    "ax3 = ax1.twinx()\n",
    "\n",
    "ax3.spines.right.set_position((\"axes\", 1.2))\n",
    "\n",
    "# calculation of the values\n",
    "lambdas = np.logspace(-15, 0, num=500)\n",
    "rhos = [lcurve.residual_norm(beta) for beta in lambdas]\n",
    "etas = [lcurve.regularization_norm(beta) for beta in lambdas]\n",
    "kappa = [lcurve.curvature(beta) for beta in lambdas]\n",
    "\n",
    "# plot lines\n",
    "(p1,) = ax1.loglog(lambdas, rhos, color=\"C0\")\n",
    "(p2,) = ax2.loglog(lambdas, etas, color=\"C1\")\n",
    "(p3,) = ax3.semilogx(lambdas, kappa, color=\"C2\")\n",
    "\n",
    "# set axes properties\n",
    "ax1.set(xlim=(lambdas[0], lambdas[-1]), xlabel=r\"$\\lambda$\", ylabel=r\"$||Ax_\\lambda - b||$\")\n",
    "ax2.set(ylabel=r\"$||x_\\lambda||$\")\n",
    "ax3.set(ylabel=\"curvature of L-curve\")\n",
    "\n",
    "ax1.yaxis.label.set_color(p1.get_color())\n",
    "ax2.yaxis.label.set_color(p2.get_color())\n",
    "ax3.yaxis.label.set_color(p3.get_color())\n",
    "\n",
    "ax1.tick_params(axis=\"y\", colors=p1.get_color())\n",
    "ax2.tick_params(axis=\"y\", colors=p2.get_color())\n",
    "ax3.tick_params(axis=\"y\", colors=p3.get_color())\n",
    "\n",
    "ax3.spines[\"left\"].set_color(p1.get_color())\n",
    "ax2.spines[\"right\"].set_color(p2.get_color())\n",
    "ax3.spines[\"right\"].set_color(p3.get_color())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## check the error of solutions\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The relative error is defined as follows:\n",
    "\n",
    "$$\n",
    "e(\\lambda) = \\frac{||x_\\text{t} - x_\\lambda||}{||x_\\text{t}||}.\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "def error(values, true):\n",
    "    return np.linalg.norm(true - values) / np.linalg.norm(true)\n",
    "\n",
    "\n",
    "# set regularization parameters\n",
    "lambdas = np.logspace(-10, -6, num=500)\n",
    "\n",
    "# calculate errors\n",
    "errors = np.asarray([error(lcurve.inverted_solution(beta), x_t) for beta in lambdas])\n",
    "lambda_min = lambdas[errors.argmin()]\n",
    "error_min = errors.min()\n",
    "error_opt = error(sol, x_t)\n",
    "\n",
    "# calculate curvatures\n",
    "kappa = np.asarray([lcurve.curvature(beta) for beta in lambdas])\n",
    "\n",
    "# create figure\n",
    "fig, ax1 = plt.subplots(dpi=150)\n",
    "ax2 = ax1.twinx()\n",
    "\n",
    "# plot errors and curvatures\n",
    "(p1,) = ax1.loglog(lambdas, errors, color=\"C0\")\n",
    "(p2,) = ax2.semilogx(lambdas, kappa, color=\"C1\")\n",
    "\n",
    "# plot minimum error vertical line and point\n",
    "ax1.vlines(lambda_min, 0.01, 1, color=\"r\", linestyle=\"--\", linewidth=0.75)\n",
    "ax1.scatter(lambda_min, error_min, color=\"r\", marker=\"o\", s=10, zorder=2)\n",
    "ax1.text(\n",
    "    lambda_min,\n",
    "    1.5e-2,\n",
    "    \"$\\\\lambda_{min}$\",\n",
    "    color=\"r\",\n",
    "    horizontalalignment=\"left\",\n",
    "    verticalalignment=\"center\",\n",
    ")\n",
    "\n",
    "# plot maximum curvature vertical line and point\n",
    "ax1.vlines(lcurve.lambda_opt, 0.01, 1, color=\"g\", linestyle=\"--\", linewidth=0.75)\n",
    "ax2.scatter(lcurve.lambda_opt, lcurve.curvature(lcurve.lambda_opt), color=\"g\", marker=\"o\", s=10, zorder=2)\n",
    "ax1.text(\n",
    "    lcurve.lambda_opt,\n",
    "    1.5e-2,\n",
    "    \"$\\\\lambda_{opt}$\",\n",
    "    color=\"g\",\n",
    "    horizontalalignment=\"left\",\n",
    "    verticalalignment=\"center\",\n",
    ")\n",
    "\n",
    "# set axes\n",
    "ax1.set(\n",
    "    xlim=(lambdas[0], lambdas[-1]), ylim=(0.01, 1), xlabel=r\"$\\lambda$\", ylabel=\"Relative errors\"\n",
    ")\n",
    "ax2.set(ylabel=\"curvature of L-curve\")\n",
    "\n",
    "ax1.yaxis.label.set_color(p1.get_color())\n",
    "ax2.yaxis.label.set_color(p2.get_color())\n",
    "\n",
    "ax1.tick_params(axis=\"y\", colors=p1.get_color())\n",
    "ax2.tick_params(axis=\"y\", colors=p2.get_color())\n",
    "\n",
    "ax2.spines[\"left\"].set_color(p1.get_color())\n",
    "ax2.spines[\"right\"].set_color(p2.get_color())\n",
    "\n",
    "ax1.set_title(\n",
    "    \"$e(\\\\lambda_{})$ = {:.2%}, $e(\\\\lambda_{}) = ${:.2%}\".format(\n",
    "        \"{opt}\", error_opt, \"{min}\", error_min\n",
    "    )\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Compare minimum error solution with Lcurve-optimized $x_\\lambda$ one\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "ax.plot(t, x_t, \"k--\", label=\"$x_{true}$\", linewidth=1.0)\n",
    "ax.plot(t, lcurve.inverted_solution(lambda_min), label=\"$\\\\lambda_{min}$\")\n",
    "ax.plot(t, sol, label=\"$\\\\lambda_{opt}$\")\n",
    "ax.set_xlabel(\"$t$\")\n",
    "ax.set_xlim(t.min(), t.max())\n",
    "ax.set_ylim(0, x_t.max() * 1.1)\n",
    "ax.legend()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "# GCV criterion\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Diffinition\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generalized Cross-Validation's idea is that the best modell for the measurements is the one that best predicts each measurement as a function of the others.\n",
    "The GCV estimate of $\\lambda$ is chosen as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\lambda_\\text{opt} &:= \\arg \\min_{\\lambda} GCV(\\lambda)\\\\\n",
    "GCV(\\lambda) &:= \\frac{\\rho(\\lambda)}{|1 - \\text{tr}{T(\\lambda)}|^2},\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "\\rho(\\lambda) &:= ||Ax_\\lambda - b||^2\\\\\n",
    "T(\\lambda)b &:= Ax_\\lambda.\n",
    "\\end{align}\n",
    "$$\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using SVD components, $GCV(\\lambda)$ is replaced as follows:\n",
    "\n",
    "$$\n",
    "GCV(\\lambda) = \\frac{\\rho(\\lambda)}{\\left[1 - \\sum_{i=1}^r w_i(\\lambda) \\right]^2}.\n",
    "$$\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cherab.phix.inversion import GCV\n",
    "\n",
    "gcv = GCV(sigma, u, vh.T, data=b)\n",
    "\n",
    "bounds = (-20.0, 2.0)  # bounds of log10 of regularization parameter\n",
    "sol, status = gcv.solve(bounds=bounds, disp=False)\n",
    "print(status)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot error function and GCV\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results of the GCV optimization shows that GCV does not work for this inversion problem.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "_, ax = gcv.plot_gcv(bounds=bounds, n_beta=500)\n",
    "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", gcv.lambda_opt));"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python 3.10.8 ('cherab-phix-dev')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.16"
  },
  "vscode": {
   "interpreter": {
    "hash": "2725905a4c02db19e04df9b8fdbbe5ec65a73ea52bebaf9474aa1cc98819834c"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}