{
 "cells": [
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    ".. _lcurve:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "L-curve criterion\n",
    "===\n",
    "By P.C. Hansen"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Diffinition\n",
    "---"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Practically, in order to solve inversion problems $Ax=b$, we need to consider the similar form of least-squeares equations as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "    x_\\lambda &= \\arg\\min\\{||Ax - b||_2^2 + \\lambda\\cdot||L(x - x_0)||_2^2\\}\\\\\n",
    "    & = (A^\\mathsf{T}A + \\lambda L^\\mathsf{T}L)^{-1} (A^\\mathsf{T}b + \\lambda L^\\mathsf{T}x_0).\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "$\\lambda$ is a real regularization parameter that must be chosen by the user. The vector $b$ is the given data and the vector $x_0$ is a priori estimate of $x$ which is set to zero when no a priori information is available.\n",
    "$||Ax - b||_2$ is the residual term and $||L(x - x_0)||_2$ is the regularization term.\n",
    "$\\lambda$ is the parameter which decides the contribution to these terms.\n",
    "And L-curve criterion is the method to balance between these contribution and optmize the inversion solution."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "L-curve is precisely following points curve:\n",
    "$$\n",
    "    (||Ax - b||_2, ||L(x - x_0)||_2)\n",
    "$$"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "L-curve criterion is based on the fact that the optimal reguralization paramter is acheived when the L-curve point:\n",
    "\n",
    "$$\n",
    "    (\\log||Ax_\\lambda - b||_2,\\; \\log||L(x_\\lambda - x_0)||_2)\n",
    "$$\n",
    "\n",
    "lies on this corner."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In order to obtain the strict $\\lambda$ on the L-curve \"corner\" which is defined by the mathmatical method, we can calculate the L-curve carvature $\\kappa$ given by following equation:\n",
    "$$\n",
    "\\begin{align}\n",
    "\\kappa &= \\frac{f^{\\prime\\prime}(x)}{\\left[1 + f^{\\prime}(x)^2\\right]^{3/2}}=-2 \\eta\\rho\\frac{\\lambda^2 \\eta + \\lambda \\rho + \\rho\\eta/\\eta^\\prime}{(\\lambda^2 \\eta^2 + \\rho^2)^{3/2}},\\\\\n",
    "\\rho &\\equiv ||Ax_\\lambda - b||_2^2,\\\\\n",
    "\\eta &\\equiv ||L(x_\\lambda - x_0)||_2^2,\\\\\n",
    "\\eta^\\prime &\\equiv \\frac{d\\eta}{d\\lambda}\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the singular value decomposion (SVD) fomula, $x_\\lambda$ is described as follows:\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "    x_\\lambda &= L^{-1}V(\\Sigma^\\mathsf{T}\\Sigma + \\lambda I)^{-1}\\Sigma^\\mathsf{T}\\ U^\\mathsf{T}b\\\\\n",
    "    &= L^{-1}V\n",
    "    \\begin{pmatrix}\n",
    "        w_1(\\lambda) / \\sigma_1 \\\\\n",
    "        w_2(\\lambda) / \\sigma_2 \\\\\n",
    "        \\vdots\n",
    "    \\end{pmatrix}\n",
    "    \\circ U^\\mathsf{T}b\\\\\n",
    "    &= \\sum_{i=1}^K w_i(\\lambda) \\frac{(u_i, b)}{\\sigma_i}L^{-1}v_i,\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "where,\n",
    "\n",
    "$$\n",
    "\\begin{align}\n",
    "w_i(\\lambda) &\\equiv \\frac{1}{1 + \\lambda/\\sigma_i^2},\\\\\n",
    "U\\Sigma V^\\mathsf{T} &\\equiv (u_1, u_2, ...) \\cdot \\text{diag}(\\sigma_1, \\sigma_2,...) \\cdot (v_1, v_2, ...)^\\mathsf{T} = AL^{-1}.\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "Here, $K\\equiv\\min(M, N)$, which means the size of SVD matrices: $U=\\text{Mat}(M, M)$, $\\Sigma=\\text{Mat}(M, N)$, $V=\\text{Mat}(N, N)$, and a priori estimate $x_0$ is assumed to be 0. Additionally, the symbol $\\circ$\n",
    "means the Hadamard product."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also $\\rho, \\eta,$ and $\\eta^\\prime$ is described by using SVD as follows:\n",
    "$$\n",
    "\\begin{align}\n",
    "\\rho &= \\sum_{i=1}^K (1 - w_i)^2(u_i, b)^2,\\\\\n",
    "\\eta &= \\sum_{i=1}^K w_i^2 \\frac{(u_i, b)^2}{\\sigma_i^2},\\\\\n",
    "\\eta^\\prime &= -\\frac{2}{\\lambda}\\sum_{i=1}^K (1 - w_i)w_i^2\\frac{(u_i, b)^2}{\\sigma_i^2}.\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Example ill-posed linear operator equation and applying L-curve criterion into this problem.\n",
    "---"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a famouse ill-posed linear equation, Fredholm integral equation is often used:\n",
    "$$\n",
    "\\int_a^b K(s, t) x(t) dt = b(s), \\quad c\\leq s\\leq d.\n",
    "$$\n",
    "Here, we think the following situation as above equation form:\n",
    "$$\n",
    "\\begin{align}\n",
    "K(s, t) &\\equiv (\\cos(s) + \\cos(t))\\left(\\frac{\\sin(u)}{u}\\right)^2,\\\\\n",
    "u &\\equiv \\pi\\left(\\sin(s) + \\sin(t) \\right),\\\\\n",
    "[a, b] &= [c, d] = \\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2} \\right].\n",
    "\\end{align}\n",
    "$$\n",
    "And, the true solution $x_\\text{t}(t)$ is assumed as follows:\n",
    "\n",
    "$$\n",
    "x_\\text{t}(t) = 2.0 \\exp\\left[-6(t-0.8)^2 \\right] + \\exp\\left[-2(t+0.5)^2 \\right]\n",
    "$$\n",
    "\n",
    "let us define these function as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "from cherab.phix.inversion import Lcurve\n",
    "\n",
    "plt.rcParams[\"figure.dpi\"] = 150\n",
    "\n",
    "\n",
    "def kernel(s: np.ndarray, t: np.ndarray):\n",
    "    \"\"\"Kernel of Fredholm integral equation of the first kind.\"\"\"\n",
    "    u = np.pi * (np.sin(s) + np.sin(t))\n",
    "    if u == 0:\n",
    "        return np.cos(s) + np.cos(t)\n",
    "    else:\n",
    "        return (np.cos(s) + np.cos(t)) * (np.sin(u) / u) ** 2\n",
    "\n",
    "\n",
    "# true solution\n",
    "def x_t_func(t):\n",
    "    return 2.0 * np.exp(-6.0 * (t - 0.8) ** 2) + np.exp(-2.0 * (t + 0.5) ** 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's descritize the above integral equation. $s$ and $t$ are devided to 100 points at even intervals.\n",
    "$x$ is a 1D vector data $(100, )$ and $A$ is a $100\\times 100$ matrix.\n",
    "$A$ is defined using karnel function. When discretizing the integral, the trapezoidal approximation is applied."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set valiables\n",
    "s = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n",
    "t = np.linspace(-np.pi * 0.5, np.pi * 0.5, num=100)\n",
    "x_t = x_t_func(t)\n",
    "\n",
    "# Operater matrix: A\n",
    "A = np.zeros((s.size, t.size))\n",
    "A = np.array([[kernel(i, j) for j in t] for i in s])\n",
    "\n",
    "# trapezoidal rule\n",
    "A[:, 0] *= 0.5\n",
    "A[:, -1] *= 0.5\n",
    "A *= t[1] - t[0]\n",
    "\n",
    "# simpson rule  -- option\n",
    "# A[:, 1:-1:2] *= 4.0\n",
    "# A[:, 2:-2:2] *= 2.0\n",
    "# A *= (t[1] - t[0]) / 3.0\n",
    "\n",
    "print(f\"condition number of A is {np.linalg.cond(A)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then excute singular value decomposition of $A$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# SVD using the numpy module\n",
    "u, sigma, vh = np.linalg.svd(A, full_matrices=False)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "The measured data $b$ is added white noise $\\bar{b}$, and truly converted $b_0$ is computed from $Ax_\\text{t}$.\n",
    "\n",
    "Descritized linear equation is as follows:\n",
    "$$\n",
    "Ax = b_0 + \\bar{b} = b\n",
    "$$\n",
    "\n",
    "The noise variance is $1.0 \\times 10^{-4}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "b_0 = A.dot(x_t)\n",
    "rng = np.random.default_rng()\n",
    "b_noise = rng.normal(0, 1.0e-4, b_0.size)\n",
    "b = b_0 + b_noise"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {
    "raw_mimetype": "text/restructuredtext"
   },
   "source": [
    "In term of regularization, let us think tikhonov regularization, that is, :math:`L = I` and :math:`x_0 = 0`.\n",
    "And as an optimization method of regulariation parameters, use L-curve method as described above.\n",
    ":obj:`.Lcurve` is defined in :py:mod:`cherab.phix.inversion` module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lcurv = Lcurve(sigma, u, vh, b)\n",
    "lcurv.lambdas = 10 ** np.linspace(-20, 2, 100)\n",
    "lcurv.optimize(10)  # iteration times to find the maximum carvature"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot measured data w or w/o noise\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The noise level is so mute that there is no clear difference between those. However this causes to arise the ill-posed problem due to the kernel function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "plt.plot(s, b_0)\n",
    "plt.plot(s, b)\n",
    "plt.legend([\"w/o noise\", \"w noise\"])\n",
    "plt.xlabel(\"s\")\n",
    "plt.ylabel(\"b(s)\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot L-curve\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": [
     "nbsphinx-thumbnail"
    ]
   },
   "outputs": [],
   "source": [
    "lcurv.plot_L_curve(scatter_plot=5, scatter_annotate=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare true solution $x_\\text{t}$ with estimated $x_\\lambda$\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Varing regularization parameters makes the interestingly difference between the true solution with estimated one.\\\n",
    "Here, let us compare the solutions varied from $\\lambda=10^{-11}, \\lambda_\\text{opt}, 1.0$. ($\\lambda_\\text{opt}$ is the optimized regularization parameter.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "lambdas = [1.5e-10, lcurv.lambda_opt, 1.0]\n",
    "\n",
    "fig, axes = plt.subplots(1, 3, figsize=(10, 3))\n",
    "fig.tight_layout(pad=-2.0)\n",
    "labels = [f\"$\\\\lambda =$ {i:.2e}\" for i in lambdas]\n",
    "i = 0\n",
    "for ax, beta, label in zip(axes, lambdas, labels):\n",
    "    ax.plot(t, x_t, \"--\", label=\"$x_{true}$\")\n",
    "    ax.plot(t, lcurv.inverted_solution(beta=beta), label=\"$x_\\\\lambda$\")\n",
    "\n",
    "    ax.set_xlim(t.min(), t.max())\n",
    "    ax.set_ylim(0, x_t.max() * 1.1)\n",
    "    ax.set_xlabel(\"$t$\")\n",
    "    ax.set_title(label)\n",
    "    ax.tick_params(direction=\"in\", labelsize=10, which=\"both\", top=True, right=True)\n",
    "    if i < 1:\n",
    "        ax.legend()\n",
    "    else:\n",
    "        ax.set_yticklabels([])\n",
    "    i += 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "plot L-curve curvature\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "_, ax = lcurv.plot_curvature()\n",
    "ax.set_title(\"$\\\\lambda_{} = ${:.2e}\".format(\"{opt}\", lcurv.lambda_opt));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "check the error of solutions\n",
    "----"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The relative error is defined as follows:\n",
    "$$\n",
    "e(\\lambda) = \\frac{||x_\\text{t} - x_\\lambda||}{||x_\\text{t}||}.\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "jupyter": {
     "source_hidden": true
    }
   },
   "outputs": [],
   "source": [
    "def error(values, true):\n",
    "    return np.linalg.norm(true - values) / np.linalg.norm(true)\n",
    "\n",
    "\n",
    "errors = np.asarray([error(lcurv.inverted_solution(beta), x_t) for beta in lcurv.lambdas])\n",
    "lambda_min = lcurv.lambdas[errors.argmin()]\n",
    "error_min = errors.min()\n",
    "error_opt = errors[np.where(lcurv.lambdas == lcurv.lambda_opt)[0][0]]\n",
    "\n",
    "# xlim, ylim\n",
    "xlim = 1.0e-13, 1.0e2\n",
    "ylim = 1.0e-02, 2.0e2\n",
    "\n",
    "fig, ax = plt.subplots()\n",
    "\n",
    "# relative errors\n",
    "ax.loglog(lcurv.lambdas, errors, color=\"C0\")\n",
    "ax.text(lcurv.lambdas[35], errors[33], \"$e(\\\\lambda)$\", color=\"C0\")\n",
    "\n",
    "# residual norms\n",
    "residuals = [lcurv.residual_norm(beta) for beta in lcurv.lambdas]\n",
    "ax.loglog(lcurv.lambdas, residuals, color=\"C1\")\n",
    "ax.text(\n",
    "    lcurv.lambdas[-5], residuals[-1], \"$||Ax_\\\\lambda-b||$\", color=\"C1\", horizontalalignment=\"right\"\n",
    ")\n",
    "\n",
    "# regularization norms\n",
    "reguls = [lcurv.regularization_norm(beta) for beta in lcurv.lambdas]\n",
    "ax.loglog(lcurv.lambdas, reguls, color=\"C2\")\n",
    "ax.text(lcurv.lambdas[35], reguls[33], \"$||Lx_\\\\lambda||$\", color=\"C2\")\n",
    "\n",
    "# scattered point\n",
    "min_sol = ax.scatter(lambda_min, error_min, marker=\"x\", color=\"r\")\n",
    "ax.text(\n",
    "    lambda_min,\n",
    "    error_min * 0.8,\n",
    "    \"min($e(\\\\lambda)$)\",\n",
    "    color=\"r\",\n",
    "    horizontalalignment=\"left\",\n",
    "    verticalalignment=\"top\",\n",
    ")\n",
    "opt = ax.scatter(lcurv.lambda_opt, error_opt, marker=\"x\", color=\"g\")\n",
    "ax.text(\n",
    "    lcurv.lambda_opt,\n",
    "    error_opt * 0.8,\n",
    "    \"$e(\\\\lambda_{opt})$\",\n",
    "    color=\"g\",\n",
    "    horizontalalignment=\"right\",\n",
    "    verticalalignment=\"top\",\n",
    ")\n",
    "\n",
    "ax.set_xlim(*xlim)\n",
    "ax.set_ylim(*ylim)\n",
    "ax.set_xlabel(\"$\\\\lambda$\")\n",
    "ax.set_title(\n",
    "    \"min($e(\\\\lambda)$) = {:.2%}, $e(\\\\lambda_{}) = ${:.2%}\".format(error_min, \"{opt}\", error_opt)\n",
    ");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare minimum error solution with Lcurve-optimized $x_\\lambda$ one\n",
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = plt.subplots()\n",
    "ax.plot(t, x_t, \"k--\", label=\"$x_{true}$\", linewidth=1.0)\n",
    "ax.plot(t, lcurv.inverted_solution(beta=lambda_min), label=\"min(e($\\\\lambda$))\")\n",
    "ax.plot(t, lcurv.inverted_solution(beta=lcurv.lambda_opt), label=\"$\\\\lambda_{opt}$\")\n",
    "ax.set_xlabel(\"$t$\")\n",
    "ax.set_xlim(t.min(), t.max())\n",
    "ax.set_ylim(0, x_t.max() * 1.1)\n",
    "ax.legend();"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "GCV criterion\n",
    "==="
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Diffinition\n",
    "---"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Generalized Cross-Validation's idea is that the best modell for the measurements is the one that best predicts each measurement as a function of the others.\n",
    "The GCV estimate of $\\lambda$ is chosen as follows:\n",
    "$$\n",
    "\\begin{align}\n",
    "\\lambda_\\text{opt} &:= \\arg \\min_{\\lambda} GCV(\\lambda)\\\\\n",
    "GCV(\\lambda) &:= \\frac{\\rho(\\lambda)}{|1 - \\text{tr}{T(\\lambda)}|^2},\n",
    "\\end{align}\n",
    "$$\n",
    "\n",
    "where\n",
    "$$\n",
    "\\begin{align}\n",
    "\\rho(\\lambda) &:= ||Ax_\\lambda - b||^2\\\\\n",
    "T(\\lambda)b &:= Ax_\\lambda.\n",
    "\\end{align}\n",
    "$$"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using SVD components, $GCV(\\lambda)$ is replaced as follows:\n",
    "$$\n",
    "GCV(\\lambda) = \\frac{\\rho(\\lambda)}{\\left[1 - \\sum_{i=1}^N w_i(\\lambda) \\right]^2}.\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from cherab.phix.inversion import GCV\n",
    "\n",
    "gcv = GCV(sigma, u, vh, b)\n",
    "gcv.lambdas = 10 ** np.linspace(-20, 2, 100)\n",
    "gcv.optimize(6)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Plot error function and GCV\n",
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The results of the GCV optimization shows that GCV does not work for this inversion problem."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, ax = gcv.plot_gcv()\n",
    "\n",
    "gcvs = [gcv.gcv(beta) for beta in gcv.lambdas]\n",
    "\n",
    "# anotate minimum GCV value point\n",
    "ax.text(gcv.lambdas[0], gcv.gcv(gcv.lambdas[0]) * 10, \"min(gcv($\\\\lambda$))\", color=\"r\");"
   ]
  }
 ],
 "metadata": {
  "celltoolbar": "Raw Cell Format",
  "kernelspec": {
   "display_name": "Python 3.10.8 ('cherab-phix-dev')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 15:55:03) \n[GCC 10.4.0]"
  },
  "vscode": {
   "interpreter": {
    "hash": "2725905a4c02db19e04df9b8fdbbe5ec65a73ea52bebaf9474aa1cc98819834c"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}