This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

### About this document {-}

This document was created using Weave.jl. The code is available in
[the course github
repository](https://bitbucket.org/paulschrimpf/econ526). The same
document generates both static webpages and associated jupyter
notebooks. It is meant to accompany [the lecture notes for 526](http://faculty.arts.ubc.ca/pschrimpf/526/526.html).

$$
\def\indep{\perp\!\!\!\perp}
\def\Er{\mathrm{E}}
\def\R{\mathbb{R}}
\def\En{{\mathbb{E}_n}}
\def\Pr{\mathrm{P}}
\newcommand{\norm}[1]{\left\Vert {#1} \right\Vert}
\newcommand{\abs}[1]{\left\vert {#1} \right\vert}
\DeclareMathOperator*{\argmax}{arg\,max}
\DeclareMathOperator*{\argmin}{arg\,min}
\def\inprob{\,{\buildrel p \over \rightarrow}\,}
\def\indist{\,{\buildrel d \over \rightarrow}\,}
$$

# Introduction

The goal of this notebook is to give you some familiarity with numeric
optimization. Understanding the code in this notebook will require
some knowledge of Julia, so you might find the following resources
about Julia useful:

- [Julia manual](https://docs.julialang.org/)
- [QuantEcon](https://lectures.quantecon.org/jl/)
- [List of learning resources](https://julialang.org/learning/)

Numeric optimization is important because many (perhaps most) models
cannot be fully solved analytically, so numeric results are used to
complement analytic ones. Numeric optimization also plays a huge role
in econometrics.

In our notes on optimization, we focused on maximization problems
because utility and profit maximization are arguably the most
fundamental optimization problems in economics.
In this notebook, we
will focus on minimization problems, following the convention in
mathematics, engineering, and most numerical libraries. It is easy
to convert between minimization and maximization, and we hope that
this does not lead to any confusion.

# Heuristic searches

The simplest type of optimization algorithm is a heuristic
search. Consider the problem

$$
\min_x f(x)
$$

with $f:\R^n \to \R$. Heuristic search algorithms consist of:

1. Evaluate $f$ at a collection of points.
2. Generate a new candidate point, $x^{new}$. Replace a point
   in the current collection with $x^{new}$ if $f(x^{new})$ is small enough.
3. Stop when the function value stops decreasing and/or the collection of
   points becomes too close together.

There are many variants of such algorithms with different ways of
generating new points, deciding whether to accept the new point, and
deciding when to stop. Here is a simple implementation and animation
of the above idea. In the code below, new points are drawn randomly
from a normal distribution, and a new point is accepted whenever
$f(x^{new})$ is smaller than the best function value found so far.
```julia
using Distributions # julia functions come in packages; we make them available with `using`
# to install a package, type `]add Distributions` at the REPL
# or `using Pkg; Pkg.add("Distributions")`
using Plots

# Lines starting with # are comments.

# The following block is a docstring. It is what would be displayed if
# you enter `?minrandomsearch`
"""
    minrandomsearch(f, x0, npoints; var0=1.0, ftol = 1e-6,
                    vtol = 1e-4, maxiter = 1000,
                    vshrink=0.9, xrange=[-2., 3.],
                    yrange=[-2.,6.], animate=true)

Find the minimum of function `f` by random search.

Creates an animation illustrating search progress.

Inputs:

- `f` function to minimize
- `x0` starting value
- `npoints` number of points in cloud
- `var0` initial variance of points
- `ftol` convergence tolerance for function value. Search terminates if
  both the function change is less than `ftol` and the variance is less
  than `vtol`.
- `vtol` convergence tolerance for variance
- `maxiter` maximum number of iterations
- `vshrink` after more than 10 consecutive iterations with no function
  improvement, the variance is multiplied by this factor
- `xrange` range of x-axis in animation
- `yrange` range of y-axis in animation
- `animate` whether to create an animation

Output:

- `(fmin, xmin, iter, info, anim)` tuple consisting of the minimal
  function value, the minimizer, the number of iterations, convergence
  info, and an animation

"""
function minrandomsearch(f, x0, npoints; var0=1.0, ftol = 1e-6,
                         vtol = 1e-4, maxiter = 1000,
                         vshrink=0.9, xrange=[-2., 3.],
                         yrange=[-2.,6.], animate=true)
    var = var0     # current variance for search
    oldbest = Inf  # smallest function value so far
    xbest = x0     # x with smallest function value
    newbest = f(xbest)
    iter = 0       # number of iterations
    noimprove = 0  # number of iterations with no improvement

    # only animate two-dimensional problems
    animate = (animate && length(x0)==2)

    if animate
        # make a contour plot of the function we're minimizing. This is
        # for illustration; you wouldn't have this normally
        x = range(xrange[1], xrange[2], length=100)
        y = range(yrange[1], yrange[2], length=100)
        c = contour(x, y, (x,y) -> log(f([x,y])))
        anim = Animation()
    else
        anim = nothing
    end

    while ((oldbest - newbest > ftol || var > vtol) && iter<=maxiter)
        oldbest = newbest
        x = rand(MvNormal(xbest, var), npoints)

        if animate
            # plot the search points
            p = deepcopy(c)
            scatter!(p, x[1,:], x[2,:], markercolor=:black, markeralpha=0.5,
                     legend=false, xlims=xrange, ylims=yrange)
        end

        fval = mapslices(f, x, dims=[1])
        (newbest, i) = findmin(fval)
        if (newbest > oldbest)
            noimprove += 1
            newbest = oldbest
        else
            xbest = x[:,i[2]]
            noimprove = 0
        end

        if animate
            # plot the best point so far
            scatter!(p, [xbest[1]], [xbest[2]], markercolor=:red, legend=false)
        end

        if (noimprove > 10) # shrink var
            var *= vshrink
        end

        if animate
            frame(anim, p) # add frame to animation
        end

        iter += 1
    end
    if (iter > maxiter)
        info = "Maximum iterations reached"
    else
        info = "Convergence."
    end
    return (newbest, xbest, iter, info, anim)
end

"""
    banana(a,b)

Returns the Rosenbrock function with parameters `a`, `b`.
"""
function banana(a,b)
    x -> (a-x[1])^2 + b*(x[2]-x[1]^2)^2
end
f = banana(1.0, 1.0)

x0 = [-2.0, 3.0]
result = minrandomsearch(f, x0, 20, var0=0.1, vshrink=0.5, vtol=1e-3)
```
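As a quick sanity check (an illustrative addition, not from the original notebook): the Rosenbrock function returned by `banana(a,b)` has its unique global minimum at $(a, a^2)$, where both squared terms vanish, so with $a=b=1$ the search should return a point near $(1,1)$ with a function value near zero.

```julia
# Illustrative check; `banana` is redefined here so the snippet is
# self-contained, matching the definition above.
banana(a, b) = x -> (a - x[1])^2 + b*(x[2] - x[1]^2)^2
f = banana(1.0, 1.0)

# The unique global minimum is at (a, a^2) = (1, 1), where f = 0.
println(f([1.0, 1.0]))   # 0.0

# At the starting point x0 = [-2, 3]: (1-(-2))^2 + (3-(-2)^2)^2 = 9 + 1
println(f([-2.0, 3.0]))  # 10.0
```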
```julia
import Base64: stringmime
gif(result[5], "randsearch.gif", fps=5)
```

![random search](./randsearch.gif "Random search")

There are many other heuristic search algorithms. A popular
deterministic one is the Nelder-Mead simplex. Popular heuristic
search algorithms that include some randomness include simulated
annealing and particle swarm. All three of the algorithms just
mentioned are available in
[Optim.jl](https://julianlsolvers.github.io/Optim.jl/stable/#algo/nelder_mead/). These
heuristic searches have the advantage that they only require function
values (as opposed to also requiring gradients or Hessians; see
below). Some heuristic algorithms, like simulated annealing, can be
shown to converge to a global (instead of local) minimum under
appropriate assumptions. Compared to algorithms that use more
information, heuristic algorithms tend to require many more function
evaluations.

# Gradient descent

Gradient descent is an iterative algorithm for finding a local minimum. As
the name suggests, it consists of descending toward a minimum in the
direction opposite the gradient. At each step, you start at some $x$ and
compute $x_{new}$:

1. Given the current $x$, compute $x_{new} = x - \gamma Df_{x}$
2. Adjust $\gamma$ depending on whether $f(x_{new})