国产一级爱片,性色吧,成人福利在线观看免费视频

在求最優(yōu)解時(shí)，前面很多地方都用梯度下降(Gradient Descent)的方法，但由于最優(yōu)步長很難確定，可能會(huì)出現(xiàn)總是在最優(yōu)解附近徘徊的情況，致使最優(yōu)解的搜索過程很緩慢。牛頓法(Newton's Method)在最優(yōu)解的搜索方面有了較大改進(jìn)，它不僅利用了目標(biāo)函數(shù)的一階導(dǎo)數(shù)，還利用了搜索點(diǎn)處的二階導(dǎo)數(shù)，使得搜索算法能更準(zhǔn)確地指向最優(yōu)解。

我們結(jié)合下圖所示的一個(gè)實(shí)例來描述牛頓法的思想。假設(shè)我們想要求得參數(shù)\(\theta\)，使得\(f(\theta)=0\)。算法的描述如下：

隨機(jī)猜測一個(gè)解\(\theta^{(0)}\)，并令\(t=0\)；
在\(\theta^{(t)}\)處用一根切線來近似\(f(\theta)\);
求得切線與橫坐標(biāo)的交點(diǎn)\(\theta^{(t+1)}\)，作為下一個(gè)可能的解;
\(t=t+1\)；
重復(fù)2-4，直到收斂，即\(f(\theta^{(t)})\approx 0\)。

那么\(\theta^{(t+1)}\)與\(\theta^{(t)}\)之間存在怎樣的迭代關(guān)系呢？由切線的斜率可知 \begin{equation} f'(\theta)=\frac{f(\theta)}{\vartriangle}\Rightarrow \vartriangle=\frac{f(\theta)}{f'(\theta)} \end{equation}

觀察\(\theta^{(t+1)}\)與\(\theta^{(t)}\)在橫坐標(biāo)上的關(guān)系，可知 \begin{equation} \theta^{(t+1)}=\theta^{(t)}-\vartriangle=\theta^{(t)}-\frac{f(\theta)}{f'(\theta)} \end{equation}

牛頓法給出了\(f(\theta)=0\)的求解算法，那么怎樣將其運(yùn)用到求使似然函數(shù)\(\mathcal{L}(\theta)\)最大化的參數(shù)上呢？一般最優(yōu)參數(shù)\(\theta^{\star}\)在\(\mathcal{L}(\theta)\)的極值點(diǎn)出取得，即\(\mathcal{L}'(\theta^{\star})=0\)。那么，令上面的\(f(\theta)=\mathcal{L}'(\theta)\)，我們很容易就得出了下列的迭代法則 \begin{equation} \theta^{(t+1)}=\theta^{(t)}-\frac{\mathcal{L}'(\theta^{(t)})}{\mathcal{L}''(\theta^{(t)})} \end{equation} 最終求得使\(\mathcal{L}'(\theta)=0\)的參數(shù)\(\theta^\star\)，也就是令似然函數(shù)\(\mathcal{L}(\theta)\)最大的參數(shù)。

上面討論的參數(shù)\(\theta\in\mathbb{R}\)，我們現(xiàn)在將牛頓法則推廣到\(n\)維向量\(\theta\in\mathbb{R}^n\)，對應(yīng)的迭代法則形式如下： \begin{equation} \theta^{(t+1)}=\theta^{(t)}-H^{-1}\nabla_{\theta}\mathcal{L} \end{equation} 其中\(zhòng)(H\)為\(\mathcal{L}\)對向量\(\theta^{(t)}\)的二階偏導(dǎo)，稱為Hessian矩陣,\(H_{ij}=\frac{\partial^2\mathcal{L}}{\partial\theta^{(t)}_i\partial\theta^{(t)}_j}\)。

接下來，我們從另外一個(gè)角度來考察牛頓法。用似然函數(shù)\(\mathcal{L}(\theta)\)的二階泰勒展開\(\mathcal{F}(\theta)\)來對其進(jìn)行逼近。 \begin{equation} \mathcal{L}(\theta)\approx\mathcal{F}(\theta)=\mathcal{L}(\theta^{(t})+\nabla_{\theta^{(t)}}\mathcal{L}(\theta-\theta^{(t)})+\frac{1}{2}(\theta-\theta^{(t)})^TH(\theta-\theta^{(t)}) \end{equation} 令\(\theta=\theta^{(t+1)}\)，可得 \begin{equation} \begin{array}{ll} \mathcal{F}(\theta^{(t+1)})=&\mathcal{L}(\theta^{(t)})+\nabla_{\theta^{(t)}}\mathcal{L}^T(\theta^{(t+1)}-\theta^{(t)})\\ &+\frac{1}{2}(\theta^{(t+1)}-\theta^{(t)})^TH(\theta^{(t+1)}-\theta^{(t)}) \end{array} \end{equation} 現(xiàn)在，我們的目的是求得使\(\mathcal{F}(\theta^{(t+1)})\)最小的參數(shù)\(\theta^{(t+1)}\)。將上式對\(\theta^{(t+1)}\)求導(dǎo)并令導(dǎo)數(shù)為0，可得 \begin{equation} \frac{\partial\mathcal{F}(\theta^{(t+1)})}{\partial\theta^{(t+1)}}=\nabla_{\theta^{(t)}}\mathcal{L}+H(\theta^{(t+1)}-\theta^{(t)})=0 \end{equation} 等式兩側(cè)同時(shí)左乘\(H^{-1}\)，化簡得 \begin{equation} \theta^{(t+1)}=\theta^{(t)}-H^{-1}\nabla_{\theta}\mathcal{L} \end{equation}

我們用的是二階泰勒展開式\(\mathcal{F}(\theta)\)逼近似然函數(shù)\(\mathcal{L}(\theta)\)。如果\(\mathcal{L}(\theta)\)確實(shí)為二次函數(shù)，那么\(\mathcal{F}(\theta)\)就是\(\mathcal{L}(\theta)\)的準(zhǔn)確展開式，利用牛頓法一步就可以直接求得最優(yōu)解。一般情況下，\(\mathcal{L}(\theta)\)并非二次函數(shù)，那么\(\mathcal{F}(\theta)\)也就存在逼近誤差，使得一次迭代不能求得最優(yōu)解，當(dāng)\(\mathcal{L}(\theta)\)的次數(shù)很高時(shí)，往往要經(jīng)歷很多次迭代。一般而言，因?yàn)榕ｎD法利用了二階導(dǎo)數(shù)來修正搜索方向和步長，收斂速度很更快。但是這同樣也是要付出代價(jià)的，相比梯度下降而言，我們需要額外計(jì)算Hessian矩陣并求其逆，這兩步的計(jì)算代價(jià)都很大。只要參數(shù)\(\theta\)的維度\(n\)不是很大，可以考慮用牛頓迭代。另外還有一點(diǎn)，如果目標(biāo)函數(shù)不是嚴(yán)格的凸函數(shù)，Hessian矩陣\(H\)很可能是奇異矩陣，也就是存在特征值為0的情況，那么它的逆矩陣是不存在的，也就無法用牛頓法。

今年有一道面試題是要求我們寫出一段程序，求解\(\sqrt{n}\)。如果把牛頓法用上去，問題就迎刃而解了。我們設(shè)定目標(biāo)函數(shù)為\(f(x)=x^2-n\)，那么令\(f(x)=0\)的解很顯然就是\(\pm\sqrt{n}\)。要注意的是，我們要選擇合理的迭代起始點(diǎn)，如果我們從正數(shù)開始迭代，求得的是\(\sqrt{n}\);如果從負(fù)數(shù)開始迭代，求得的就是\(-\sqrt{n}\)；如果從0開始迭代，會(huì)出現(xiàn)未定義的計(jì)算(0作為除數(shù))。我們根據(jù)前面講的牛頓迭代法則，直接給出該題的迭代法則 \begin{equation} x^{(t+1)}=x^{(t)}-\frac{f(x^{(t)})}{f'(x^{(t)})}=x^{(t)}-\frac{(x^{(t)})^2-n}{2x^{(t)}}=\frac{1}{2}\left(x^{(t)}+\frac{n}{x^{(t)}}\right) \end{equation} 下面是由該算法寫出的一段精簡的code，濃縮了牛頓算法的精髓

      
         1
      
      
        double
      
       mysqrt1(
      
        double
      
      
         n)

      
      
         2
      
      
        {

      
      
         3
      
      
        if
      
       (n<
      
        0
      
      ) 
      
        return
      
       -
      
        1
      
      
        ;

      
      
         4
      
      
        if
      
      (n==
      
        0
      
      ) 
      
        return
      
      
        0
      
      
        ;

      
      
         5
      
      
        double
      
       eps=1e-
      
        5
      
      
        ;

      
      
         6
      
      
        double
      
       x=
      
        0.1
      
      ;
      
        //
      
      
        start from a positive value
      
      
         7
      
      
        while
      
      (fabs(x*x-n)>=
      
        eps)

      
      
         8
      
               x=(x+n/x)/
      
        2
      
      ;
      
        //
      
      
        Newton's method
      
      
         9
      
      
        return
      
      
         x;

      
      
        10
      
       }

這道題我還想了另外一個(gè)算法，算法的啟發(fā)點(diǎn)來源于\((x-1)(x+1)+1=x^2=n\)。用這個(gè)算法，我們的迭代起始點(diǎn)可以是0。算法的基本思想如下：給定一個(gè)初始步長step，從起始點(diǎn)開始每次向前走一個(gè)步長，直到超過了\(\sqrt{n}\)；一旦超過了\(\sqrt{n}\)，就要開始慢慢向最終解靠近，每次前進(jìn)或后退的步長都縮減為以前的一半。很明顯，這個(gè)算法沒有牛頓迭代法快。我只用了少數(shù)幾個(gè)測試用例，兩段程序的計(jì)算結(jié)果都和sqrt庫函數(shù)的計(jì)算結(jié)果一致。代碼如下：

      
         1
      
      
        double
      
       mysqrt2(
      
        double
      
      
         n)

      
      
         2
      
      
        {

      
      
         3
      
      
        if
      
       (n<
      
        0
      
      ) 
      
        return
      
       -
      
        1
      
      
        ;

      
      
         4
      
      
        double
      
       x=
      
        0
      
      
        ;

      
      
         5
      
      
        double
      
       step=
      
        10
      
      
        ;

      
      
         6
      
      
        int
      
       threshold=
      
        0
      
      
        ;

      
      
         7
      
      
        double
      
       eps=1e-
      
        5
      
      
        ;

      
      
         8
      
      
        double
      
       res=x*
      
        x;

      
      
         9
      
      
        while
      
      (fabs(res-n)>=
      
        eps)

      
      
        10
      
      
            {

      
      
        11
      
      
        if
      
      (res<
      
        n)

      
      
        12
      
      
                {

      
      
        13
      
      
        //
      
      
        once we have passed the solution,we must walk forward slowly 
      
      
        14
      
      
        if
      
      (threshold) step/=
      
        2
      
      
        ;

      
      
        15
      
                   x+=step;
      
        //
      
      
        walking forward
      
      
        16
      
      
                }

      
      
        17
      
      
        else
      
      
        //
      
      
        walk
      
      
        18
      
      
                {

      
      
        19
      
                   threshold=
      
        1
      
      ;
      
        //
      
      
        indicating we have passed the solution
      
      
        20
      
                   step/=
      
        2
      
      ;
      
        //
      
      
        reducing the step size to its half
      
      
        21
      
                   x-=step;
      
        //
      
      
        walking back
      
      
        22
      
      
                }

      
      
        23
      
               res=x*x;
      
        //
      
      
        compute x*x to estimate real n
      
      
        24
      
           }
      
        //
      
      
        end while
      
      
        25
      
      
        return
      
      
         x;

      
      
        26
      
       }

Newton's Method

更多文章、技術(shù)交流、商務(wù)合作、聯(lián)系博主

微信掃碼或搜索：z360901061

微信掃一掃加我為好友

QQ號聯(lián)系： 360901061

您的支持是博主寫作最大的動(dòng)力，如果您喜歡我的文章，感覺我的文章對您有幫助，請用微信掃描下面二維碼支持博主2元、5元、10元、20元等您想捐的金額吧，狠狠點(diǎn)擊下面給點(diǎn)支持吧，站長非常感激您！手機(jī)微信長按不能支付解決辦法：請將微信支付二維碼保存到相冊，切換到微信，然后點(diǎn)擊微信右上角掃一掃功能，選擇支付二維碼完成支付。

【本文對您有幫助就好】元

2元

5元

10元

20元

自定義