Optimizer ignores specified maximum number of epochs under certain conditions #27

@houstonhaynes

Description

Issue #432 | Created by @RyushiAok | 2022-06-18 09:19:53 UTC |

Suppose the last epoch has been reached while the loss on every mini-batch keeps setting a new maximum or minimum. In that case, the conditional statement at https://github.com/DiffSharp/DiffSharp/blob/2422b28e38e3b3a33aa0acde8ba01ace13c9ae3b/src/DiffSharp.Core/Optim.fs#L209 is skipped, and the optimizer does not terminate after the specified number of epochs. In the reproduction below, `epochs=10` is requested, yet the log continues into epoch 15.
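To make the failure mode concrete, here is a minimal sketch of how a stop test can be shadowed by a "new min" branch. It is an assumed simplification, not the DiffSharp source: `runBuggy`, `runFixed`, `lossAt`, `safetyCap`, and `decreasing` are hypothetical names, and the real loop at the linked line may be structured differently.

```fsharp
// Hypothetical, simplified training loop; NOT the DiffSharp implementation.
// `lossAt` returns the loss for a given epoch; `safetyCap` bounds the demo
// so that the buggy variant cannot loop forever.
let runBuggy (epochs: int) (safetyCap: int) (lossAt: int -> float) =
    let mutable epoch = 0
    let mutable stop = false
    let mutable lMin = System.Double.MaxValue
    while not stop && epoch < safetyCap do
        epoch <- epoch + 1
        let l = lossAt epoch
        // The stop test shares one if/elif chain with the status check,
        // so a new minimum on the last epoch shadows it.
        if l < lMin then
            lMin <- l
        elif epoch = epochs then
            stop <- true
    epoch

let runFixed (epochs: int) (safetyCap: int) (lossAt: int -> float) =
    let mutable epoch = 0
    let mutable stop = false
    let mutable lMin = System.Double.MaxValue
    while not stop && epoch < safetyCap do
        epoch <- epoch + 1
        let l = lossAt epoch
        if l < lMin then
            lMin <- l
        // Termination is tested on its own, after the status bookkeeping.
        if epoch >= epochs then
            stop <- true
    epoch

// A strictly decreasing loss sets a new minimum on every epoch.
let decreasing (epoch: int) = 1.0 / float epoch

printfn "buggy: %d epochs" (runBuggy 10 50 decreasing)
printfn "fixed: %d epochs" (runFixed 10 50 decreasing)
```

With a loss that improves on every epoch, the first variant runs until the safety cap (50) and the second stops at the requested 10. Checking the epoch limit outside the status chain is one possible remedy under these assumptions.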

module Program =
    open DiffSharp
    open DiffSharp.Compose
    open DiffSharp.Model
    open DiffSharp.Data
    open DiffSharp.Optim
    open DiffSharp.Util

    [<EntryPoint>]
    let main _ =
        dsharp.seed(0)
        dsharp.config(backend=Backend.Torch, device=Device.GPU)
        // Noisy sum of sinusoids sampled on [0, 10] with step 0.001
        let xs = dsharp.tensor [|0.0..0.001..10.0|]
        let data = // https://qiita.com/niisan-tokyo/items/a94dbd3134219f19cab1
            sin xs + sin (3.0 * xs) + sin (10.0 * xs) + cos (5.0 * xs) + 0.1 * dsharp.randn (xs.shape[0])
        // Sliding windows: the first 64 samples are the input, the next 16 the target
        let input, target =
            let inputN, targetN = 64, 16
            data.toArray() :?> float32[]
            |> Array.windowed (inputN + targetN)
            |> Array.map (fun ary -> ary[..inputN-1], ary[inputN..])
            |> Array.unzip
            |> fun (i, t) ->
                i |> dsharp.tensor |> dsharp.view [-1; 64; 1],
                t |> dsharp.tensor |> dsharp.view [-1; 16; 1]
        let dataset = TensorDataset(input, target).loader 4096
        // Train a fresh model repeatedly; each run should stop after 10 epochs
        for _ in 0..1000 do
            System.Console.Clear()
            let model =
                Conv1d(64, 64, 8, padding=4)
                --> dsharp.relu
                --> dsharp.maxpool1d (2, padding=1)
                --> Conv1d(64, 64, 8, padding=3)
                --> dsharp.relu
                --> dsharp.maxpool1d (2, padding=1)
                --> Conv1d(64, 32, 8, padding=4)
                --> dsharp.relu
                --> Conv1d(32, 16, 8, padding=3)
                --> dsharp.tanh
            optim.adam(model, dataset, dsharp.mseLoss, epochs=10)
        0
Duration   |Iters| Ep|Minib| Loss
0.00:00:01 |   1 | 1 | 1/2 | 1.896947e+000 - New min
0.00:00:01 |   2 | 1 | 2/2 | 1.935213e+000 + New max
0.00:00:01 |   3 | 2 | 1/2 | 1.887589e+000 - New min
0.00:00:01 |   4 | 2 | 2/2 | 1.927640e+000 +
0.00:00:01 |   5 | 3 | 1/2 | 1.876644e+000 - New min
0.00:00:01 |   6 | 3 | 2/2 | 1.914024e+000 +
0.00:00:01 |   7 | 4 | 1/2 | 1.858355e+000 - New min
0.00:00:02 |   8 | 4 | 2/2 | 1.889651e+000 +
0.00:00:02 |   9 | 5 | 1/2 | 1.827203e+000 - New min
0.00:00:02 |  10 | 5 | 2/2 | 1.847010e+000 +
0.00:00:02 |  11 | 6 | 1/2 | 1.775441e+000 - New min
0.00:00:02 |  12 | 6 | 2/2 | 1.777198e+000 +
0.00:00:02 |  13 | 7 | 1/2 | 1.694408e+000 - New min
0.00:00:02 |  14 | 7 | 2/2 | 1.670419e+000 - New min
0.00:00:02 |  15 | 8 | 1/2 | 1.574603e+000 - New min
0.00:00:02 |  16 | 8 | 2/2 | 1.516655e+000 - New min
0.00:00:02 |  17 | 9 | 1/2 | 1.408510e+000 - New min
0.00:00:02 |  18 | 9 | 2/2 | 1.314690e+000 - New min
0.00:00:02 |  19 | 10 | 1/2 | 1.198786e+000 - New min
0.00:00:02 |  20 | 10 | 2/2 | 1.082564e+000 - New min
0.00:00:02 |  21 | 11 | 1/2 | 9.675691e-001 - New min
0.00:00:02 |  22 | 11 | 2/2 | 8.604462e-001 - New min
0.00:00:02 |  23 | 12 | 1/2 | 7.585786e-001 - New min
0.00:00:02 |  24 | 12 | 2/2 | 6.942253e-001 - New min
0.00:00:02 |  25 | 13 | 1/2 | 6.135688e-001 - New min
0.00:00:02 |  26 | 13 | 2/2 | 6.018288e-001 - New min
0.00:00:03 |  27 | 14 | 1/2 | 5.401553e-001 - New min
0.00:00:03 |  28 | 14 | 2/2 | 5.665148e-001 +
0.00:00:03 |  29 | 15 | 1/2 | 5.149547e-001 - New min
