[Mini-course] Numerical methods for delay and fractional differential equations | Neville J. Ford (University of Chester, UK)
16 October 2024 2:15 pm – 4:15 pm
Room 1.4, building VII.
Title: Numerical methods for delay and fractional differential equations.
Speaker: Neville J. Ford (University of Chester, UK).
Date | Time: Wednesday, 16 October 2024, from 14:15 to 16:15 – Room 1.4, building VII.
Abstract: We will consider how to solve delay differential equations and fractional differential equations using numerical schemes. To understand how to do this, we will begin by considering the fundamental theory associated with these equations. We look at the dimension of the underlying dynamical systems to explain how approximation schemes should be set up, and how their performance should be judged. In particular, we shall look at initial and boundary conditions needed to ensure a unique solution and we will see how these must be related to the numerical schemes. If time permits, we shall consider some fractional problems with delays and see how the insights from both types of problems combine in this case.
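To fix ideas, here is a minimal numerical sketch (illustrative only, not material from the course): the simplest scheme for a delay differential equation x'(t) = f(t, x(t), x(t − τ)) is forward Euler on a grid whose step size divides the constant delay τ, with the prescribed history function supplying the delayed values until the integration has advanced past τ. All function and variable names below are our own.

import numpy as np

def euler_dde(f, phi, tau, t_end, h):
    # Forward Euler for x'(t) = f(t, x(t), x(t - tau)) on [0, t_end],
    # with constant delay tau and history x(t) = phi(t) for t <= 0.
    # h is assumed to be chosen so that tau is an integer multiple of h.
    k = int(round(tau / h))              # grid points spanning one delay
    n_steps = int(round(t_end / h))
    t = np.linspace(0.0, n_steps * h, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = phi(0.0)
    for n in range(n_steps):
        # the delayed value comes from the history function until t_n >= tau
        x_lag = phi(t[n] - tau) if n < k else x[n - k]
        x[n + 1] = x[n] + h * f(t[n], x[n], x_lag)
    return t, x

# Example: the classical test equation x'(t) = -x(t - 1), x(t) = 1 for t <= 0
t, x = euler_dde(lambda t, x, x_lag: -x_lag, phi=lambda t: 1.0,
                 tau=1.0, t_end=10.0, h=0.01)

Note that the scheme needs the whole history function phi on [−τ, 0], not a single initial value: the underlying dynamical system is infinite-dimensional, which is exactly the point the abstract makes about initial conditions and the dimension of the problem.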
[SOR] Enhancing the Efficiency and Stability of Deep Neural Network Training through Controlled Mini-batch Algorithms | Corrado Coppola (Sapienza University of Rome, Italy)
16 October 2024 4:15 pm – 5:15 pm
Room 1.11, building VII.
Title: Enhancing the Efficiency and Stability of Deep Neural Network Training through Controlled Mini-batch Algorithms.
Speaker: Corrado Coppola (Sapienza University of Rome, Italy).
Abstract: The exponential growth of trainable parameters in state-of-the-art deep neural networks (DNNs), driven by innovations such as self-attention layers and over-parameterization, has led to the development of models containing billions or even trillions of parameters. As training datasets grow larger and tasks become more complex, the current challenge lies in balancing convergence guarantees with the increasing need for efficient training. In this work, we focus on supervised deep learning, where the training problem is formulated as the unconstrained minimization of a smooth, potentially non-convex objective function with respect to the network weights.
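In symbols (our notation; the abstract does not write the problem out), this is the standard finite-sum problem

\min_{w \in \mathbb{R}^d} f(w) = \frac{1}{P} \sum_{p=1}^{P} f_p(w),

where w collects the network weights and each f_p is the smooth, possibly non-convex loss contributed by the p-th sample or mini-batch. IG and RR methods sweep through the f_p one at a time instead of evaluating the full sum at every step.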
We propose an approach based on Incremental Gradient (IG) and Random Reshuffling (RR) algorithms, enhanced with derivative-free extrapolation line-search procedures. Specifically, we present the Controlled Mini-batch Algorithm (CMA), proposed in [1], which incorporates sufficient-decrease conditions on the objective function and allows for line-search procedures that ensure convergence without further hypotheses on the search direction. We also present computational results on large-scale regression problems.
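As a rough illustration of the mechanism (a sketch under our own simplifying assumptions, not the algorithm of [1]: the exact sufficient-decrease condition, safeguards, and extrapolation rule differ), a controlled epoch could look as follows.

import numpy as np

def df_linesearch(f, w_ref, f_ref, d, gamma=1e-6, delta=0.5):
    # Derivative-free line search along d: backtrack until the sufficient
    # decrease f(w_ref + a*d) <= f_ref - gamma*a^2*||d||^2 holds, then try
    # extrapolating (enlarging a) while the condition keeps holding.
    sq = float(np.dot(d, d))
    a = 1.0
    while f(w_ref + a * d) > f_ref - gamma * a**2 * sq:
        a *= delta
        if a < 1e-10:
            return 0.0                   # no acceptable step: reject epoch
    while a < 1e6 and f(w_ref + (a / delta) * d) <= f_ref - gamma * (a / delta)**2 * sq:
        a /= delta                       # extrapolation phase
    return a

def cma_epoch(f_full, grad_batch, batches, w, step):
    # One 'controlled' epoch: a plain incremental-gradient pass produces a
    # tentative point, and the resulting move is accepted, scaled, or
    # rejected through a line search on the true objective. This control
    # step is what yields convergence without extra hypotheses on the
    # search direction, at the price of full objective evaluations.
    w_ref, f_ref = w.copy(), f_full(w)
    for b in batches:
        w = w - step * grad_batch(w, b)  # IG step on mini-batch b
    d = w - w_ref
    return w_ref + df_linesearch(f_full, w_ref, f_ref, d) * d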
We further introduce CMA Light, proposed in [2], an enhanced variant of CMA with convergence guarantees within the IG framework. By using an approximation of the true objective function to verify sufficient decrease, CMA Light drastically reduces the number of function evaluations needed and achieves notable performance gains. We discuss computational results both against CMA and against state-of-the-art optimizers for neural networks, showing a significant advantage of CMA Light in large-scale classification tasks using residual convolutional networks.
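The evaluation-saving device can be sketched like this (again a simplification of our own; in [2] the estimate and the fallback tests are built so that the IG convergence theory still applies). Continuing the sketch above:

def cma_light_epoch(f_full, loss_batch, grad_batch, batches, w, step, f_ref,
                    gamma=1e-6):
    # The mini-batch losses accumulated during the IG pass give an estimate
    # f_tilde of the objective essentially for free; the expensive full
    # objective is evaluated only when the cheap estimate fails the test.
    w_ref = w.copy()
    acc = 0.0
    for b in batches:
        acc += loss_batch(w, b)          # a by-product of the pass
        w = w - step * grad_batch(w, b)
    f_tilde = acc / len(batches)
    d = w - w_ref
    if f_tilde <= f_ref - gamma * float(np.dot(d, d)):
        return w, f_tilde                # accepted without calling f_full
    f_new = f_full(w)                    # fallback: one true evaluation
    return (w, f_new) if f_new < f_ref else (w_ref, f_ref)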
Finally, we present the Fast-Controlled Mini-batch Algorithm (F-CMA), extending the convergence theory of CMA Light to the case where samples are reshuffled at each epoch. We develop a new line-search procedure and demonstrate F-CMA's superior performance when training ultra-deep architectures, such as the SwinB and SwinT transformers with up to 130 million trainable parameters. Our results show significant advantages in both stability and generalization compared to state-of-the-art deep learning optimizers.
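Structurally, the only change RR introduces over IG is a fresh permutation of the mini-batches at every epoch; in terms of the hypothetical cma_epoch sketched earlier (F-CMA's actual new line-search procedure is not reproduced here):

rng = np.random.default_rng(0)
for epoch in range(num_epochs):
    # Random Reshuffling: visit the same mini-batches in a new random
    # order each epoch, which is the regime the F-CMA theory covers.
    shuffled = [batches[i] for i in rng.permutation(len(batches))]
    w = cma_epoch(f_full, grad_batch, shuffled, w, step)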
NOVA Math's focus is on cutting-edge research in both pure and applied mathematics, valuing the use of mathematics in the solution of real-world problems, both at the industrial level and of social relevance.
One of the main strategies developed by NOVA Math is to promote the exchange of knowledge with other sciences. It is important to engage with the users of mathematics, giving them support for their research on the one hand, and, on the other, guiding mathematical researchers who seek real-life problems.
Funded by national funds through the FCT – Fundação para a Ciência e a Tecnologia, I.P., under the scope of the following projects:
UIDB/00297/2020, UIDP/00297/2020, UID/MAT/00297/2019, UID/MAT/00297/2013, PEst-OE/MAT/UI0297/2014, PEst-OE/MAT/UI0297/2011.